One important implication is that collateral freedom techniques [1] using Amazon S3 will no longer work.
To put it simply, right now I could put some stuff not liked by the Russian or Chinese government (maybe an entire website) on S3 and give out a direct link like https://s3.amazonaws.com/mywebsite/index.html. Because it's HTTPS — there is no way a man in the middle knows what people read on s3.amazonaws.com. With this change, dictators see my bucket's hostname and can block requests to it right away.
I don't know if they did it on purpose or just forgot about those who are less fortunate in regards to access to information, but this is a sad development.
This censorship circumvention technique is actively used in the wild, and losing Amazon is no good.
If there is anyone from Amazon who cares about freedom of speech and censorship — please contact me at [email protected], I'd love to give you more perspective on this.
[1] https://en.wikipedia.org/wiki/Collateral_freedom
Interestingly, a different although related trick (that of domain fronting) was blocked last year "by both Google and Amazon... in part due to pressure from the Russian government over Telegram domain fronting activity using both of the cloud providers' services" (https://en.wikipedia.org/wiki/Domain_fronting#Disabling).
Just as a counter-argument: one of the things we tried to do at a previous employer was data exfiltration protection. This meant using outbound proxies from our networks to reach pre-approved URLs, and we didn't want to MITM the TLS connections. That left a bit of a problem: we didn't want to whitelist all of S3, since that defeats the purpose, so we had to mandate the bucket.s3 URI style, which is a bit of a pain for clients that use the direct s3 link style, but then we could whitelist the buckets we control.
I don't want to say this use case is more important, but I can see the merits of standardizing on the subdomain style, and this might be a common ask of Amazon.
Google Reader served a similar purpose. People used its social features for communication since (the thinking went) governments weren't going to block Google.
Collateral freedom doesn't work in China. China has already blocked or throttled (hard to tell which, since GFW doesn't announce it) connections to AWS S3 for years.
Definitely worth checking out https://ipfs.io/. Even for those who don't or can't run IPFS peers on their own devices, IPFS gateways can fill much of the same purpose you listed above. Additionally, the same content should be viewable through _any_ gateway. Meaning if a Gateway provider ever amazoned you, you simply make the requests through a new gateway.
Use DNS over HTTPS. Firefox is easily configurable (network.trr.bootstrapAddress, network.trr.mode, etc.) so that, if you pick the right bootstrap provider and DoH provider, you'll never send an unencrypted DNS query (including no SNIs), and it will fail completely rather than reverting to your OS's DNS client if a name cannot be resolved via the DoH channel you define.
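As a sketch, a `user.js` fragment for the prefs mentioned above might look like this — the DoH provider and bootstrap IP shown are illustrative assumptions, not endorsements:

```javascript
// user.js — illustrative Firefox DoH prefs; provider URL and bootstrap IP are assumptions
user_pref("network.trr.mode", 3);  // 3 = DoH only; fail rather than fall back to the OS resolver
user_pref("network.trr.uri", "https://mozilla.cloudflare-dns.com/dns-query");
user_pref("network.trr.bootstrapAddress", "104.16.249.249"); // resolve the DoH host without a plaintext DNS lookup
```

Mode 3 is the strict setting: with it, a name that can't be resolved over the encrypted channel simply fails instead of leaking a cleartext query.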
Because S3 buckets are virtual-hosted, they share IPs, so there is deniability if you can hide the DNS/SNI (see https://blog.cloudflare.com/encrypted-sni/).
"Collateral freedom" is a failed concept. Many years ago people used Gmail to communicate, arguing that the Chinese government wouldn't dare block such an important and neutral service.
Fewer than 1% of websites in China rely on S3 to deliver static files, and blocking AWS as a whole has happened before. There was simply no freedom that was "collateral". Freedom has to be fought for and earned.
"right now I could put some stuff not liked by Russian or Chinese government (maybe entire website) and give a direct s3 link to https://s3.amazonaws.com/mywebsite/index.html. Because it's https — there is no way man in the middle knows what people read on s3.amazonaws.com."
The Chinese government will just ban the whole s3.amazonaws.com domain, same as facebook.com, youtube.com, google.com, gmail.com, wikipedia...
However, letting them ban sub-domains will actually make S3 a usable service in China. It's a huge step forward.
There are so, SO many teams that use S3 for static assets, make the bucket public, and copy that Object URL (https://www.dropbox.com/s/zzr3r1nvmx6ekct/Screenshot%202019-...). We've done this at my company, and I've seen these types of links in many of our partners' CSS files. These links may also be stored deep in databases, or even embedded in Markdown in databases.
This will quite literally cause a Y2K-level event, and since all that traffic will still head to S3's servers, it won't even solve any of their routing problems.
Set it as a policy for new buckets if you must, change the Object URL output, and add a giant disclaimer.
But don't. Freaking. Break. The. Web.
Also in millions of manuals, generated PDFs, sent emails... Some things you just can't "update" anymore.
It's a really disastrous change for web data integrity.
Came here for the same comment. I set up some S3-related stuff less than 2 months ago and the documentation, at least for the JS SDK, still recommended the path-style URL. I don't even recall a V1/V2 being mentioned.
That seems very inconvenient, and is pretty in line with my experience with AWS: their services are cheap and good, but oh boy, the developer experience is SO bad.
- So many services that it is very hard to know what to use for what
- Complex and user-unfriendly APIs
- Terrible documentation
I'm pretty sure they'd get a lot more business if they invested a bit more in developer friendliness. Right now I only use AWS if a client really insists on it, because despite having used it a fair amount, I'm still not happy and comfortable with it.
I think recent years have proven that, despite all the memeing about things like REST, "don't break the web" is not a value shared by all of the parties involved.
The takeaway for those of us who do still wish to uphold those values: let this serve as a lesson that we should not publish assets behind URLs we don't control.
https://www.w3.org/Provider/Style/URI ("Cool URIs don't change")
Now, it seems, this is a big problem. V2 resource requests will look like this: https://example.com.s3.amazonaws.com/... or https://www.example.com.s3.amazonaws.com/...
And, of course, this ruins HTTPS. Amazon has you covered for *.s3.amazonaws.com, but not for *.*.s3.amazonaws.com or even *.*.*.s3.amazonaws.com, and so on, because a wildcard only matches a single DNS label.
So... I guess I have to rename/move all my buckets now? Ugh.
It's worse than that: you can't rename a bucket. You will have to create a new bucket and copy everything over.
> The name of the bucket used for Amazon S3 Transfer Acceleration must be DNS-compliant and must not contain periods (".").
and as you mentioned
> When you use virtual hosted–style buckets with Secure Sockets Layer (SSL), the SSL wildcard certificate only matches buckets that don't contain periods. To work around this, use HTTP or write your own certificate verification logic. We recommend that you do not use periods (".") in bucket names when using virtual hosted–style buckets.
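The wildcard limitation quoted above can be sketched in a few lines — this is a simplified RFC 6125-style matcher, not production certificate verification:

```python
# Sketch of RFC 6125-style wildcard matching: "*" covers exactly one DNS label,
# which is why *.s3.amazonaws.com cannot cover a bucket name containing periods.
def matches_wildcard(pattern: str, hostname: str) -> bool:
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    return all(p == "*" or p == h for p, h in zip(p_labels, h_labels))

print(matches_wildcard("*.s3.amazonaws.com", "mybucket.s3.amazonaws.com"))         # True
print(matches_wildcard("*.s3.amazonaws.com", "www.example.com.s3.amazonaws.com"))  # False
```

A bucket named `www.example.com` produces a six-label hostname, so the four-label wildcard can never match it.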
AWS docs have always been a mess of inconsistencies, so this isn't a big surprise. I dealt with similar naming issues when setting up third-party CDNs, since ideally edges would cache using an HTTPS connection to the origin. IIRC the fix was to use path-style, but now with the deprecation it'd need a full migration.
I wonder how CloudFront works around it. Maybe it special-cases it and uses the S3 protocol instead of HTTP/S.
FWIW, I found it fairly trivial to set up CloudFront in front of my buckets [1], so that I can use HTTPS with AWS Certificate Manager (ACM) to serve our S3 sites on https://mydomain.com [2].
I set this up some time ago using our domain name and ACM, and I don't think I will need to change anything in light of this announcement.
1 - https://docs.aws.amazon.com/AmazonS3/latest/dev/website-host...
2 - https://docs.aws.amazon.com/acm/latest/userguide/acm-overvie...
Isn't that domain-name-style bucket naming only for hosting a static website from an S3 bucket? Otherwise, you can name the bucket whatever you want within the rest of the naming rules.
The point of that is solely for doing website hosting with S3, though, where you'll have a CNAME. Why would you name a bucket that way if you're not using it for the website hosting feature?
Does anyone have insight on why they're making this change? All they say in this post is "In our effort to continuously improve customer experience". From my point of view as a customer, I don't really see an experiential difference between a subdomain style and a path style - one's a ".", the other's a "/" - but I imagine there's a good reason for the change.
First to allow them to shard more effectively. With different subdomains, they can route requests to various different servers with DNS.
Second, it allows them to route you directly to the correct region the bucket lives in, rather than having to accept you in any region and re-route.
Third, to ensure proper separation between websites by making sure their origins are separate. This is less AWS's direct concern and more of a best practice, but doesn't hurt.
I'd say #2 is probably the key reason and perhaps #1 to a lesser extent. Actively costs them money to have to proxy the traffic along.
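The two addressing styles discussed above can be sketched as follows — the `s3.<region>` endpoint format, bucket, and key here are illustrative placeholders:

```python
def s3_url(bucket: str, key: str, region: str, virtual_hosted: bool = True) -> str:
    """Build an S3 object URL in either addressing style.

    Virtual-hosted style puts the bucket in the hostname, so plain DNS can
    steer the request straight to the bucket's region; path-style hits a
    shared endpoint that must accept the request and route it onward.
    """
    if virtual_hosted:
        return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
    return f"https://s3.{region}.amazonaws.com/{bucket}/{key}"

print(s3_url("my-bucket", "index.html", "eu-west-1"))
# https://my-bucket.s3.eu-west-1.amazonaws.com/index.html
```

With the bucket in the hostname, a resolver can hand back an IP in the bucket's own region before a single byte of HTTP is sent.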
Currently all buckets share a domain and therefore share cookies. I've seen attacks (search for "cookie bomb" + "fallback manifest") that leverage shared cookies to allow an attacker to exfiltrate data from other buckets.
The only obvious thing that occurs to me is that bringing the bucket into the domain name puts it under the same-origin policy in the browser security model. Perhaps a significant number of people are hosting their buckets in a way that compromises security like this? Not something I have heard of, but it seems possible. It makes me wonder if they are specifically not mentioning it because this is the real reason: they know there are vulnerable applications in the wild and don't want to draw attention to it.
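The same-origin point can be made concrete; the bucket names below are hypothetical:

```python
from urllib.parse import urlsplit

def origin(url: str) -> str:
    # A browser origin is (scheme, host, port); the path is ignored.
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"

# Path-style: every bucket shares one origin, so cookies and
# same-origin privileges are shared across all of them.
print(origin("https://s3.amazonaws.com/bucket-a/app.js") ==
      origin("https://s3.amazonaws.com/bucket-b/data.json"))   # True

# Virtual-hosted style: each bucket is its own origin, so the browser isolates it.
print(origin("https://bucket-a.s3.amazonaws.com/app.js") ==
      origin("https://bucket-b.s3.amazonaws.com/data.json"))   # False
```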
Probably because they want to improve response time with a more precise DNS answer.
With s3.amazonaws.com, they need a proxy near you that downloads the content from the real region. With yourbucket.s3.amazonaws.com, they can give you the IP of an edge in the same region as your bucket.
Does the "you are no longer logged in" screen not infuriate anyone besides me? There doesn't seem to be any purpose to it: it just redirects you to the landing page when you were trying to access a forum post that doesn't even require you to be logged in.
Absolutely mind-boggling that, with as much as they pay people, they do something so stupid and haven't changed it after so long.
I wonder how they’ll handle capitalized bucket names. This seems like it will break that.
S3 has been around a long time, and they made some decisions early on that they realised wouldn't scale, so they reversed them. This v1-vs-v2 URL thing is one of them.
Another was letting you have "BucketName" and "bucketname" as two distinct buckets. You can't name them like that today, but you could at first, and those buckets still work (and are in conflict under v2 naming).
Amazon's own docs explain that you still need to use the old v1 scheme for capitalized names, as well as names containing certain special characters.
It'd be a shame if they just tossed all those old buckets in the trash by leaving them inaccessible.
All in, this seems like another silly, unnecessary deprecation of an API that was working perfectly well. A trend I'm noticing more often these days.
Shame.
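A rough sketch of the naming constraint behind this: virtual-hosted style requires a DNS-compatible bucket name. The regex here is an approximation of the documented rules, not the authoritative check:

```python
import re

def virtual_host_compatible(bucket: str) -> bool:
    # Approximation: 3-63 chars of lowercase letters, digits, and hyphens,
    # starting and ending alphanumeric. No uppercase, no underscores, and
    # no periods (periods also break the wildcard TLS certificate).
    return re.fullmatch(r"[a-z0-9](?:[a-z0-9-]{1,61})[a-z0-9]", bucket) is not None

print(virtual_host_compatible("bucketname"))   # True
print(virtual_host_compatible("BucketName"))   # False: legacy name, path-style only
print(virtual_host_compatible("my.bucket"))    # False: periods defeat the wildcard cert
```

Legacy buckets like "BucketName" fail this check, which is exactly why they still depend on the path-style scheme being deprecated here.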
One of the weird peculiarities of path-style API requests was that CORS headers meant essentially nothing for any bucket. I wrote a post about this a while ago [0].
I guess after this change, the CORS configuration will finally do something!
On the flip side, anyone who wants to list buckets entirely from the client-side JavaScript SDK won't be able to anymore, unless Amazon also modifies CORS headers on the API endpoint after disabling path-style requests.
[0]: https://euank.com/2018/11/12/s3-cors-pfffff.html
Amazon is proud that they never break backwards compatibility like this, with quotes like "the container you are running on Fargate will keep running 10 years from now".
Something weird is going on if they don’t keep path style domains working for existing buckets.
I was already planning a move to GCP, but this certainly helps. Now that cloud is beating retail in earnings, the 'optimizations' come along with it. That, and BigQuery is an amazing tool.
It's not like I'm super outraged that they would change their API; the reasoning seems sound. It's just that if I have to touch S3 paths everywhere, I may as well move them elsewhere to gain some synergies with GCP services. I would think twice if I were heavily invested in IAM roles and S3 Lambda triggers, but that isn't the case.
This is most likely to help mitigate abuse of the shared domain, which sidesteps the browser's same-origin protections. This is very common when dealing with malware, phishing, and errant JS files.
`In our effort to continuously improve customer experience` — what's the actual driver here? I don't see how going from two options to one, and forcing you to change if you're on the wrong one, improves my experience.
I see a problem when using the S3 libraries against other services that support the S3 API but only offer path-style access, like MinIO or Ceph with no subdomains enabled. It will break once their Java API removes the old code.
The AWS API is an inconsistent mess. If you don't believe me, try writing a script to tag resources: every resource type requires a different way to identify it, a different way to pass the tags, etc. You're pretty much required to write different code to handle each resource type.
AWS S3 will only provide SSL validation if your bucket name happens to not contain "."
Which is a practice encouraged by AWS. [1]
So anyone that has www.example.com as the bucket name can no longer use HTTPS.
[1] https://docs.aws.amazon.com/AmazonS3/latest/dev/website-host...
This could be just as disruptive.
Difficult to say that they will actually follow through, as the only mention of this date is in the random forum post I linked.