One important implication is that collateral freedom techniques [1] using Amazon S3 will no longer work.
To put it simply, right now I could put some stuff not liked by the Russian or Chinese government (maybe an entire website) on S3 and give out a direct link like https://s3.amazonaws.com/mywebsite/index.html. Because it's HTTPS — there is no way a man in the middle knows what people read on s3.amazonaws.com. With this change, dictators see my bucket's hostname and can block requests to it right away.
I don't know if they did it on purpose or just forgot about those who are less fortunate in regards to access to information, but this is a sad development.
This censorship circumvention technique is actively used in the wild, and losing Amazon is no good.
If there is anyone from Amazon who cares about freedom of speech and censorship — please contact me at [email protected], I'd love to give you more perspective on this.
[1] https://en.wikipedia.org/wiki/Collateral_freedom
Interestingly, a different although related trick (that of domain fronting) was blocked last year "by both Google and Amazon... in part due to pressure from the Russian government over Telegram domain fronting activity using both of the cloud providers' services" (https://en.wikipedia.org/wiki/Domain_fronting#Disabling).
Just as a counter-argument: one of the things we tried to do at a previous employer was data exfiltration protection. This meant using outbound proxies from our networks to reach pre-approved URLs, and we didn't want to MITM the TLS connections. That left a bit of a problem: we didn't want to whitelist all of S3, since that defeats the purpose, so we had to mandate the bucket.s3 URI style, which is a bit of a pain for clients that use the direct s3 link style, but then we could whitelist the buckets we control.
I don't want to say this use case is more important, but I can see the merits of standardizing on the subdomain style, and this might be a common ask of Amazon.
Google Reader served a similar purpose. People used its social features for communication since (the thinking went) governments weren't going to block Google.
Collateral freedom doesn't work in China. China has already blocked or throttled (hard to tell which, since GFW doesn't announce it) connections to AWS S3 for years.
Definitely worth checking out https://ipfs.io/. Even for those who don't or can't run IPFS peers on their own devices, IPFS gateways can fill much of the same purpose you listed above. Additionally, the same content should be viewable through _any_ gateway. Meaning if a Gateway provider ever amazoned you, you simply make the requests through a new gateway.
Use DNS over HTTPS. Firefox is easily configurable (network.trr.bootstrapAddress, network.trr.mode, etc.) so that, if you pick the right bootstrap provider and DoH provider, you'll never send an unencrypted DNS query (including no SNIs), and it will fail completely rather than reverting to your OS's DNS client if a name cannot be resolved via the DoH channel you define.
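As a sketch, a `user.js` fragment for the prefs mentioned above might look like this — the DoH provider and bootstrap IP shown are illustrative assumptions, not endorsements:

```javascript
// user.js — illustrative Firefox DoH prefs; provider URL and bootstrap IP are assumptions
user_pref("network.trr.mode", 3);  // 3 = DoH only; fail rather than fall back to the OS resolver
user_pref("network.trr.uri", "https://mozilla.cloudflare-dns.com/dns-query");
user_pref("network.trr.bootstrapAddress", "104.16.249.249"); // resolve the DoH host without a plaintext DNS lookup
```

Mode 3 is the strict setting: with it, a name that can't be resolved over the encrypted channel simply fails instead of leaking a cleartext query.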
Because S3 buckets are virtual-hosted, they share IPs, so there is deniability if you can hide the DNS/SNI (see https://blog.cloudflare.com/encrypted-sni/).
"Collateral freedom" is a failed concept. Many years ago people used Gmail to communicate, arguing that the Chinese government wouldn't dare block such an important and neutral service.
Fewer than 1% of websites in China rely on S3 to deliver static files, and blocking AWS as a whole has happened before. There was simply no freedom that was "collateral". Freedom has to be fought for and earned.
"right now I could put some stuff not liked by Russian or Chinese government (maybe entire website) and give a direct s3 link to https://s3.amazonaws.com/mywebsite/index.html. Because it's https — there is no way man in the middle knows what people read on s3.amazonaws.com."
The Chinese government will just ban the whole s3.amazonaws.com domain, same as facebook.com, youtube.com, google.com, gmail.com, wikipedia...
However, letting them ban sub-domains will actually make S3 a usable service in China. It's a huge step forward.
There are so, SO many teams that use S3 for static assets, make the bucket public, and copy that Object URL (https://www.dropbox.com/s/zzr3r1nvmx6ekct/Screenshot%202019-...). We've done this at my company, and I've seen these types of links in many of our partners' CSS files. These links may also be stored deep in databases, or even embedded in Markdown in databases.
This will quite literally cause a Y2K-level event, and since all that traffic will still head to S3's servers, it won't even solve any of their routing problems.
Set it as a policy for new buckets if you must, change the Object URL output, and add a giant disclaimer.
But don't. Freaking. Break. The. Web.
Also in millions of manuals, generated PDFs, sent emails... Some things you just can't "update" anymore.
It's a really disastrous change for web data integrity.
Came here for the same comment. I set up some S3-related stuff less than 2 months ago and the documentation, at least for the JS SDK, still recommended the path-style URL. I don't even recall a V1/V2 being mentioned.
That seems very inconvenient, and is pretty in line with my experience with AWS: their services are cheap and good, but oh boy, the developer experience is SO bad.
- So many services that it is very hard to know what to use for what
- Complex and user-unfriendly APIs
- Terrible documentation
I'm pretty sure they'd get a lot more business if they invested a bit more in developer friendliness. Right now I only use AWS if a client really insists on it, because despite having used it a fair amount, I'm still not happy and comfortable with it.
I think recent years have proven that, despite all the memeing about things like REST, "don't break the web" is not a value shared by all of the parties involved.
The takeaway for those of us who do still wish to uphold those values: let this serve as a lesson that we should not publish assets behind URLs we don't control.
https://www.w3.org/Provider/Style/URI ("Cool URIs don't change")
Now, it seems, this is a big problem. V2 resource requests will look like this: https://example.com.s3.amazonaws.com/... or https://www.example.com.s3.amazonaws.com/...
And, of course, this ruins HTTPS. Amazon has you covered for *.s3.amazonaws.com, but not for *.*.s3.amazonaws.com or even *.*.*.s3.amazonaws.com, and so on, because a wildcard only matches a single DNS label.
So... I guess I have to rename/move all my buckets now? Ugh.
It's worse than that: you can't rename a bucket. You will have to create a new bucket and copy everything over.
> The name of the bucket used for Amazon S3 Transfer Acceleration must be DNS-compliant and must not contain periods (".").
and as you mentioned
> When you use virtual hosted–style buckets with Secure Sockets Layer (SSL), the SSL wildcard certificate only matches buckets that don't contain periods. To work around this, use HTTP or write your own certificate verification logic. We recommend that you do not use periods (".") in bucket names when using virtual hosted–style buckets.
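The wildcard limitation quoted above can be sketched in a few lines — this is a simplified RFC 6125-style matcher, not production certificate verification:

```python
# Sketch of RFC 6125-style wildcard matching: "*" covers exactly one DNS label,
# which is why *.s3.amazonaws.com cannot cover a bucket name containing periods.
def matches_wildcard(pattern: str, hostname: str) -> bool:
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    return all(p == "*" or p == h for p, h in zip(p_labels, h_labels))

print(matches_wildcard("*.s3.amazonaws.com", "mybucket.s3.amazonaws.com"))         # True
print(matches_wildcard("*.s3.amazonaws.com", "www.example.com.s3.amazonaws.com"))  # False
```

A bucket named `www.example.com` produces a six-label hostname, so the four-label wildcard can never match it.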
AWS docs have always been a mess of inconsistencies, so this isn't a big surprise. I dealt with similar naming issues when setting up third-party CDNs, since ideally edges would cache using an HTTPS connection to the origin. IIRC the fix was to use path-style, but now with the deprecation it'd need a full migration.
I wonder how CloudFront works around it. Maybe it special-cases it and uses the S3 protocol instead of HTTP/S.
FWIW, I found it fairly trivial to set up CloudFront in front of my buckets [1], so that I can use HTTPS with AWS Certificate Manager (ACM) to serve our S3 sites on https://mydomain.com [2].
I set this up some time ago using our domain name and ACM, and I don't think I will need to change anything in light of this announcement.
1 - https://docs.aws.amazon.com/AmazonS3/latest/dev/website-host...
2 - https://docs.aws.amazon.com/acm/latest/userguide/acm-overvie...
Isn't that domain-name-style bucket naming only for hosting a static website from an S3 bucket? Otherwise, you can name the bucket whatever you want within the rest of the naming rules.
The point of that is solely for doing website hosting with S3, though, where you'll have a CNAME. Why would you name a bucket that way if you're not using it for the website hosting feature?
Does anyone have insight on why they're making this change? All they say in this post is "In our effort to continuously improve customer experience". From my point of view as a customer, I don't really see an experiential difference between a subdomain style and a path style - one's a ".", the other's a "/" - but I imagine there's a good reason for the change.
First to allow them to shard more effectively. With different subdomains, they can route requests to various different servers with DNS.
Second, it allows them to route you directly to the correct region the bucket lives in, rather than having to accept you in any region and re-route.
Third, to ensure proper separation between websites by making sure their origins are separate. This is less AWS's direct concern and more of a best practice, but doesn't hurt.
I'd say #2 is probably the key reason and perhaps #1 to a lesser extent. Actively costs them money to have to proxy the traffic along.
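The two addressing styles discussed above can be sketched as follows — the `s3.<region>` endpoint format, bucket, and key here are illustrative placeholders:

```python
def s3_url(bucket: str, key: str, region: str, virtual_hosted: bool = True) -> str:
    """Build an S3 object URL in either addressing style.

    Virtual-hosted style puts the bucket in the hostname, so plain DNS can
    steer the request straight to the bucket's region; path-style hits a
    shared endpoint that must accept the request and route it onward.
    """
    if virtual_hosted:
        return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"
    return f"https://s3.{region}.amazonaws.com/{bucket}/{key}"

print(s3_url("my-bucket", "index.html", "eu-west-1"))
# https://my-bucket.s3.eu-west-1.amazonaws.com/index.html
```

With the bucket in the hostname, a resolver can hand back an IP in the bucket's own region before a single byte of HTTP is sent.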
Currently all buckets share a domain and therefore share cookies. I've seen attacks (search for "cookie bomb" + "fallback manifest") that leverage shared cookies to allow an attacker to exfiltrate data from other buckets.
The only obvious thing that occurs to me is that bringing the bucket into the domain name puts it under the same-origin policy in the browser security model. Perhaps a significant number of people are hosting their buckets in a way that compromises security like this? Not something I have heard of, but it seems possible. It makes me wonder if they are specifically not mentioning it because this is the real reason: they know there are vulnerable applications in the wild and don't want to draw attention to it.
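The same-origin point can be made concrete; the bucket names below are hypothetical:

```python
from urllib.parse import urlsplit

def origin(url: str) -> str:
    # A browser origin is (scheme, host, port); the path is ignored.
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"

# Path-style: every bucket shares one origin, so cookies and
# same-origin privileges are shared across all of them.
print(origin("https://s3.amazonaws.com/bucket-a/app.js") ==
      origin("https://s3.amazonaws.com/bucket-b/data.json"))   # True

# Virtual-hosted style: each bucket is its own origin, so the browser isolates it.
print(origin("https://bucket-a.s3.amazonaws.com/app.js") ==
      origin("https://bucket-b.s3.amazonaws.com/data.json"))   # False
```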
Probably because they want to improve response time with a more precise DNS answer.
With s3.amazonaws.com, they need a proxy near you that downloads the content from the real region. With yourbucket.s3.amazonaws.com, they can give you the IP of an edge in the same region as your bucket.
Does the "you are no longer logged in" screen not infuriate anyone besides me? There doesn't seem to be any purpose to it: it just redirects you to the landing page when you were trying to access a forum post that doesn't even require you to be logged in.
Absolutely mind-boggling that, with as much as they pay people, they do something so stupid and haven't changed it after so long.
I wonder how they’ll handle capitalized bucket names. This seems like it will break that.
S3 has been around a long time, and they made some decisions early on that they realised wouldn't scale, so they reversed them. This v1-vs-v2 URL thing is one of them.
Another was letting you have "BucketName" and "bucketname" as two distinct buckets. You can't name them like that today, but you could at first, and those buckets still work (and are in conflict under v2 naming).
Amazon's own docs explain that you still need to use the old v1 scheme for capitalized names, as well as names containing certain special characters.
It'd be a shame if they just tossed all those old buckets in the trash by leaving them inaccessible.
All in, this seems like another silly, unnecessary deprecation of an API that was working perfectly well. A trend I'm noticing more often these days.
Shame.
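A rough sketch of the naming constraint behind this: virtual-hosted style requires a DNS-compatible bucket name. The regex here is an approximation of the documented rules, not the authoritative check:

```python
import re

def virtual_host_compatible(bucket: str) -> bool:
    # Approximation: 3-63 chars of lowercase letters, digits, and hyphens,
    # starting and ending alphanumeric. No uppercase, no underscores, and
    # no periods (periods also break the wildcard TLS certificate).
    return re.fullmatch(r"[a-z0-9](?:[a-z0-9-]{1,61})[a-z0-9]", bucket) is not None

print(virtual_host_compatible("bucketname"))   # True
print(virtual_host_compatible("BucketName"))   # False: legacy name, path-style only
print(virtual_host_compatible("my.bucket"))    # False: periods defeat the wildcard cert
```

Legacy buckets like "BucketName" fail this check, which is exactly why they still depend on the path-style scheme being deprecated here.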
One of the weird peculiarities of path-style API requests was that CORS headers meant essentially nothing for any bucket. I wrote a post about this a while ago [0].
I guess after this change, the CORS configuration will finally do something!
On the flip side, anyone who wants to list buckets entirely from the client-side JavaScript SDK won't be able to anymore, unless Amazon also modifies CORS headers on the API endpoint after disabling path-style requests.
[0]: https://euank.com/2018/11/12/s3-cors-pfffff.html
Amazon is proud that they never break backwards compatibility like this, with quotes like "the container you are running on Fargate will keep running 10 years from now".
Something weird is going on if they don’t keep path style domains working for existing buckets.
I was already planning a move to GCP, but this certainly helps. Now that cloud is beating retail in earnings, the 'optimizations' come along with it. That, and BigQuery is an amazing tool.
It's not like I'm super outraged that they would change their API; the reasoning seems sound. It's just that if I have to touch S3 paths everywhere, I may as well move them elsewhere to gain some synergies with GCP services. I would think twice if I were heavily invested in IAM roles and S3 Lambda triggers, but that isn't the case.
This is most likely to help mitigate abuse of the shared domain, which sidesteps the browser's same-origin protections. This is very common when dealing with malware, phishing, and errant JS files.
`In our effort to continuously improve customer experience` — what's the actual driver here? I don't see how going from two options to one, and forcing you to change if you're on the wrong one, improves my experience.
I see a problem when using the S3 libraries against other services that support the S3 API but only offer path-style access, like MinIO or Ceph with no subdomains enabled. It will break once their Java API removes the old code.
The AWS API is an inconsistent mess. If you don't believe me, try writing a script to tag resources: every resource type requires a different way to identify it, a different way to pass the tags, etc. You're pretty much required to write different code to handle each resource type.
AWS S3 will only provide SSL validation if your bucket name happens to not contain "."
Which is a practice encouraged by AWS. [1]
So anyone that has www.example.com as the bucket name can no longer use HTTPS.
[1] https://docs.aws.amazon.com/AmazonS3/latest/dev/website-host...
This could be just as disruptive.
Difficult to say that they will actually follow through, as the only mention of this date is in the random forum post I linked.