
Root Domain Website Hosting for Amazon S3

254 points | jeffbarr | 13 years ago | aws.typepad.com

83 comments

[+] mmastrac|13 years ago|reply
This is great news. Back when we were building DotSpots (now defunct, sadly), our website was built entirely in GWT. We weren't just serving a static site - we had the full functionality being run in GWT-compiled code. This meant that the majority of the site itself could be hosted from a static domain, and only the APIs behind the scenes needed to be running on our EC2 fleet. At least that was the idea.

We managed to get the majority of the static images and JS running off CloudFront back then, but were always stuck serving html from our EC2 boxes (a combination of what was available from AWS at the time, and the previous one-day minimum timeout of CloudFront). We put a lot of work into optimizing them so that they'd be super lightweight and could be served quickly. It was pretty fast considering that we weren't able to put all of the static-like files into S3/CloudFront.

Now that you can host all the static bits of your domain through S3/CloudFront, I'd love to give this another shot with a future project. With strongly-named resources (i.e. resource name == MD5) being served with one-year expiration times from Amazon's infrastructure, you could build a blazing fast dynamic site without having to spin up a single EC2 box.
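The strongly-named-resource scheme described here can be sketched in a few lines. This is a hypothetical helper, not DotSpots code; the `assets/` prefix is an assumption:

```python
import hashlib

def strong_name(path):
    """Derive a content-addressed key: the MD5 of the file's bytes
    becomes its name, so the content behind a given URL can never
    change and a one-year Cache-Control max-age is safe."""
    with open(path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    ext = path.rsplit(".", 1)[-1]
    return f"assets/{digest}.{ext}"
```

A deploy step would upload each file under its strong name and rewrite references in the HTML to match; only the (tiny) HTML entry point needs a short expiry.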

Exciting!

(My original comment was on the dupe here... http://news.ycombinator.com/item?id=4976475)

[+] twistedpair|13 years ago|reply
One of the many reasons I <3 GWT. Why run all the logic on the server if it can live in the client? Currently working on the same with www.runpartner.com from S3/EC2/CF.
[+] saurik|13 years ago|reply
Why do people host static websites on S3 at all? It really isn't designed for that: it is an object store. Yes: it has a URL structure accessible in a way that makes it look like static hosting, and Amazon caved pretty early to people wanting to use it that way by adding features to make it more reasonable, but it doesn't fix the underlying problem.

Specifically, it is both allowed to--and often does--return 50x errors to requests. The documentation for the S3 API states you should immediately retry; that's fine for an API where I can code that logic into my client library, but is simply an unacceptable solution on the web. Maybe there are one or two exceptions to this, but I have simply never seen a web browser retry these requests: the result is you just get broken images, broken stylesheets, or even entire broken pages. Back when Twitter used to serve user avatar pictures directly from S3 the issue was downright endemic (as you'd often load a page with 30 small images to request from S3, so every few pages you'd come across a dud).
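The retry behavior the S3 docs expect from API clients, and which browsers loading an image tag will not do, might look like this. A hedged sketch using only the standard library; the function name and backoff schedule are assumptions:

```python
import time
import urllib.request
import urllib.error

def get_with_retry(url, attempts=4):
    """Fetch a URL, retrying transient 5xx errors with exponential
    backoff -- the client-side fix that works for an API library
    but is unavailable for a plain <img> or <link> tag."""
    for i in range(attempts):
        try:
            with urllib.request.urlopen(url) as resp:
                return resp.read()
        except urllib.error.HTTPError as e:
            if e.code < 500 or i == attempts - 1:
                raise          # client errors and final failures propagate
            time.sleep(2 ** i * 0.1)   # 0.1s, 0.2s, 0.4s, ...
```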

Sure, it only happens to some small percentage of requests, but for a popular website that can be a lot of people (and even for an unpopular one, every user counts), and it is an error rate orders of magnitude higher than I've experienced with my own hosting on EC2. It is also irritating because it is random: when my own hosting fails, it fails in its entirety; I don't have some tiny fraction of requests from users all over the world failing.

Regardless, I actually have an administrative need to stop hosting a specific x.com to www.x.com redirect on some non-AWS hosting I have (the DNS is hosted by Route53, etc., but I was left with a dinky HTTP server in Kentucky somewhere handling the 301), and I figured "well, if it doesn't have to actually request through to an underlying storage system, maybe I won't run into problems; I mean, how hard is it to take a URL and just immediately return a 301?", but after just a few minutes of playing with it I managed to get a test request that was supposed to return a 301 returning a 500 error instead. :(

    HTTP/1.1 500 Internal Server Error
    x-amz-request-id: 1A631406498520D6
    x-amz-id-2: hXQ1YXyu0gxaiGITKvcB+P8+tgPsP3UITX/Or4emyjZtaL16ULAyHFx2ROT4QPXY
    Content-Type: text/html; charset=utf-8
    Content-Length: 354
    Date: Fri, 28 Dec 2012 07:19:24 GMT
    Connection: close
    Server: AmazonS3
This wasn't just a one-time problem either: I've set up a loop requesting random files (based on the timestamp of the test run and a sequence number from the test) off this pure-redirect bucket and left it running for a few minutes, and some of the S3 nodes I'm talking to (72.21.194.13 being a great example) are just downright unreliable, often returning 500 errors in small clumps (that one node is giving me a 2% failure rate!!). S3 is simply not an appropriate mechanism for site hosting, and it is a shame that Amazon is encouraging people to misuse it in this fashion.
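A probe loop along these lines might look like the following. A sketch only, not the actual test script; `base_url` stands in for a website-enabled bucket endpoint:

```python
import time
import urllib.request
import urllib.error
from collections import Counter

def probe(base_url, n=200):
    """Request n never-before-seen keys against an endpoint and tally
    the status codes seen, to estimate its random-error rate."""
    tally = Counter()
    run_id = int(time.time())       # unique per run, as described above
    for seq in range(n):
        url = f"{base_url}/probe-{run_id}-{seq}"
        try:
            with urllib.request.urlopen(url) as resp:
                tally[resp.status] += 1
        except urllib.error.HTTPError as e:
            tally[e.code] += 1      # 404/500 etc. arrive here
    return tally
```

For a pure-redirect bucket every request should end in the redirect target's status; any 500s in the tally are the failures being described.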

(edit: Great, and now someone downvoted me: would you like more evidence that this is a problem?)

[+] mixonic|13 years ago|reply
I've been messing with S3 for a new project involving the HTML5 canvas, so lots of CORS and canvas security concerns, PUTing objects from the browser, and a desire for low-latency changes.

S3 has not been delivering. Here's a few reasons:

* S3 only provides read-after-write consistency in regions other than US Standard: http://aws.amazon.com/s3/faqs/#What_data_consistency_model_d... Since moving to US-West-1, we've had noticeably more latency. Working without read-after-write just isn't an option: users get old data for the first few seconds after data is pushed.

* CORS support is basically broken. S3 doesn't return the proper headers for browsers to understand how objects should be cached: https://forums.aws.amazon.com/thread.jspa?threadID=112772

* Oh, and the editor for CORS data introduces newlines into your config around the AllowedHost that BREAK the configuration. So you need to manually delete them when you make a change. Don't forget!

* 304 responses strip out cache headers: https://forums.aws.amazon.com/thread.jspa?threadID=104930... Not breaking spec right now, but quite non-standard.

* I swear, I get 403s and other errors at a higher rate than I have from any custom store in the past. But this is purely subjective.
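One client-side workaround for the read-after-write point above is to poll until a freshly written object becomes visible. A hypothetical sketch; `fetch` stands in for a real HEAD request against the store:

```python
import time

def wait_until_visible(fetch, expected_etag, timeout=10.0):
    """Poll an eventually-consistent store until a just-written object
    shows up (matched by ETag), or give up after `timeout` seconds.
    `fetch` is any callable returning a (status, etag) pair."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status, etag = fetch()
        if status == 200 and etag == expected_etag:
            return True
        time.sleep(0.25)      # brief pause between polls
    return False
```

This papers over the "old data for the first few seconds" window at the cost of latency, which is exactly the trade-off being complained about.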

Based on all this, I really have to agree with saurik that the folks at S3 aren't taking their role as an HTTP API seriously enough. They built an API on HTTP, but not an API that browsers can successfully work with. Things are broken in very tricky ways, and I'd caution anybody working with S3 on the front-end of their application to consider the alternatives.

I'm moving some things to Google Cloud Storage right now, and it is blazing fast, supports CORS properly, and has read-after-write consistency for the whole service. Rackspace is going to get back to me, but I expect they could do the same (and they have real support).

[+] zzzeek|13 years ago|reply
I host very small-time sites for a few family members on S3 because it is practically free (pennies per month) and there's more or less zero chance some script kiddie will break in and deface it, as was the case when they were going the traditional "php/wordpress on godaddy" route. EC2 is great, but for hosting tiny non-money-making sites it's far more expensive and maintenance-heavy: a micro comes out to $14 a month and a small to $46 a month. For a site that gets hit a few hundred times a week tops, you're just paying for tons of idle time. A very rare 500 error (I've never seen one myself) is not an issue in this case.
[+] jeffbarr|13 years ago|reply
We investigated your report of issues with requests, and found that one S3 host was behaving incorrectly. We identified the root cause and deployed a fix. Can you verify that we have fixed your issue?
[+] xingquan|13 years ago|reply
Hello,

Have you tried setting a redirection rule on your bucket so that when the 500 error occurs, S3 will automatically retry the request? You can set a redirection rule in the S3 console, and I think the following rule might work:

    <RoutingRules>
      <RoutingRule>
        <Condition>
          <HttpErrorCodeReturnedEquals>500</HttpErrorCodeReturnedEquals>
        </Condition>
        <Redirect>
          <HttpRedirectCode>301</HttpRedirectCode>
        </Redirect>
      </RoutingRule>
    </RoutingRules>

This will redirect all 500s back to the same location, effectively retrying the request. This should cover the random 500 case, but I'm not sure it will work 100% of the time.
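For comparison, the same rule can be expressed as data for a modern SDK. A sketch only: boto3 postdates this thread, and the bucket name and index document are assumptions; the console XML above is the period-accurate way to set it:

```python
# Website configuration carrying the retry-via-301 routing rule.
retry_500_config = {
    "IndexDocument": {"Suffix": "index.html"},
    "RoutingRules": [
        {
            "Condition": {"HttpErrorCodeReturnedEquals": "500"},
            # No hostname or key replacement: the 301 points back at
            # the same URL, so the browser re-issues the request once.
            "Redirect": {"HttpRedirectCode": "301"},
        }
    ],
}

# Applying it would look like (requires credentials, so left commented):
# import boto3
# boto3.client("s3").put_bucket_website(
#     Bucket="example-bucket", WebsiteConfiguration=retry_500_config)
```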

[+] notaddicted|13 years ago|reply
It is very easy to set up S3 to provide the files to CloudFront ... have you seen any issues with that?
[+] mixedbit|13 years ago|reply
I'm in the middle of creating a web application, and my plan was to serve static files from S3. Based on your post, it seems like a really bad idea. If the problems are so apparent, I wonder why this is such a generally accepted and recommended approach. One example: the Heroku guide that praises putting static files on S3: https://devcenter.heroku.com/articles/s3
[+] ubojan|13 years ago|reply
Because it's easy? I have 100 static websites on S3. After the initial setup of buckets, it's trivial to update/sync all of these sites using a command-line tool (I use S3 Sync) with one click on a batch file. And hosting on S3 is cheap.
[+] pixie_|13 years ago|reply
This is pretty interesting, I'd like to hear more about it. I'd also like Amazon to hear more about it because maybe they could treat web buckets differently or something.
[+] stevencorona|13 years ago|reply
I serve over 4 billion images out of S3 and don't have any issues with 50x errors.
[+] donretag|13 years ago|reply
Now if only Google App Engine would support the same.

The number of features that AWS releases is astonishing given their size. Great work.

[+] toomuchtodo|13 years ago|reply
Great work guys! Now can we work on functionality to prepay for AWS services? ;)

https://forums.aws.amazon.com/thread.jspa?threadID=51931

[+] nikcub|13 years ago|reply
Prepay and keeping user balances puts you in an entirely new compliance bracket that is much more complicated than charging a credit card and shipping a product. You get regulated almost like a bank.

I'm not surprised that they aren't bothering with all that yet.

[+] 1qaz2wsx3edc|13 years ago|reply
Considering this is off topic and a two-year-old request, I'd say drop it. Also, I see little reason for this functionality; what is a use case that cannot be worked around?
[+] jvoorhis|13 years ago|reply
This is great news. Now I'm looking forward to root domain support for CloudFront.
[+] nodesocket|13 years ago|reply
Instead of using www.mydomain.com as the primary and 301 redirecting requests from mydomain.com, is it possible to do this the other way around? I want mydomain.com as the primary and 301 redirect www.mydomain.com.
[+] ceejayoz|13 years ago|reply
I just set it up with non-www as the primary, works like a charm.
[+] gregr|13 years ago|reply
Interesting. Root-level CNAMEs go against the RFC spec, yet it seems this solution is technically on par. Looking at the linked site, users mark the root as an A record and then select a new option in Route 53 called 'Alias'. Since many DNS providers enforce actual A records, it sounds like Route 53 will often be required for this feature (not a show stopper for most, just checking).

FTA: "In the Amazon Route 53 Management Console, create two records for your domain. Create an A (alias) record in the domain's DNS hosted zone, mark it as an Alias, then choose the value that corresponds to your root domain name.

Create a CNAME record and set the value to the S3 website endpoint for the first bucket."

[+] isb|13 years ago|reply
The Route 53 alias feature is RFC compliant - we will return an A record for an apex alias. Aliases can be used at non-apex names too, to avoid following CNAME chains. So, for example, the configuration in the docs can be improved by having www.example.com be an alias to example.com instead of a CNAME.
[+] saurik|13 years ago|reply
These 301 redirects seem to be billed as "WebsiteGetObject,Requests-Tier2", which I believe is $0.01 per 10,000 (not including bandwidth, and with no decrease in price at large scale). I mean, it's nice that it's hosted in the cloud and all, but billing at the same rate as a much more complex S3 GET request (or "GetObject,Requests-Tier2") seems much too expensive, unless you have only a small number of requests; if you have any other server on which you can host the 301 redirect, you can do so at a marginal cost near 0, even at a load of many millions of redirects per day.
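The arithmetic behind that pricing is easy to check. The per-10,000 rate is the figure quoted above; the traffic volume is hypothetical:

```python
def redirect_cost_usd(requests, price_per_10k=0.01):
    """Request charges only (no bandwidth), at the Tier-2 rate quoted
    above: $0.01 per 10,000 requests, with no volume discount."""
    return requests / 10_000 * price_per_10k

# Ten million redirects a day for a 30-day month:
monthly = redirect_cost_usd(10_000_000 * 30)   # roughly $300
```

Nontrivial money for a service whose entire job is emitting a one-line 301, which is the point being made.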
[+] aneth4|13 years ago|reply
You can also use a service like dnsimple that offers more generalized aliases to any domain. As I understand it, you can only alias to AWS services with Route 53, but that may have changed.

I've been using Dnsimple for quite a while to host root domains on heroku.

[+] donavanm|13 years ago|reply
Sad face indeed. I wish route 53 had generic "alias" support like other providers. IIRC Route 53 can ALIAS to 1) your own hosted zone 2) an ELB instance 3) an S3 website bucket.
[+] philip1209|13 years ago|reply
You can do similar hosting on the Rackspace/Akamai CDN:

http://www.rackspace.com/blog/running-jekyll-on-rackspace-cl...

That's how the Obama fundraising website was hosted

[+] arikrak|13 years ago|reply
I have sites on Amazon S3 and OpenShift, but my domain name registrar (1&1) didn't support redirection, so I pointed it to Cloudflare and created a rule there to redirect from the root to 'www'. This looks simpler though.
[+] roozbeh18|13 years ago|reply
Back when Google Apps for Business was free, they had an option under Domains to redirect your naked domain to www. Then all you had to do was point your A record at Google's DNS and it would forward to www.
[+] mleonhard|13 years ago|reply
If you want hosted root domain redirects without switching to Route 53, you can use https://www.RootRedirect.com/ . I run it on multiple EC2 instances and EIPs. Notable customers include Pizza Hut Australia and Airport Nürnberg. My 301 responses are cached on the client so subsequent redirects are instantaneous.
[+] magoon|13 years ago|reply
I host a web site for my small community on S3 because it's nearly free to do so and, for such a small site with no server-side programming, it performs better and is more fault tolerant than a cheap VPS.
[+] api|13 years ago|reply
Combined with the ability to set headers to permit AJAX requests to another site, it would be possible to host a webapp on S3 and have the "live" portion be a JSON API only.
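The cross-site arrangement described here hinges on the "live" JSON API returning the right CORS headers for the S3-hosted origin. A minimal sketch; the origin names and function are assumptions, not any particular framework's API:

```python
def cors_headers(origin, allowed=("http://www.example.com",)):
    """Headers a JSON API would attach so a static app hosted on S3
    (at an assumed origin) may call it cross-site via AJAX."""
    if origin not in allowed:
        return {}                 # unknown origins get no CORS grant
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST",
        "Access-Control-Max-Age": "3600",   # cache the preflight
    }
```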
[+] akulbe|13 years ago|reply
Given Amazon's recent outages (and the fact that they happen a lot more frequently)... hearing about this new service offering doesn't instill a lot of confidence.
[+] ceejayoz|13 years ago|reply
Essentially all outages in the last few years have been EBS related.

S3 doesn't have an EBS dependency, and has been pretty rock-solid for half a decade now.

[+] weakwire|13 years ago|reply
Time for heroku to implement root domain redirection!
[+] asc76|13 years ago|reply
AWS just keeps on getting better and better, although this particular release isn't that exciting or press-worthy.
[+] yosun|13 years ago|reply
i guess amazon no longer needs the naturally free publicity from having 90% of CDN content URLs being of the format xxx.s3.amazonaws.com ... you can now have CDN URLs that look like xxx.yourdomain.com :O ... AMAZING!!!

well, let's hope adding domain hosting to amazon doesn't add to further downtime.

[+] Inufu|13 years ago|reply
You could always have that, but now you can use just "yourdomain.com" instead of a subdomain.
[+] yosun|13 years ago|reply
sorry for sarc over-simplifying the new offering. :(
[+] mdrussell|13 years ago|reply
So now all your root domains can suffer extended downtime on a monthly basis too? :)
[+] zwily|13 years ago|reply
What are you talking about? The last big S3 outage was like 5 years ago.