We used to ask people all the time if they would send us their sanitized logs so we could see the "reddit effect", or at least give us the aggregate data.
I've noticed that it really depends on which subreddit your link gets featured in.
I still get a lot of traffic from r/food, but it was minuscule compared to the traffic I got from a post in r/reddit.com (which no longer sends any traffic).
For a number of years, I worked for technology news sites. The Slashdot effect frequently brought sites down, particularly in the earlier years when /. seemed to have more influence (this was before HN, Stack Overflow, TC, and a number of other blogs) and load balancing, caching, and other infrastructure were not optimally configured on our end. Breaking news or particularly controversial topics could bring upwards of 50,000 uniques/hour. This was a big deal for us.
Global news sites can experience far bigger surges when a major story breaks. Around 10 years ago CNN started switching out its regular home page for a version that stripped out video and most images when major stories broke. Not sure if CDNs and better Internet video technologies have reduced the need for this ...
I created the account yesterday simply to share the story that Slashdot actually picked up. Then I emailed info@ here to try to get 'CmdrTaco' back. It sucks that someone is squatting my nick here!
I'm not sure how it compares to Slashdot, but I have had some of my blog posts on thingist at the top of HN and reddit/r/programming, tweeted by lots of my heroes, and featured on adafruit (which was really cool, since I really look up to Limor Fried).
That is one of the coolest feelings in the entire world for a little nerd like myself. Usually
tail -f /var/log/apache2/access.log
is running on one of my monitors all the time, so I can pretty much watch visitors as they come into the site (I don't even see the code, I just see blonde, brunette, redhead). When I've gotten that sort of traffic, I can't make sense of it anymore. It's the difference between taking a drink from a fountain and sticking your face into a geyser.
Not only that, but to then watch your code stand up to the traffic is cool :).
Like a webdev version of your first kiss, I suppose :)
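If you want the same live view but summarized, the raw tail can be piped through awk. Here's a small sketch using a synthetic log so it runs anywhere; point LOG at /var/log/apache2/access.log for the real thing:

```shell
#!/bin/sh
# Summarize an access log by client IP (top talkers first).
# Synthetic sample data; substitute your real Apache access.log path.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
10.0.0.1 - - [01/Jun/2011:12:00:01 +0000] "GET / HTTP/1.1" 200 5120
10.0.0.2 - - [01/Jun/2011:12:00:02 +0000] "GET / HTTP/1.1" 200 5120
10.0.0.1 - - [01/Jun/2011:12:00:03 +0000] "GET /feed HTTP/1.1" 200 900
EOF
top="$(awk '{count[$1]++} END {for (ip in count) print count[ip], ip}' "$LOG" | sort -rn)"
echo "$top"
rm -f "$LOG"
```

For the geyser moments, `tail -f access.log | awk '{print $1}'` fed into the same counting pipeline gives a rough live version of the same thing.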
It's a bit surreal. I once posted a picture on my petty little website, and dropped a single link somewhere. The link ended up on DrudgeReport.com. Soon I couldn't even FTP into the site for some vital maintenance[1]. I had to call the ISP:
"Hello, Foobar ISP, can I help you?"
"Hi, I'm the reason your servers are melting down right now."
"Ah. Let me connect you to the company president..."
I told him the whole story; he got a kick out of it, and the file that was attracting the traffic got deleted.
([1] AP took issue with what I thought was "fair use" of the Elián González pictures the day they were published. No argument from me; I didn't expect that concocting them into an animated time-lapse sequence would get _that_ popular.)
Is anyone else surprised the EC2 Micro did so well?
If you read the EC2 forums[1] for any length of time, you get used to seeing post after post about "hung sites" running on Micros when a constant level of demand is placed on them (not the intended usage model[2]).
Under load the Micros frequently go into a catatonic state with %st ('top') climbing to the moon as the VM environment provisions the brunt of the server's resources to the other VMs on the machine and starves out any hungry Micro instances (as designed).
I didn't think you could host anything on them with regularity. Maybe a swarm of them behind an ELB, but not a single one... has anyone else had the same experience that Malda had?
Last time I tried the Micros, they were mostly starving on I/O.
Quite possibly his webserver wasn't doing much of that?
Also "tens of thousands" of requests is not really a big number anymore. Nginx will happily serve a couple thousand per second even from a micro (until it gets choked by the EC2 resource beancounting).
On Micros, if the load is CPU-intensive, the vCPU gets throttled pretty quickly under sustained loads (say, over 10-15 seconds) and your steal-time CPU percentage goes up: the hypervisor will give someone else real CPU cycles before coming back to serve you. So they are good for low-traffic sites with bursty loads, but not for sustained traffic.
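You can watch that throttling happen from inside the guest. A rough sketch that samples the steal counter in /proc/stat (Linux-only; the steal field is the 8th value on the aggregate cpu line):

```shell
#!/bin/sh
# Sample CPU steal time (%st) over one second from /proc/stat.
# On a throttled Micro under load, this number climbs sharply.
read_steal() { awk '/^cpu /{print $9; exit}' /proc/stat; }
read_total() { awk '/^cpu /{t = 0; for (i = 2; i <= NF; i++) t += $i; print t; exit}' /proc/stat; }
s1=$(read_steal); t1=$(read_total)
sleep 1
s2=$(read_steal); t2=$(read_total)
# Fraction of elapsed CPU time the hypervisor spent serving other guests.
awk -v s=$((s2 - s1)) -v t=$((t2 - t1)) 'BEGIN { printf "steal: %.1f%%\n", t ? 100 * s / t : 0 }'
```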
I understand that Amazon's S3 CDN is pay-for-what-you-use. If anyone here has had a traffic spike using Amazon's CDN, what amount did you owe Mr. Bezos? I would hate to have to tell my wife that we are buying groceries with the credit card for a while because my little website was on Slashdot.
Technically, S3 is not Amazon's CDN (CloudFront is); it just behaves like one, and a lot of people use it as one.
For a localized readership, it is fine since all the data will be coming out of a single regional location. If you have global readership and need locality, that is where CloudFront comes in handy with 20 POPs around the globe that data is distributed to.
Serving off of S3 is not that expensive, even if your site goes gangbusters.
Serving off of CloudFront can surprise you because of the sometimes unexpected number of origin pulls that can occur as your data is expired from the edge locations.
You are sharing cache space on each edge node with every other CloudFront user. If your content is red hot, it stays in the cache; but if it is low-volume (a relative term, measured against the other traffic coming out of the node), it gets expired much faster, sometimes within hours, so any future hit for it will pull (download) it again from S3 to CloudFront, then back out from CloudFront to the client.
The performance implications aren't horrific (though they can be for video), but with enough origin pulls re-downloading the same files over and over again from the edge locations, the cost can double what you expected to pay.
I have seen this catch a handful of people on the forums off guard over the last few years, to the tune of hundreds or even thousands of dollars, because they didn't realize this could happen... they just looked at the bandwidth rates on the site, multiplied by their payload sizes, and thought that was the fixed rate.
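The arithmetic behind those surprise bills is easy to sketch. All of the numbers below are hypothetical (2 GB of assets, 20 edge locations, 4 evictions per day, $0.12/GB, roughly a 2011-era rate), but they show how origin pulls alone can add hundreds of dollars a month:

```shell
#!/bin/sh
# Back-of-envelope estimate of CloudFront origin-pull cost for low-volume content.
est="$(awk 'BEGIN {
  assets_gb = 2; edges = 20; pulls_per_day = 4; rate = 0.12; days = 30
  origin_gb = assets_gb * edges * pulls_per_day * days   # S3 -> CloudFront re-downloads
  printf "origin-pull transfer: %d GB/month, ~$%.2f extra\n", origin_gb, origin_gb * rate
}')"
echo "$est"
```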
Use CloudFlare[0] with your site. They cache your static content for free and estimate a 60% bandwidth savings as a result. Apparently you can keep hosting your content on S3 and they'll cache that too.
Disclaimer: I haven't used them, but I've heard good things about the service.[1]
You can use the APIs to spin up more instances when traffic goes up (and then shut them down when you're not using them), but that doesn't happen automatically. Other than that, bandwidth costs are very high at Amazon compared to most places, but it's still only $120/TB. https://aws.amazon.com/ec2/pricing/
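Spelled out, that manual scale-out looks something like the sketch below. The AMI ID, instance IDs, and counts are all made up, and the live `aws ec2` calls are left commented so it runs without credentials:

```shell
#!/bin/sh
# Manual scale-out/scale-in sketch using the AWS CLI (nothing automatic here).
AMI=ami-12345678        # hypothetical image of your web server
TYPE=t1.micro
COUNT=3
# When the spike hits, launch extras behind your load balancer:
#   aws ec2 run-instances --image-id "$AMI" --instance-type "$TYPE" --count "$COUNT"
# When traffic subsides, stop paying for them:
#   aws ec2 terminate-instances --instance-ids i-aaaa1111 i-bbbb2222
msg="would launch $COUNT x $TYPE from $AMI"
echo "$msg"
```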
My ad blocking project was slashdotted years ago. I remember looking at the apache logs in analog and going "Holy shit." Not only was the traffic huge but it lasted a few days. I just assumed that it would die off at 5pm.
My cheap webhost managed to handle it pretty well. No issues. Then again it was static HTML.
I just read taco's site. He doesn't mention what CMS he is using, but it's WordPress. I wonder what cache plugin he's using. That's pretty important stuff, just as important as which EC2 instance type he is using.
I got slashdotted today (albeit by Slashdot Japan), and hit by Hacker News yesterday.
I will put up some stats later, but HN brought in 10,000 hits, Slashdot brought in about 600 people. I think the English vs. Japanese thing scared away quite a few people (ironic considering the content of the article..)
(As an odd note, I did panic after hearing about Slashdot but not after seeing people from HN - go figure)
It varies widely, I imagine; you can see it on Slashdot very obviously: some stories get 1k comments, others get 50. There is a HUGE variance in interest from story to story, and it's not always predictable. But when the planets align, I bet the traffic generated could easily be low six figures to a single site, and that over 3-6 hours.
zokier: Or, in other words: after 9/11 melted servers everywhere.
[1] https://forums.aws.amazon.com/thread.jspa?threadID=58323
[2] http://docs.amazonwebservices.com/AWSEC2/latest/UserGuide/in...
[0] https://www.cloudflare.com/
[1] http://news.ycombinator.com/item?id=2631019 http://news.ycombinator.com/item?id=2561341
thenextcorner: /. still relevant..!