Cause of YC/HN outage discovered

[+] pg|16 years ago|reply

Things should be back to normal now. Trevor just moved www.ycombinator.com to Slicehost, and I just told HN to refer to static stuff there instead of serving it locally.

Interesting how easy it is to move your whole web site. Good customer service is important when users can switch so easily.

[+] carbocation|16 years ago|reply

Any particular reason for Slicehost as opposed to any of the other similar options (Linode, prgmr)? I'm not affiliated with any, though I run my site on VPSes from Linode. I was convinced by a Dec 2009 post by an HNer, uggedal: http://journal.uggedal.com/vps-performance-comparison

[+] othello|16 years ago|reply

Thank you! Procrastination becomes much harder when HN does not provide a continuous flow of interesting distraction.

[+] axod|16 years ago|reply

>> "Things should be back to normal now."

Hopefully that's the normal of months ago when loading comments/threads etc was quick?

Still taking 10-20 seconds to load most pages. One thing I haven't tried yet is just creating a fresh account - maybe mine just has too much associated with it (Maybe I comment too much etc).

edit: ah I see it only affected static content. Shame :/

[+] projectileboy|16 years ago|reply

Heh heh... Totally clueless companies are so cute sometimes; like watching a toddler try to eat chocolate cake.

[+] sh1mmer|16 years ago|reply

Dear Pair Networks,

I'd like to highlight this lesson in how to lose (or not gain) customers by randomly shutting down technology sites that serve the decision makers you wish to influence.

Tom

P.S. Well done Slicehost.

[+] gojomo|16 years ago|reply

150K hits in 30 minutes is 83 hits/second. That's a lot to ask for from a shared-hosting account.
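For anyone who wants to check that figure, here is a quick sketch of the arithmetic in Python. The variable names are only illustrative; the two input numbers are the ones quoted above.

```python
# Rate implied by the figures quoted above: 150K hits over 30 minutes.
hits = 150_000
window_seconds = 30 * 60  # 30 minutes = 1,800 seconds

hits_per_second = hits / window_seconds
print(f"{hits_per_second:.1f} hits/second")  # prints "83.3 hits/second"
```

This is an average over the whole window; the source gives no information about how bursty the traffic was within it.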

When the traffic starts to impair neighboring sites, something has to be done. Just about any ISP will do the same thing: block the site with the surge, that could possibly make other arrangements, rather than inconvenience other customers whose traffic is as expected/usual.

The detail missing so far is why Pair noticed today, if it was the same level of traffic as before, or a slow build. Was a new threshold crossed? (Did someone's HN-focused tool go haywire?)

The Pair message suggests end-of-day logs will be the way to tell for sure.

[+] noonespecial|16 years ago|reply

Customers nothing. "I didn't know what that 'hacker news' thing was or if it should have that much traffic so I shut it off". Really?

Please turn in your geek card at the door.

[+] axod|16 years ago|reply

FWIW, it's probably not worth their effort to work out exactly which websites are using cheap shared hosting and overusing resources.

I don't get shared hosting at all. VPSs are dirt cheap these days.

[+] patio11|16 years ago|reply

I typically don't tell other people how to run their businesses, but if a similar issue brought my website down and I were to post about the causes, I might focus more on my failures in capacity planning, vendor selection, and monitoring rather than on my vendor's lackluster customer service. User-visible failures are, ultimately, process failures on my part, regardless of the surface cause. A nice side effect of this philosophy is that improvements to my processes help with all sorts of surface causes whereas if I were to address surface causes individually it would be like playing whack-a-mole. Bad vendor whack, hard drive failed whack, traffic spike whack, poor customer service whack, out of memory exception whack whack whack -- why is everyone conspiring to keep me from getting any work done.

[+] pg|16 years ago|reply

It didn't bring HN down. HN deliberately didn't rely on that server for anything except hosting static content that was also duplicated on this server. I planned in advance for the possibility that the other server wouldn't be usable, by writing the code so that I could switch to serving the same content off news by changing one variable, which I did. As a result service was barely affected.

In short, Pair flaked, but we had in fact planned the system in a way that protected us against it.
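The "one variable" fallback described above might look roughly like the following. This is a hypothetical sketch in Python, not HN's actual code; every name and URL in it (`USE_EXTERNAL_STATIC`, `static_url`, `static.example.com`) is an assumption made up for illustration.

```python
# Hypothetical sketch, not HN's actual code. All names and URLs are
# illustrative assumptions.

# The single switch: True links static files on the external host,
# False serves the duplicated copies from the application server itself.
USE_EXTERNAL_STATIC = False

EXTERNAL_STATIC_BASE = "http://static.example.com"  # placeholder host
LOCAL_STATIC_BASE = ""  # empty base yields a path on the same server


def static_url(filename: str, use_external: bool = USE_EXTERNAL_STATIC) -> str:
    """Build the URL for a static asset from the single switch."""
    base = EXTERNAL_STATIC_BASE if use_external else LOCAL_STATIC_BASE
    return f"{base}/{filename.lstrip('/')}"
```

The idea is that every template builds asset links through one helper, so a failed static host can be bypassed by flipping a single value, provided the same files already exist on both servers.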

[+] davidw|16 years ago|reply

One obvious point is that HN is a peripheral part of YC's business, mostly aimed at recruiting, rather than something that's critical.

[+] unknown|16 years ago|reply

[deleted]

[+] swombat|16 years ago|reply

Shared hosting is generally sucky for anything remotely successful. I'm amazed you got away with storing your static assets there so long! When I got 100k hits in a day on my first blog, Dreamhost promptly shut it down without warning (in the middle of a slashdotting!)

[+] PostOnce|16 years ago|reply

Dreamhost? Their main advertising point is UNLIMITED TRAFFIC!!! I guess I'll scratch them off the list of potential hosts.

[+] DrewHintz|16 years ago|reply

Dreamhost also repeatedly shut down my sites, such as mapwow.com, due to too much traffic. One time they even denied they shut it down -- even though my web directory had been chowned to root, which requires root privilege.

I moved to slicehost and appengine[disclaimer,etc] and have been relatively happy with them.

[+] mikestanley|16 years ago|reply

Kinda funny and sad that so many people here only see this as some sort of stupid or unfair action against HN, seemingly without even acknowledging that every single other customer on that shared server had as much right individually, and more right collectively, to not have their performance negatively impacted by HN.

Yeah, it sucks that one of our favorite tech news sites was impacted by this, but how impacted were all those other customers?

It is easy to make a smartaleck comment about how Pair was trying to upsell by doing this, which is preposterous. Pair is a well-respected provider with many more years providing good service at a fair price than HN has existed, and I'd be willing to bet will be around after HN has peaked and begins to move back to the traffic load that might make sense on a shared system.

But the fault here ultimately lies with the folks running HN who thought it was wise or appropriate to host any of its content on a shared server that likely cost them less money per month than most of us spend on soft drinks in a week.

[+] euccastro|16 years ago|reply

Strawman. Nobody's saying HN/YC should be allowed to overuse resources (then again, I don't know the terms they had agreed to). But not giving a warning is crappy customer service, no matter how many other folks do likewise, how many years Pair has been doing other stuff well, or however else you want to spin it. In this case, it also happens to be a big sales screwup.

[+] bshep|16 years ago|reply

"We just figured out what caused the problem. Apparently Pair Networks' procedure for requesting that users upgrade to a dedicated server is to shut down their site without warning...."

I think you should upgrade to a more "dedicated provider" rather than a "dedicated server"...

But seriously, not even a warning?

[+] eli|16 years ago|reply

I think this is pretty common. When Joyent shut down my shared account way back when, I discovered it in a comment in my httpd.conf

[+] bwb|16 years ago|reply

Guys, shared hosting is designed for the 99% of non-busy sites, say under 1,500 unique visitors a day. With the level of traffic HN is doing, I'm surprised Pair didn't warn you earlier. A site this busy needs a dedicated server or virtual server.

That is why shared hosting is cheap: you start with it, and once you are successful or starting to get slashdotted you buy something bigger that can scale.

[+] bwb|16 years ago|reply

Also, just going to point out here that Pair.com has had a great reputation in the hosting industry, and has for the last 10 years.

I've been in the industry for 10 years and worked for quite a number of hosting companies, not Pair though, and when you have 150 shared clients on a machine and 1 client is causing the problems, you do your best to deliver the warning before it gets out of hand, but it is very hard to do.

[+] jseliger|16 years ago|reply

I don't think anyone will disagree. What they will assert, however, is that shutting down a site with zero notice is a jerky thing to do. As the other commenters have said, the nicer/smarter thing to do would've been to send an e-mail that says "Hey, you're using too much bandwidth/whatever." The smartest thing of all probably would've been, "You're using too much. Want to buy more?"

[+] archon810|16 years ago|reply

This is almost as bad as the reason for my server's downtime recently: http://beerpla.net/wp-content/uploads/img_2958.JPG

My server literally had its plug pulled.

[+] nkassis|16 years ago|reply

Are we at the point where we have to deal with robotic janitors too?

Geez, and I thought that watching the janitors while they work was over the top, now I gotta watch the Roomba too :(

[+] samd|16 years ago|reply

I'm not sure what is more ridiculous: that they disabled the site without warning or that they were too lazy to look at their logs or do a simple Google search to find out that Hacker News is a real site with lots of regular traffic.

[+] VBprogrammer|16 years ago|reply

What I really loved is that they just demonstrated a fair degree of incompetence to a site used pretty much exclusively by people who are very good potential customers! This site is a wet-dream for their marketing department.

What they should have done is upgrade the website to a dedicated server for free and let that news hit the front page.

[+] a2tech|16 years ago|reply

That's some good customer service there, Lou. (Said in Chief Wiggum's voice)

Seriously, they couldn't just email the account holder?

[+] unknown|16 years ago|reply

[deleted]

[+] fletchowns|16 years ago|reply

This site was on a shared hosting box?

And of course they are going to disable it without warning if it's causing problems for all the other customers on that box...

[+] pg|16 years ago|reply

Not this site, www.

There was no sudden spike in traffic. If they'd bothered to check the logs they'd have found that the load, whatever it was, was no higher than it had been.

[+] anotherpaulg|16 years ago|reply

This makes me think of a little side project I've been chipping away at called InstaCDN.

It makes it easy to minify, combine, gzip and push your css, js and image assets into the Amazon Cloudfront CDN with far-future expiration headers. It also automagically detects background images referenced in your css, and puts them in the CDN. It rewrites the css to use the new CDN image urls.
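To illustrate the css-rewriting step described above, here is a minimal sketch in Python. It is not InstaCDN's implementation; the function name `rewrite_css_urls` and the CDN host in the usage note are assumptions, and it simplifies by treating every non-absolute path as relative to the site root.

```python
import re

# Matches url(...) references in CSS, with optional single or double quotes.
CSS_URL = re.compile(r"""url\(\s*(['"]?)([^'")]+)\1\s*\)""")


def rewrite_css_urls(css: str, cdn_base: str) -> str:
    """Point relative url(...) references in a stylesheet at a CDN base URL."""

    def replace(match: re.Match) -> str:
        path = match.group(2).strip()
        # Leave absolute URLs and data URIs untouched.
        if path.startswith(("http://", "https://", "//", "data:")):
            return match.group(0)
        return f"url({cdn_base.rstrip('/')}/{path.lstrip('/')})"

    return CSS_URL.sub(replace, css)
```

For example, `rewrite_css_urls("body { background: url('/img/bg.png'); }", "https://cdn.example.com")` would produce a rule pointing at `https://cdn.example.com/img/bg.png`. A real tool would also need to resolve paths relative to the stylesheet's own location and upload the referenced files.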

It's all done through a trivial REST API.

Would love some feedback, and to find out if/how it's breaking any of your complex css/js.

http://www.instacdn.com/

[+] papaf|16 years ago|reply

That looks really cool. I'll be in a position to give it a go in a few months. One thing that would make me hesitate though, is not knowing what the potential pricing would be when you move from non-free.

[+] kordless|16 years ago|reply

Wordwrap on your page is pretty nasty on the iPad.

[+] moe|16 years ago|reply

And the Darwin award, category marketing, goes to...

[+] eli|16 years ago|reply

And this is why I don't recommend shared hosting for anyone

[+] bwb|16 years ago|reply

Because everyone you know has one of the busiest sites on the net?

[+] CGamesPlay|16 years ago|reply

The real WTF here is why HN considered it appropriate to run a production site on a shared host. Shared hosting is a bad idea all around if you are even slightly concerned about reliability. You don't control the server, so any 60 year old woman running a photoblog with a vulnerable WordPress version can bring your site down just by getting hacked.

[+] RyanMcGreal|16 years ago|reply

The email sounds pretty generic and policyish, which leads me to suspect it was auto-generated rather than typed out by some hapless sysadmin who's never heard of HN.

[+] InclinedPlane|16 years ago|reply

There are too many grammar errors for it to be auto-generated (I hope).

[+] rrhyne|16 years ago|reply

Really, we are all guilty. If we'd just get back to work, HN wouldn't have this problem. ;)

[+] jqueryin|16 years ago|reply

I actually find it quite embarrassing that the guy had no clue what HN or YC were. He had absolutely no idea... and he works for a hosting company. It's quite a shame. I really got a kick out of the fact he questioned the legitimacy of traffic as well.

[+] ramchip|16 years ago|reply

I don't really see why every hosting company employee out there is supposed to know about a certain small American investment firm started in 2005 and its associated news aggregator... we're probably an order of magnitude or so less popular than Reddit, which is itself rather niche. It's a small dot in the web world.

[+] MikeCapone|16 years ago|reply

There are probably a lot more hosting company employees out there than there are regular (or even occasional) HN users.

[+] unknown|16 years ago|reply

[deleted]

[+] stuntgoat|16 years ago|reply

So your network traffic is guilty and sentenced until proven innocent.

102 comments