Got to love their last two updates. Spot the difference :).
UPDATE: Process management error rates remain high. Engineers are continuing to investigate and work towards a resolution.
FEB 25, 2012 – 20:27 UTC – 10 MINUTES AGO
UPDATE: Process management error rates remain high. Engineers are continuing to investigate and resolve the issue.
FEB 25, 2012 – 20:11 UTC – 26 MINUTES AGO
Yet another reminder that while services like Heroku are great, it's up to you to make sure you have the appropriate redundancy in place so your critical app doesn't go down. If you host anything critical on Heroku, you _need_ a backup server elsewhere (not EC2!)
But doesn't that defeat the whole point of Heroku? If your app is running redundantly elsewhere, that means that you've already done all the work to set up the stuff that Heroku normally provides. And if you've already done that work, why use Heroku at all?
Having this problem too. The thing that I'm worried about is that my site just keeps trying to load without any error message or 404 page appearing. Is there a way to get something to show up to inform users of the downtime?
(Or ideally, a way to point towards another instance of the site quickly. I'm worried the DNS wouldn't propagate fast enough)
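One way to approach the "just keeps trying to load" case is to watch the site from a machine that is not on Heroku and treat a hung request as a failure. Below is a minimal sketch, assuming Python 3; the names `is_up` and `should_fail_over` are made up for illustration and are not part of any Heroku tooling. It only decides *when* to switch. Showing a maintenance page or repointing DNS is a separate step, and a low TTL set in advance is what keeps resolvers from caching the old record for long.

```python
import urllib.request


def is_up(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return True only if `url` answers with a 2xx/3xx status within `timeout` seconds.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError (4xx/5xx), connection errors and timeouts are all
        # OSError subclasses, so a request that hangs ends up here as "down".
        return False


def should_fail_over(recent_checks, max_failures=3):
    """Fail over only after `max_failures` consecutive failed checks.

    `recent_checks` is a list of booleans, oldest first. Requiring several
    failures in a row avoids switching on a single blip.
    """
    if len(recent_checks) < max_failures:
        return False
    return not any(recent_checks[-max_failures:])
```

A monitor would call `is_up` on a schedule, append each result to a list, and trigger whatever failover step is available once `should_fail_over` returns True.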
Isn't Heroku based on EC2? They should be able to offer some type of redundant package for customers willing to pay for it. I mean, let's be serious, people WILL pay for it, assuming it works.
Heroku should know better than to host their status page on their own servers, since it becomes inaccessible when it is most needed. This problem started last night and they detected it this morning: monitoring system FAIL.
It appears to be hosted on Rackspace (right now at least). But it doesn't look automatic, so perhaps nobody was available to update it until this morning.
If some bits of Heroku seem to be up, but not others, it could be because those bits are running older versions of the infrastructure. Or are being used as test beds for newer versions of the infrastructure. Or are deliberately running on a separate platform so that, when the main Heroku infrastructure starts having problems, other bits of Heroku's domain are still around to dispense advice on how to work around those problems.
(The most extreme example is a company's "status" subdomain, which ideally should be hosted on a completely different server, in a completely different datacenter, on a different continent located on a distant planet with different DNS regulations.)
The devcenter site is absolutely hosted on the Heroku platform.
I don't have first-hand knowledge of this specific incident, but I do have a very deep understanding of how the Heroku platform as a whole works. It is incredibly likely for many service disruptions to affect only a subset of applications, and not the platform as a whole. The system is, in fact, specifically designed to isolate failures of individual components to as small a failure domain as possible. That doesn't always work, but implying that there's something nefarious going on or that Heroku is not eating their own dogfood with the devcenter site is dead wrong.
[+] [-] carsongross|14 years ago|reply
They screwed up. They'll fix it. Any ops team and hosting solution I could hire for a comparable amount of money wouldn't even come close.
[+] [-] edouard1234567|14 years ago|reply
[+] [-] jared314|14 years ago|reply
[+] [-] matt2000|14 years ago|reply
[+] [-] jvehent|14 years ago|reply
Real high availability comes with a price: redundant datacenters with BGP failover.
[+] [-] dholowiski|14 years ago|reply
[+] [-] lukev|14 years ago|reply
[+] [-] DannyPage|14 years ago|reply
[+] [-] wilfra|14 years ago|reply
Application Error An error occurred in the application and your page could not be served. Please try again in a few moments.
If you are the application owner, check your logs for details.
[+] [-] overworkedasian|14 years ago|reply
[+] [-] edouard1234567|14 years ago|reply
[+] [-] rhizome|14 years ago|reply
[+] [-] edouard1234567|14 years ago|reply
[+] [-] ohgodthecat|14 years ago|reply
[+] [-] wilfra|14 years ago|reply
http://blog.warsocial.com/post/18262715784/mmmm-dog-food
[+] [-] mechanical_fish|14 years ago|reply
[+] [-] imbriaco|14 years ago|reply
[+] [-] c00w|14 years ago|reply
[+] [-] wilfra|14 years ago|reply