Are you kidding me? If you run big sites like FogBugz then of course you have cross-datacenter redundancy. It's not complicated to host your staging site in another physical location and point the DNS records at it when things go pear-shaped.
Yes, so this staging site of yours has exactly the same databases as your production site? Without customer data, FogBugz and Trello are useless. This means that this simple staging site of yours needs to have all data replicated to it, which means it also needs the same hardware provisioned for it, effectively doubling your hardware and maintenance costs and reducing the simplicity of your architecture. Of course, if you're big enough you can afford to do this, and one could argue Fog Creek is big enough. I'm just saying it's not a simple no-brainer.
Data should be backed up to staging nightly anyway. There should also be scripts in place to start this process at an arbitrary point in time and to import the data into the staging server. You do not need to match the hardware if you use cloud hosting since you can scale up whenever you want.
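The nightly backup-and-import job described above could be sketched roughly like this, assuming a PostgreSQL setup; the hostnames, database name, and paths are made up for the example:

```python
# Hypothetical nightly job: dump production and restore the dump into
# staging. All hostnames, names, and paths here are illustrative
# assumptions, not anything from the thread.
import datetime
import subprocess

PROD_HOST = "db.prod.example.com"        # assumed production DB host
STAGING_HOST = "db.staging.example.com"  # assumed staging DB host
DB_NAME = "appdb"                        # assumed database name

def backup_commands(when=None):
    """Build the two shell commands for one backup/restore cycle."""
    when = when or datetime.date.today()
    dump_file = f"/backups/{DB_NAME}-{when.isoformat()}.dump"
    # pg_dump -Fc writes PostgreSQL's custom archive format.
    dump = ["pg_dump", "-h", PROD_HOST, "-Fc", "-f", dump_file, DB_NAME]
    # pg_restore --clean drops staging's objects before recreating them.
    restore = ["pg_restore", "-h", STAGING_HOST, "--clean",
               "-d", DB_NAME, dump_file]
    return dump, restore

def run_nightly():
    """Run the cycle; this is the entry point a cron job would call."""
    for cmd in backup_commands():
        subprocess.run(cmd, check=True)
```

Because the commands are built separately from being executed, the same script covers both the scheduled nightly run and the "start this process at an arbitrary point in time" case: a cron entry calls `run_nightly()`, and an operator can call it by hand.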
tinco|13 years ago
What is a simple no-brainer, however, is to have offline offsite backups that can easily be brought online. A best practice is to automate your deployment in such a way that deploying to a new datacenter that already has your data is a trivial thing.
But yeah, even if you're running a tight ship, sometimes things like that go overboard without anyone noticing.
Remember the story of the 100% uptime banking software that ran for years without ever going down, always applying patches at runtime. Then one day a patch finally came in that required a reboot, and it was discovered that in all those years of runtime patches without reboots, no one had ever tested whether the machine could actually still boot, and of course it couldn't :)
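The "deployment to a new datacenter should be trivial" point amounts to keeping the deploy as one parameterized, repeatable routine. A toy sketch, with an invented host-naming scheme and invented command names:

```python
# Toy illustration of parameterized deployment: the same routine
# targets any datacenter, so bringing up a fresh one is no special
# case. Hostnames and the remote commands are assumptions for the
# example, not real tooling.

def render_deploy_steps(datacenter, app_version):
    """Return the ordered commands to bring the app up in a
    datacenter that already holds a copy of the data."""
    host = f"app1.{datacenter}.example.com"  # assumed naming scheme
    return [
        f"ssh {host} fetch-release {app_version}",
        f"ssh {host} run-migrations",  # a no-op if the data is current
        f"ssh {host} start-app",
        f"ssh {host} health-check",
    ]
```

Since the datacenter is just a parameter, the disaster drill is the same command as an ordinary deploy, which is exactly what makes it testable before you need it.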
Supreme|13 years ago
Here's where it gets really simple. Resize the staging instance to match live. Put live into maintenance mode and begin the data transfer to staging (with a lot of cloud providers, steps #1 and #2 can be done in parallel). As soon as the copy finishes, take live down, point the DNS records at staging and wait a few minutes. Staging is now live, with all of live's data. Problem solved. Total downtime: hardly anything compared to not being prepared. Total data loss: none.
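The sequence above can be written down as an explicit plan, which also makes the one parallelism opportunity visible. This is only a sketch of the ordering; the cloud and DNS calls are stand-ins, since no real provider API is named in the thread:

```python
# Sketch of the failover sequence described above. Each inner list is
# one phase; steps inside a phase are independent and may run in
# parallel. The step names are placeholders, not real API calls.

def failover_plan():
    """Return the ordered phases of promoting staging to live."""
    return [
        # Phase 1: resizing and maintenance mode don't depend on
        # each other, so they can overlap (as noted above).
        ["resize staging to match live", "put live into maintenance mode"],
        # Phase 2: copy the data while writes are frozen.
        ["transfer live data to staging"],
        # Phase 3: cut over.
        ["take live down", "point DNS records at staging"],
        # Phase 4: the "wait a few minutes" step, bounded by DNS TTL.
        ["wait for DNS caches to expire"],
    ]
```

Writing the plan as data rather than ad-hoc steps is what lets it be rehearsed: an orchestration script can walk the phases in a drill exactly as it would in a real outage.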
almost|13 years ago
Unless, of course, you're just talking out of your arse and have no experience with that sort of thing at all.
michaelhoffman|13 years ago
http://www.smbc-comics.com/?id=2475