zytek | 9 years ago
.. do you actually TEST those backups?
This question comes from my experience as a systems engineer who found a critical bug in our MySQL backup solution that prevented the backups from being restored (inconsistent filesystem). Also, a friend of mine learned the hard way that his Backblaze backup was unrestorable.
majewsky|9 years ago
By the way, I misread your username and, for a second, thought you were sytse.
vassilevsky|9 years ago
Juliate|9 years ago
trcollinson|9 years ago
Application
This is by far the easiest for me to test. We have a CI/CD job which literally makes a new environment from scratch and deploys our application to it in a production configuration. It runs a test suite which tests functionality across the application. Finally, it destroys the environment. It reports on each portion of the process. In this way we know exactly how long it would take to redeploy the entire application from scratch on new infrastructure and get it up and running. This morning it took about 6 minutes total before tests ran.
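As a rough illustration of the create / deploy / test / destroy job described above, here is a minimal Python sketch. The stage names and the `run_pipeline` helper are hypothetical, not the actual job; the point is only that each stage is timed and reported, and that the environment is destroyed even when an earlier stage fails.

```python
import time
from typing import Callable, List, Tuple

# (stage name, succeeded, seconds taken)
StageResult = Tuple[str, bool, float]


def _timed(name: str, action: Callable[[], None]) -> StageResult:
    """Run one stage and record whether it succeeded and how long it took."""
    started = time.monotonic()
    try:
        action()
        ok = True
    except Exception:
        ok = False
    return (name, ok, time.monotonic() - started)


def run_pipeline(
    stages: List[Tuple[str, Callable[[], None]]],
    teardown: Callable[[], None],
) -> List[StageResult]:
    """Run stages in order, stop at the first failure, always tear down."""
    report: List[StageResult] = []
    try:
        for name, action in stages:
            result = _timed(name, action)
            report.append(result)
            if not result[1]:
                break  # later stages make no sense after a failure
    finally:
        # Hypothetical "destroy" stage: runs whether or not the stages passed.
        report.append(_timed("destroy", teardown))
    return report
```

Summing the third field of each entry gives the total redeploy time for that run.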
Database
We are running an RDBMS. We use a combination of daily full backups, incremental transaction-log-style backups, and point-in-time backups. Again, in our CI/CD, when a full backup is taken it is pulled, loaded, and a test routine is run against it to check integrity. At this point, the restore from the day before is destroyed. When a transaction log backup is made, CI/CD picks up the change, applies it to the restored full backup, and runs a set of integrity tests. This leaves us with a warm standby ready to be switched over to if the main database server goes down. We have never had to use the warm standby in an emergency, but we have a test to make sure we can cut over to it as well.
For point-in-time backup testing, this goes back to our application test above. The application test spins up with a point-in-time recovery of the database backup. It tests the integrity of that recovery and then tests the application against it. Finally, it swaps from the point-in-time recovered database to the warm standby and runs the test suite against that for integrity as well.
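To make the "pull, load, and run an integrity test" step concrete, here is a small Python sketch. It uses SQLite from the standard library purely as a stand-in so the example is runnable; a real RDBMS would use its own restore tooling. The `verify_backup` function and the row-count thresholds are assumptions for illustration only.

```python
import sqlite3
from typing import Dict, List


def verify_backup(backup_path: str, min_rows: Dict[str, int]) -> List[str]:
    """Restore a backup into a scratch database and return any problems found.

    An empty list means the restore and all checks passed.
    """
    problems: List[str] = []
    source = sqlite3.connect(backup_path)
    scratch = sqlite3.connect(":memory:")  # throwaway restore target
    try:
        try:
            source.backup(scratch)  # the "restore" step
        except sqlite3.DatabaseError as exc:
            return [f"restore failed: {exc}"]

        # Structural check provided by the database engine itself.
        status = scratch.execute("PRAGMA integrity_check").fetchone()[0]
        if status != "ok":
            problems.append(f"integrity check failed: {status}")

        # Sanity check: tables we expect exist and are not suspiciously empty.
        for table, minimum in min_rows.items():
            try:
                query = f'SELECT COUNT(*) FROM "{table}"'
                count = scratch.execute(query).fetchone()[0]
            except sqlite3.OperationalError:
                problems.append(f"{table}: table missing or unreadable")
                continue
            if count < minimum:
                problems.append(
                    f"{table}: expected at least {minimum} rows, found {count}"
                )
    finally:
        source.close()
        scratch.close()
    return problems
```

A CI job could fail the build whenever the returned list is non-empty, which is what turns a backup into a tested backup.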
File Store
People often forget this, but those buckets that hold all of your file storage in the cloud can be destroyed so easily (sad, sad experience taught me this). We test those as well. I am sure you can guess at this point how we do that: CI/CD. It's a rather simple process with a ton of gain.
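One possible shape for a file store check is to compare checksums of the original files against a restored copy. The sketch below is an assumption about how such a test could look, not the process described above: it works on local directories as a stand-in for buckets, and `checksum_manifest` and `compare_manifests` are hypothetical helper names.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def checksum_manifest(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    manifest: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            manifest[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def compare_manifests(expected: Dict[str, str], restored: Dict[str, str]) -> List[str]:
    """List files that are missing from, or differ in, the restored copy."""
    problems: List[str] = []
    for name, digest in expected.items():
        if name not in restored:
            problems.append(f"missing: {name}")
        elif restored[name] != digest:
            problems.append(f"corrupt: {name}")
    return problems
```

For large objects you would hash in chunks rather than reading whole files into memory; that is left out here to keep the sketch short.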
A few notes
People always ask me this, so I will answer it first. Yes, this costs money. It's not as bad as running a second production environment, but it will cost you a bit. My follow-up question is: how much does downtime cost you?
My CI/CD is always GitLab CI at this point. I've used Jenkins. I've used Travis. I like GitLab CI. You can do all of this with any of those.
We script literally everything. Computers are so good at repetitive tasks. Why would you EVER do anything manually? Really. If it has to do with your infrastructure, script it.
If anyone has any questions about these ideas, feel free to reach out.
RossM|9 years ago
BatFastard|9 years ago
jmathai|9 years ago
[1] https://medium.com/vantage/how-to-protect-your-photos-from-b...
[2] https://medium.com/swlh/my-automated-photo-workflow-using-go...
marcosdumay|9 years ago
For personal backup of files, I just verify the results are in place. I've checked them once or twice, but honestly, I'm more concerned about my scripts silently stopping than about them running and producing incorrect backups.
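A cheap guard against scripts that quietly stop running is a freshness check on the backup directory. This is only a sketch under assumptions: it treats the newest file's modification time as the time of the last backup, and `backup_is_stale` and its parameters are hypothetical names.

```python
import time
from pathlib import Path
from typing import Optional


def newest_backup_age_seconds(backup_dir: Path, now: Optional[float] = None) -> Optional[float]:
    """Return the age of the most recently modified file, or None if there are no files."""
    mtimes = [p.stat().st_mtime for p in backup_dir.iterdir() if p.is_file()]
    if not mtimes:
        return None
    current = time.time() if now is None else now
    return current - max(mtimes)


def backup_is_stale(backup_dir: Path, max_age_seconds: float, now: Optional[float] = None) -> bool:
    """True if there is no backup at all or the newest one is older than allowed."""
    age = newest_backup_age_seconds(backup_dir, now)
    return age is None or age > max_age_seconds
```

Run from a separate scheduler than the backup itself, so that one broken cron entry does not silence both the backup and the alarm.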
Daviey|9 years ago
chadcmulligan|9 years ago
chopin|9 years ago
zapu|9 years ago
ValentineC|9 years ago
I moved off CrashPlan in 2016 because their upload speed continues to be embarrassingly slow outside the US even with deduplication and compression turned off (they have a datacentre where I'm at, but it's for Enterprise customers only).
They also highly recommend having 1GB of RAM for every 1TB backed up, which sounded a bit unreasonable to me.
sharkoz|9 years ago
m_samuel_l|9 years ago
mironathetin|9 years ago
yes (of course), see my post above.
I have used Time Machine repeatedly to restore lost or damaged files. I have also replaced hard drives several times and restored from my Carbon Copy Cloner clone. It boots, and I have never missed a file in years.
notheguyouthink|9 years ago
The net result is it's super simple to verify an entire datastore as being valid or not.
atmosx|9 years ago
RubyPinch|9 years ago
CodeWriter23|9 years ago