top | item 14329524

Ask HN: Secure automated backup?

62 points| jamesknelson | 8 years ago

Hi HN,

After seeing all the news about lost data recently, I need to get my arse into gear and get an automated backup set up properly.

I'm using a Mac, so I looked into the Time Capsule. That said, if one of the data loss scenarios is a well-written ransomware worm, it feels like the Time Capsule is going to be just as vulnerable as my main machine.

What approach would you recommend to back up data, with both hard drive failure and ransomware in mind? I'm open to cloud based solutions if that actually makes more sense.

86 comments

[+] 2bluesc|8 years ago|reply
I use borg[0] to create local space efficient encrypted backups and rclone[1] to mirror the archives to Google Drive. I wrote a short script to automate it and schedule it to run every night.

[0] https://borgbackup.readthedocs.io/en/stable/

[1] https://rclone.org
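
A minimal sketch of what such a nightly script might look like, in Python. The repository path, source directory, remote name (`gdrive:`) and retention numbers are all assumptions made up for illustration, not the actual setup described above. It assumes the repo was already created with `borg init` (where encryption is chosen) and uses only the basic `borg create`, `borg prune` and `rclone sync` subcommands.

```python
import datetime
import subprocess

# Hypothetical paths and names; adjust to your own setup.
REPO = "/backups/borg-repo"    # local borg repository (already initialised)
SOURCE = "/home/me"            # directory to back up
REMOTE = "gdrive:borg-repo"    # rclone remote configured beforehand


def build_commands(today):
    """Return the three commands for one nightly run, in order."""
    archive = f"{REPO}::nightly-{today.isoformat()}"
    return [
        # Create a new deduplicated archive of SOURCE in the local repo.
        ["borg", "create", "--stats", archive, SOURCE],
        # Thin out old archives (retention numbers are an assumption).
        ["borg", "prune", "--keep-daily", "7", "--keep-weekly", "4", REPO],
        # Mirror the local repo to the cloud remote.
        ["rclone", "sync", REPO, REMOTE],
    ]


def run_nightly():
    """Run the commands in order, stopping at the first failure."""
    for cmd in build_commands(datetime.date.today()):
        subprocess.run(cmd, check=True)
```

Scheduling is left to cron or launchd, which would call `run_nightly()` once a night.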

[+] y4mi|8 years ago|reply
You don't have the Google drive app installed anywhere?

If you do, this setup doesn't help with recovery after a cryptolocker attack. The encrypted backup would also be unusable.

[+] rollcat|8 years ago|reply
Currently at $WORK using Attic (which Borg was forked from), with plans to migrate to Borg.

At home, rsync to NAS and ZFS snapshots.

[+] olalonde|8 years ago|reply
I'm surprised that no one mentioned Tarsnap yet, it's run by a well known HNer (cperciva): http://www.tarsnap.com/

It's not exactly noob friendly though.

[+] Negitivefrags|8 years ago|reply
The main issue we had with Tarsnap is that it's very slow to restore if you have big backups. Very slow. Big in this case is on the order of a TB or so.

We had an incident where we needed to restore some data from backups recently, and it took literally days to get the files we needed back. We were not downloading the entire backup, we just wanted to restore a small subset of files.

We migrated away after that.

[+] oliwarner|8 years ago|reply
Amazingly expensive though at 25c/GB/m.

I pay Backblaze b2 about $1.60 a month for 280GB of photos. A number that doubles every few years. Today that would cost me $60 on tarsnap. That's not reasonable.
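
A back-of-the-envelope check of that comparison, using only the figures quoted above. It assumes a flat per-GB rate applied to the raw 280GB; Tarsnap deduplicates and compresses before billing, so a real bill may come in below this estimate.

```python
# Figures quoted in the comment above.
TARSNAP_RATE = 0.25      # dollars per GB per month
STORED_GB = 280
B2_MONTHLY_BILL = 1.60   # dollars per month


def monthly_cost(rate_per_gb, gigabytes):
    """Flat per-GB storage cost; ignores bandwidth and deduplication."""
    return rate_per_gb * gigabytes


# Estimated Tarsnap bill for the same raw amount of data.
tarsnap_monthly = monthly_cost(TARSNAP_RATE, STORED_GB)
# Per-GB rate implied by the quoted B2 bill.
implied_b2_rate = B2_MONTHLY_BILL / STORED_GB
# How many times more expensive the estimate is than the B2 bill.
ratio = tarsnap_monthly / B2_MONTHLY_BILL
```

Under these assumptions the gap is a factor of roughly forty, which is the point being made.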

[+] msh|8 years ago|reply
Well it's quite expensive compared to other solutions.
[+] _cjk7|8 years ago|reply
There's a common rule called the 3-2-1 rule, it states that you should:

- Have at least three copies of your data.

- Store the copies on two different media.

- Keep one backup copy offsite.

Personally, I'd recommend:

Copy 1: Your Mac.

Copy 2: A local NAS (my personal choice) or hard disk.

Copy 3: A remote backup, stored on a hard drive in a desk drawer at work, Backblaze, Google Drive, Amazon Cloud Drive or whatever other solution suits your needs.
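
The rule and the three copies above can be written down as a small check. This is only an illustrative sketch: the `Copy` type, the medium labels and the choice of Backblaze for copy 3 are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # e.g. "laptop-ssd", "nas", "cloud"
    offsite: bool


def satisfies_3_2_1(copies):
    """True if there are >= 3 copies, on >= 2 media, with >= 1 offsite."""
    media = {c.medium for c in copies}
    return (
        len(copies) >= 3
        and len(media) >= 2
        and any(c.offsite for c in copies)
    )


# The setup recommended above, with hypothetical labels.
recommended = [
    Copy("Mac", "laptop-ssd", offsite=False),
    Copy("Local NAS", "nas", offsite=False),
    Copy("Backblaze", "cloud", offsite=True),
]
```

Dropping the remote copy makes `satisfies_3_2_1` return `False`, both because only two copies remain and because none is offsite.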

In terms of software, I personally use rsync + ZFS/BTRFS snapshots (NAS - local, NAS2 - remote) and rclone (cloud). I haven't really used fancy solutions like Attic and Borg due to their need to write dead (i.e. not mountable without a performance penalty) data to local disk or SSH. No affordable storage that I've found offers this (rsync.net offers it but is too expensive).

It's getting to the point where I'm seriously considering buying an LTO6/7 tape drive though...

I'll also add because I haven't seen it elsewhere: verify your backups. A backup is pointless unless you know you can restore it. The best way to test this is by doing it. It should get to the point where you don't fear a restore. It shouldn't be painful. There should be no worry. It should be no more than an inconvenience. When something goes wrong, you don't want there to be even the smallest hint of doubt that there's something wrong with your process.

As such, I strongly recommend having an easily accessible backup. I'd go for a spare HDD sitting in a desk drawer at home before going for cloud backups just so that you can test it frequently.
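
One way to make test restores routine is to restore into a scratch directory and compare checksums against the live data. A minimal sketch, assuming both trees are reachable as local directories (the function names are made up for illustration):

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file, read in 1 MiB chunks to keep memory use flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_restore(original_dir, restored_dir):
    """Return relative paths that are missing or differ in the restore."""
    original_dir, restored_dir = Path(original_dir), Path(restored_dir)
    problems = []
    for src in sorted(original_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(original_dir)
        dst = restored_dir / rel
        if not dst.is_file() or file_digest(src) != file_digest(dst):
            problems.append(rel.as_posix())
    return problems
```

An empty list means every file in the original tree was restored byte for byte. Files changed since the backup was taken will show up as differences, so this works best on data that doesn't change often.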

[+] simonhorlick|8 years ago|reply
It's also worth thinking about time to restore. If you have hundreds of GB worth of backups it could take a very long time to restore everything from the internet. Keeping an easily accessible backup around is really worth it.
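
A rough way to estimate it, assuming the download speed is the only bottleneck and stays constant (the 500GB and 50 Mbit/s in the usage note are hypothetical numbers):

```python
def restore_hours(size_gb, download_mbit_per_s):
    """Hours to download size_gb at a sustained rate in megabits/second."""
    size_megabits = size_gb * 1000 * 8   # decimal GB -> megabits
    return size_megabits / download_mbit_per_s / 3600
```

For example, `restore_hours(500, 50)` comes out at a bit over 22 hours, before any slowdown on the provider's side.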
[+] goerz|8 years ago|reply
I use Arq (https://www.arqbackup.com) with Amazon Drive (unlimited data for $60/year) for this
[+] AdamGibbins|8 years ago|reply
I also use Arq but send to rsync.net with reduced pricing (http://rsync.net/products/attic.html) in addition to SFTPing to a personal (offsite) server.

Additionally, I run Backblaze and use Carbon Copy Cloner roughly once a week to clone my entire drive to an external drive.

For personal servers I use borg with the same reduced rsync.net pricing.

[+] j_s|8 years ago|reply
In case it makes a difference for anyone: Amazon Drive does not allow commercial use. It does seem like the best deal price-wise right now.
[+] jmathai|8 years ago|reply
I have a setup which works really well for my photos and videos [1][2][3][4][5]. It automatically keeps a copy of each file in 3 locations; my laptop, a Synology NAS and Google Drive / Photos.

[1] https://medium.com/@jmathai/introducing-elodie-your-personal...

[2] https://medium.com/@jmathai/understanding-my-need-for-an-aut...

[3] https://medium.com/@jmathai/my-automated-photo-workflow-usin...

[4] https://medium.com/@jmathai/one-year-of-using-an-automated-p...

[5] https://medium.com/vantage/how-to-protect-your-photos-from-b...

[+] sidmitra|8 years ago|reply
I used to use Crashplan, which had unlimited storage and was fairly cheap (like $4/month or something) for a family plan.

You might want to check it out. https://www.crashplan.com/en-us/features/

Also it was one of the few services that had a client that worked on Linux

[+] seanlane|8 years ago|reply
I've been using Crashplan for my extended family as well, you get 10 machines with unlimited storage for something like $120 per year. Linux client has been working great, I'd definitely recommend them.
[+] tedmiston|8 years ago|reply
I used to use Crashplan too. The Java client brought my rMBP to a halt when it spun up though. It made the computer practically unusable.
[+] cube00|8 years ago|reply
Seconding CrashPlan; they even allow you to encrypt with your own key (just don't lose it!). The Linux client is fantastic.
[+] JoshTriplett|8 years ago|reply
"used to use" is an interesting endorsement. Why don't you use it anymore?
[+] Sidnicious|8 years ago|reply
Here are some options that I have experience with:

- Time Machine with offline disks: Since Time Machine supports multiple backup destinations, you can use a Time Capsule or hard drive that's always connected to your Mac, and also have one or more additional hard drives which you connect periodically and otherwise leave in a drawer.

Pros: Free, built into macOS, can browse file versions directly from many apps.

Cons: Needs ongoing manual intervention (i.e. plugging in the offline drives). Some reliability issues… but I've experienced the most problems backing up to my own SMB/AFP shares, so a Time Capsule might be OK.

- Backblaze (https://www.backblaze.com/) or CrashPlan (https://www.crashplan.com/): Both of these online backup services have $5/month unlimited plans, and both let you specify your own encryption key (in the form of an additional password), which isn't shared with the backup provider. Note: In my experience, Backblaze's client is much lighter on system resources/battery on Mac.

Pros: Inexpensive, off-site storage, low-maintenance.

Cons: Ongoing cost, requires trust (In theory, the client software could be sharing the encryption key with the company/the NSA/your nemesis).

- Arq (https://www.arqbackup.com/): Paid desktop software which can back up to many different destinations, including S3, Google Drive, or your own server via SFTP. You specify an encryption key for each destination.

Pros: Full control. Option to back up to another machine that you own (so no ongoing cost for hosting).

Cons: Up-front cost. Support is less straightforward than hosted solutions since Arq doesn't provide storage.

[+] tedmiston|8 years ago|reply
An unlisted con of Backblaze is that they delete the backups of any external drive that isn't plugged in for at least 6 consecutive hours every 30 days. It can be a huge pain if you travel regularly or otherwise don't want to leave your computer on all night.
[+] Faaak|8 years ago|reply
Most importantly: the backup server should log into your computer to back it up, and not the other way around. That way, if your computer/server is compromised, the backups are still there. If you make the mistake of having your computer connect to the backup server, an attacker who compromises your computer could also log into the backup server and delete everything.

My backup server uses rsnapshot and you can only log into it with ssh + key + OTP.
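
To illustrate the pull direction, here is a sketch of the command a backup server might build for one run, using plain rsync with hard-linked snapshots. This is not rsnapshot's own code; the host name, paths and date-based directory layout are assumptions.

```python
import datetime
from pathlib import PurePosixPath


def build_pull_command(host, remote_path, backup_root, today, previous=None):
    """Build an rsync command that runs ON the backup server and pulls
    remote_path from host into a dated snapshot directory."""
    root = PurePosixPath(backup_root)
    dest = root / today.isoformat()
    cmd = ["rsync", "-a", "--delete"]
    if previous is not None:
        # Unchanged files become hard links into the previous snapshot,
        # so each snapshot only costs the space of what changed.
        link_dest = root / previous
        cmd.append(f"--link-dest={link_dest}")
    # Trailing slashes: copy the directory's contents, not the directory.
    cmd += [f"{host}:{remote_path}/", f"{dest}/"]
    return cmd
```

The laptop only needs to accept an SSH login from the server (ideally read-only); it holds no credentials for the backup server, so compromising the laptop gives no path to the snapshots.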

[+] SCdF|8 years ago|reply
I use Time Machine, Arq and Amazon Cloud Drive:

- I have an external HDD partitioned in half: One half is for large external files that don't change much (raw files, archived data etc); and one half is a dedicated partition for Time Machine

- Time Machine backs up my laptop. If I lose my computer but not my hard drive, I can get a new one and seamlessly get the computer back to exactly how it was when I last backed it up, open tabs and all

- I also have Arq running, attached to Amazon Cloud Drive (cheapest external storage I know of). It backs up selected portions of my laptop's disk, as well as the external HDD's non-Time Machine partition, to "the cloud" (due to how TM works you can't really back a Time Machine partition up to the cloud[0])

This leaves me with:

- Three copies of my laptop data: in the laptop, in an external hdd and in the cloud

- Two copies of larger data that can't fit, in the external hdd and in the cloud. My external HDD lives at home.

[0] Time Machine backs up once an hour, and stores backups as a simple directory structure on disk of your entire hard drive, except using hard links to old backups to avoid duplication. It keeps the last 24 hours of hourly backups, the last 7 days of daily backups, and then weekly backups until it runs out of room.

This format simply doesn't work with the kind of backup where it scans a directory to see what's changed, because it effectively looks like you're adding hundreds of gigs of data each hour.
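
The thinning schedule described in [0] can be sketched as follows. This is an illustration of the schedule as stated, not Time Machine's actual code; which snapshot survives in each day or week (here, the newest) is an assumption.

```python
import datetime


def thin(snapshots, now):
    """Keep every snapshot from the last 24 hours, one per day for the
    last 7 days, and one per ISO week for anything older."""
    kept = []
    seen_days, seen_weeks = set(), set()
    for ts in sorted(snapshots, reverse=True):   # newest first
        age = now - ts
        if age <= datetime.timedelta(hours=24):
            kept.append(ts)
        elif age <= datetime.timedelta(days=7):
            if ts.date() not in seen_days:
                seen_days.add(ts.date())
                kept.append(ts)
        else:
            week = tuple(ts.isocalendar()[:2])   # (ISO year, ISO week)
            if week not in seen_weeks:
                seen_weeks.add(week)
                kept.append(ts)
    return sorted(kept)
```

Deleting the snapshots that `thin` drops is cheap with hard links: a file's data is only freed once no remaining snapshot links to it.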

[+] bedros|8 years ago|reply
I second borg backup, I use it on my linux/mac machines

for windows I use reflect backup https://www.macrium.com/products/home

I tried Acronis backup, but the disk restore failed, absolutely horrible software. Then I tried Reflect, and the disk restore was very smooth.

[+] feelix|8 years ago|reply
For local bootable backup I use Mac Backup Guru, which I also wrote: https://macdaddy.io/mac-backup-software/ It's useful because it's the only software on OS X besides Time Machine which makes versioned (incremental) backups using hardlinks.

For remote backup I use Arq, but I have found that to be very buggy. I'm considering switching to rclone: https://rclone.org/

With both of those backup solutions in place I should be ready for pretty much everything.

[+] zapu|8 years ago|reply
Can you elaborate how Arq is buggy? I'm considering switching to Arq from Crashplan, because I supply my own storage for it anyway.
[+] gtf21|8 years ago|reply
I quite like Crashplan:

- very reasonably priced: I pay around £10 pcm for unltd storage for my whole family

- zero-knowledge encryption: I have the encryption keys, and everything is encrypted on my machine before it's sent up

- relatively low bandwidth: only ships changed files (pretty standard tbh)

It's saved my bacon a few times, e.g. I've used it to rescue my sister's dissertation when she wiped her laptop thinking it was in Dropbox when it wasn't. I was amazed by how easy it was for me to rescue the file from the archive.

[+] 2bluesc|8 years ago|reply
I used crash plan on Linux for years, but stopped because their Java client is a train wreck.

It would consume gigabytes of RAM and every year or so it'd meltdown when trying to install an update without using the system package manager.

[+] 5_minutes|8 years ago|reply
I have all my data on Dropbox with revisions activated, and have that backed up by Crashplan. You'll have double automated backups and 0 hassle managing it.
[+] iamcreasy|8 years ago|reply
I have a related question. I want to take backup of certain folders to a portable USB HDD every night. Can anyone recommend any simple solution for that?

I don't need encryption or any extraneous features. I just need the selected directories to get mirrored to a backup location.

Currently, I am using SyncToy by Microsoft, but I was looking for a cross platform solution.

[+] satai|8 years ago|reply
3-2-1 rule.

I would use time machine capsule and periodically (weekly?) connect an encrypted external drive and Borg backup there. Next week a second drive, third week the first one...

Always keep one of these drives off-site.

This is just one of many ways to get reasonably safe (I use almost exactly this, just with deja-dup instead of Time Machine).

[+] ottobonn|8 years ago|reply
I also use deja-dup but lately it seems to choke when trying to determine what to back up (I have about 175 GB of files, which doesn't seem outrageous). Have you had any issues with the speed? It could be that I have a long history saved on my backup drive and it's trying to apply too many incremental diffs.
[+] brandonhall|8 years ago|reply
I've used the following method for years and it's really simple. Get an external hard drive and partition it as needed. One for your Time Machine backup and another for data. Use Google Drive to mirror the data and use Arq as your Time Machine in the cloud.
[+] mattbillenstein|8 years ago|reply
I don't back up end-systems -- but I do have a directory with important data sync'd to several systems and the cloud using syncthing. The rest of the data I care about is in git -- everything else on the system is basically disposable.
[+] pfarnsworth|8 years ago|reply
I think Amazon Cloud Drive is $60/yr if you have Prime. You can hook up your account to your Synology NAS and have it automatically back things up as soon as you copy it over. Also Synology can encrypt it on the fly as well.