top | item 39120923


res0nat0r | 2 years ago

The major issue out of the box vs. any deduplicating backup software is that S3 doesn't support any deduplication. If you move or rename a 15 GB file, you have to upload the whole thing again, and you store and pay for a second copy until your S3 bucket lifecycle policy purges the previously uploaded file you deleted. Also, aws s3 sync is much slower, since it has to iterate over every file to see whether its size/timestamp has changed; something like borgbackup is much faster because it uses smarter caching to skip unchanged directories.
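The rename problem above comes down to how the two approaches index what's already stored. A minimal sketch of the idea (this is an illustration, not borg's actual chunk format): sync-style tools match on the key name plus size/timestamp, so a renamed file looks brand new, while dedup tools index content by hash, so a rename uploads nothing.

```python
import hashlib, os, tempfile

def content_hash(path):
    # Content fingerprint, as a deduplicating backup tool would compute.
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

workdir = tempfile.mkdtemp()
old = os.path.join(workdir, "video-v1.bin")
with open(old, "wb") as f:
    f.write(b"x" * 1024)  # stand-in for the 15 GB file

remote_keys = {"video-v1.bin"}        # keys already in the bucket
stored_hashes = {content_hash(old)}   # content already in the dedup store

new = os.path.join(workdir, "video-v2.bin")
os.rename(old, new)

# Name-based sync: the new key is unknown, so the whole file re-uploads.
sync_reupload = os.path.basename(new) not in remote_keys
# Content-based dedup: the bytes are already stored, so nothing uploads.
dedup_reupload = content_hash(new) not in stored_hashes

print(sync_reupload, dedup_reupload)  # True False
```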

65|2 years ago

It's possible to find probable duplicate files with the S3 CLI based on size and tags - I was working on a script to do just that but haven't finished it yet. Alternatively, if you want an exact mirror of your computer, you can use the --delete flag, which deletes files in the bucket that aren't in the source.
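The size-based duplicate check described above can be sketched in a few lines. The listing here is a hypothetical stand-in; in practice you'd feed it object keys and sizes from the S3 CLI (e.g. aws s3api list-objects-v2), then follow up on any size collisions with tags or checksums:

```python
from collections import defaultdict

# Hypothetical bucket listing: (key, size-in-bytes) pairs.
listing = [
    ("photos/img_001.jpg", 4_194_304),
    ("backup/img_001.jpg", 4_194_304),
    ("docs/report.pdf",    1_048_576),
]

# Group keys by size; any size with more than one key is a probable duplicate.
by_size = defaultdict(list)
for key, size in listing:
    by_size[size].append(key)

probable_dupes = {size: keys for size, keys in by_size.items() if len(keys) > 1}
print(probable_dupes)
# {4194304: ['photos/img_001.jpg', 'backup/img_001.jpg']}
```

Size alone only finds candidates, of course: two different files can share a size, so a real script would confirm with a hash or ETag comparison.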

I agree this is not the most optimized solution, but it works quite well for me and is easily extensible with other scripts and S3 CLI commands. Theoretically, if Borgbackup or Duplicity are backing up to S3, they're using all the same commands as the S3 CLI/SDK.

Besides, shell scripting is fun!

duskwuff|2 years ago

> Theoretically if Borgbackup or Duplicity are backing up to S3 they're using all the same commands as the S3 CLI/SDK.

They are not. Both Borg and Duplicity pack files into compressed, encrypted archives before uploading them to S3; "s3 sync" literally just uploads each file as an object with no additional processing.
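The structural difference duskwuff describes can be made concrete with a rough sketch (encryption omitted for brevity, and this is not borg's or Duplicity's actual on-disk format): sync-style backup puts one S3 object per source file, while archive-based tools pack everything into a small number of compressed blobs before upload.

```python
import io, os, tarfile, tempfile

with tempfile.TemporaryDirectory() as d:
    # Three small source files standing in for a backup set.
    for name in ("a.txt", "b.txt", "c.txt"):
        with open(os.path.join(d, name), "w") as f:
            f.write("hello " * 100)

    # sync-style: one S3 object per source file, uploaded as-is.
    sync_objects = sorted(os.listdir(d))

    # archive-style: everything packed into one compressed blob first.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        tar.add(d, arcname="backup")
    archive_objects = ["backup.tar.gz"]

print(len(sync_objects), len(archive_objects))  # 3 1
```

Packing also explains why the archive tools can compress and encrypt client-side: they control the bytes that hit S3, whereas "s3 sync" hands each file to S3 untouched.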

sevg|2 years ago

If I have to choose between hacking together a bunch of shell scripts to do my deduplicated, end-to-end encrypted backups and using a popular, well-tested, open-source, off-the-shelf solution, I know which one I'm picking!