Just a reminder, as this is sometimes confusing: a plain old rsync.net account runs on a ZFS filesystem and gets all of the benefits that entails (point-in-time snapshots, checksummed integrity, CoW efficiency, etc.), but it does not have a zpool to itself.
If you want to zfs send to an rsync.net account, you need an account enabled for that, which is the same price but has a higher (4 TB) minimum account size.
You've discussed in the past why a zpool account has a relatively high minimum [1]: basically, each customer needs a persistent, isolated VM/bhyve instance with sshd listening on its own IP, plus a separate zfs filesystem and zpool.
It's still fuzzy to me specifically why a dedicated 'vm' is necessary. You don't need a dedicated ssh ip for normal 'shell' accounts, and certainly you still provide security and privacy for these customers. Is it because zfs doesn't have fine-grained permissions to allow doing zfs send/recv actions on one's own datasets/volumes without also giving access to other customers' data? Or is it because send/recv workloads just consume more compute resources in general?
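(For context on the permissions question: OpenZFS does have a delegation mechanism, `zfs allow`, that can grant send/receive rights per dataset to an unprivileged user. A sketch, with made-up pool and user names — whether this would suffice for rsync.net's isolation requirements is exactly the open question above:)

```sh
# Let user "alice" snapshot and send only her own dataset
zfs allow alice send,snapshot,hold tank/customers/alice

# On a receiving host, let her receive into (and mount under) her subtree only
zfs allow alice receive,create,mount backup/customers/alice
```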
I just set up zrepl yesterday and this was going to be my next task. Super glad to find this here. I already have a borg-enabled account with you: Is it possible for a single account to do both borg and zfs send?
To be clear, rsync.net zfs should work with any/all shell or script-based ZFS replication tools. The basic zrepl documentation here is good, but you can also use sanoid/syncoid, znapzend, etc.
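All of these tools ultimately drive the same primitive: zfs send piped to zfs receive over ssh. A minimal manual sketch — dataset names and the rsync.net hostname are placeholders:

```sh
# One-time full send of a snapshot to the remote pool
zfs snapshot tank/data@base
zfs send tank/data@base | ssh user@host.rsync.net zfs receive data1/backup

# Later sends only need the delta between two snapshots
zfs snapshot tank/data@next
zfs send -i @base tank/data@next | ssh user@host.rsync.net zfs receive data1/backup
```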
I'm curious how folks who use this deal with that first upload on a “normal” ISP connection.
I tried rsync for this purpose in the past. Our main office gets 150 Mbps down / 20 Mbps up. I tried uploading the initial 1 TB snapshot, and after a week it still had not finished. Meanwhile, a fair amount of that original snapshot had gone stale.
Are you just supposed to start up these hourly snapshots and hope everything catches up with itself eventually?
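As a rough sanity check on those numbers (1 TB over a 20 Mbit/s uplink, assuming ideal throughput and no protocol overhead):

```sh
bytes=$((1000 * 1000 * 1000 * 1000))  # 1 TB
rate=$((20 * 1000 * 1000 / 8))        # 20 Mbit/s ~= 2.5 MB/s
secs=$((bytes / rate))                # 400000 seconds
echo "~$((secs / 86400)) days"        # ~4 days ideal, so a week with real-world overhead is plausible
```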
Not sure about zrepl, but sanoid/syncoid will keep a separate snapshot for each replication target, with a lifetime separate from your usual expiration policy. So say you set up sanoid to keep 24 hourly snapshots, but the initial replication with syncoid takes 36 hours. You'd be left with a "@syncoid_HOSTNAME_ISO8601" snapshot, a 12-hour gap, and then your 24 hourly snapshots. That syncoid snapshot holds the 12 hours' worth of block churn, allowing incremental sends once your ISP is able to catch up.
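For illustration, the retention side of that lives in /etc/sanoid/sanoid.conf, roughly like this (dataset name and counts are hypothetical; see the sanoid README for the full option set). syncoid runs separately, e.g. from cron, and manages its own "@syncoid_*" snapshots for replication:

```ini
[tank/data]
        use_template = production

[template_production]
        hourly = 24
        daily = 7
        monthly = 3
        autosnap = yes
        autoprune = yes
```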
The other option, if you're colocating, is to send a seed drive ahead to the datacenter. (I think there was a startup on here a while back where they'd basically colo your drives in their own JBODs, and then charge you a nominal monthly fee for a VPS w/ those drives passed through as a zpool.) You might pay some nominal fee for remote hands, but it beats waiting for terabytes of data to squeeze through your local cableco's wildly asymmetric pipe.
Back in the day I managed a small office server, backing up via a particularly slow ADSL line. It took about a week to do the initial sync, so I just let it run and in the meantime backed up to an external disk every couple of days.
Once it was up and running, most snapshots took a few minutes to sync, and always finished before morning anyway.
Definite +1 to rsync.net, this was >15 years ago but it was always 100% solid, I don't think I ever had any issues. It's nice to see they're still doing the same thing and haven't bloated it with crap!
zrepl is awesome!! I have been using it for some time with excellent results to manage a lot (for my org, anyway) of backups. Glad to hear rsync.net is making it available.
[1]: https://news.ycombinator.com/item?id=27994369#28000100
waynesonfire | 3 years ago:
Cold storage has a complicated pricing model because it's à la carte, but offsite backups tend to be write-once-read-never.
AWS: $0.0036 per GB-month. GCS: $0.007 per GB-month for storage.
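At those quoted per-GB-month rates, storing a 4 TB set (the zfs-account minimum mentioned above) works out to roughly:

```sh
# Monthly cost of 4,000 GB at the rates quoted above
awk 'BEGIN { gb = 4000; printf "aws: $%.2f  gcs: $%.2f\n", gb * 0.0036, gb * 0.007 }'
```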
rsync | 3 years ago:
The author of 'zrepl' is Christian Schwarz (@problame) - sorry for the mixup :)
infogulch | 3 years ago:
I found an old 2017 issue about one user's reasoning when deciding between sanoid and znapzend: https://github.com/jimsalterjrs/sanoid/issues/102