4by4by4 | 2 years ago

We tested both SeaweedFS and MinIO for cheaply storing >100 TB of audio data on HDDs.

Seaweed had much better performance for our use case.

Scaevolus | 2 years ago

Do you wish it supported Erasure Coding for lower disk usage, or is your workload such that the extra spindles from replication are useful?

4by4by4 | 2 years ago

That would be nice, and that’s why we first tried MinIO.

But with MinIO and erasure coding, a single PUT results in more IOPS, and we saw lower performance.
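The IOPS amplification can be sketched with hypothetical numbers — the 8+4 layout and 2x replication below are illustrative defaults, not the parent’s actual configuration:

```python
# Illustrative sketch of why erasure coding multiplies per-PUT disk IOPS
# compared to plain replication. Shard counts here are hypothetical.

def puts_to_disk_writes(data_shards: int, parity_shards: int) -> int:
    """With erasure coding, each object PUT is striped across
    data + parity shards, so every shard is a separate disk write."""
    return data_shards + parity_shards

# Erasure-coded layout, e.g. EC 8+4: one PUT fans out to 12 disk writes.
ec_writes = puts_to_disk_writes(8, 4)   # 12

# Replication with 2 copies: one PUT is just 2 whole-object writes.
replica_writes = 2                      # 2
```

On HDDs, where seeks dominate, fanning a small object out to a dozen spindles per PUT hurts more than writing two full copies sequentially.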

Also, expanding MinIO must be done in increments of your original buildout, which is annoying. So if you start with 4 servers and 500TB, they recommend you expand by adding at least another 4 servers with 500TB.
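As a rough sketch of what that expansion model looks like (hostnames, paths, and drive counts below are hypothetical; this follows MinIO’s server-pool expansion approach):

```shell
# Original deployment: 4 nodes, 4 drives each (hypothetical hostnames).
minio server http://node{1...4}.example.com/mnt/disk{1...4}

# Expansion: capacity is added as a whole second server pool, and every
# node restarts with both pools listed -- not by adding single drives.
minio server http://node{1...4}.example.com/mnt/disk{1...4} \
             http://node{5...8}.example.com/mnt/disk{1...4}
```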

bomewish | 2 years ago

Forgive my ignorance but why is this preferable to a big ZFS pool?

chaxor | 2 years ago

I could be wrong here, but I believe this (Ceph, et al.) is the answer to the question:

> "But what if I don't have a JBOD of 6x 18TB hard drives with a good amount of ECC RAM for ZFS? What if I have 3 Raspberry Pi 4s at different houses with 3x 12TB externals on them, and 2 other computers with 2x 4TB externals, and I want to use all of that together with some redundancy/error checking?"

That would give (3x3x12) + (2x2x4) = 124 TB of raw storage, vs. 108 TB of raw storage in the single-box ZFS case.

If you could figure out the distributed part (and inconsistency in disk size and such), then this is a very nice system to have.
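A quick check of the arithmetic in the scenario above:

```python
# Back-of-envelope raw-capacity figures from the comment.

# 3 Raspberry Pi 4s, each with 3x 12TB externals,
# plus 2 other machines, each with 2x 4TB externals.
distributed_tb = 3 * 3 * 12 + 2 * 2 * 4   # 108 + 16 = 124

# Single-box ZFS JBOD: 6x 18TB drives.
zfs_tb = 6 * 18                           # 108
```

Note this is raw capacity on both sides; whatever redundancy scheme you run (replication, erasure coding, or raidz) eats into both totals.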

chillfox | 2 years ago

Because you need an S3-compatible API?

I use ZFS for most of my things, but I have yet to find a good way of just sharing a ZFS dataset over S3.
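For what it’s worth, a minimal sketch of one workaround (paths are hypothetical) — which also illustrates why it still isn’t “just sharing” the dataset:

```shell
# Hypothetical pool/dataset names. A single-node MinIO can serve a
# directory on a ZFS dataset over S3, but note that current MinIO
# writes objects in its own on-disk layout -- it does not expose the
# dataset's existing files as objects, which is why plainly sharing
# an existing ZFS dataset over S3 remains awkward.
zfs create tank/s3data
minio server /tank/s3data --address :9000
```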

4by4by4 | 2 years ago

That's not the only reason, but we have a distributed workload, so HTTP is a better protocol than NFS for our use case.

riku_iki | 2 years ago

It's distributed: it will survive if a server dies.

erikaww | 2 years ago

Any hiccups?

Drop-in S3 compatibility with much better performance would be insane.

4by4by4 | 2 years ago

Setup is a little obscure, but the developer is responsive on Slack and GH.

We are only a couple of months in and haven’t had to add to our cluster yet, storing about 250TB, so it’s still early for us. Promising so far, and the hardware has already paid for itself.