adrn10|4 years ago
So basically, we are a hosting association that wanted to put its servers at home _and_ sleep at night.
That required replicating our data across homes. But none of the existing solutions (MinIO, Ceph...) are designed for high inter-node latency.
Hence Garage! Your cheap and proficient object store: designed to run on a toaster, over carrier-pigeon-grade networks, while still supporting tons of workloads (static websites, backups, Nextcloud... you name it)
RobLach|4 years ago
mro_name|4 years ago
Housekeeping UX is key for self-hosting by laypersons.
ddorian43|4 years ago
superboum|4 years ago
- Garage is easier to deploy and to operate: you don't have to manage independent components like the filer, the volume manager, the master, etc. It also seems that on SeaweedFS a bucket must be pinned to a volume server. In Garage, all buckets are spread over the whole cluster, so you don't have to worry about one bucket filling up a single volume server.
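To illustrate the idea of spreading a bucket's objects over the whole cluster rather than pinning it to one server, here is a minimal consistent-hashing sketch. This is purely illustrative, not Garage's actual placement code; the node names and replica count are made up:

```python
# Hypothetical sketch of hash-ring placement: every object in a bucket
# is hashed to its own set of nodes, so no bucket is tied to one server.
import hashlib

def ring_position(key: str) -> int:
    # Map an arbitrary key to a fixed position on the hash ring.
    return int(hashlib.sha256(key.encode()).hexdigest(), 16)

def place(bucket: str, obj: str, nodes: list[str], replicas: int = 3) -> list[str]:
    # Sort nodes by their ring position, then walk clockwise from the
    # object's position and take the next `replicas` distinct nodes.
    ring = sorted(nodes, key=ring_position)
    start = ring_position(f"{bucket}/{obj}")
    idx = next((i for i, n in enumerate(ring) if ring_position(n) >= start), 0)
    return [ring[(idx + k) % len(ring)] for k in range(replicas)]

nodes = ["node-a", "node-b", "node-c", "node-d"]
# Two objects from the same bucket get their own replica sets:
print(place("photos", "cat.jpg", nodes))
print(place("photos", "dog.jpg", nodes))
```

Because placement depends on the object key, a busy bucket naturally spreads its load and capacity over all nodes instead of filling one volume server.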
- Garage behaves better in the presence of crashes: I would be very interested in a deep analysis of SeaweedFS's "automatic master failover". They use Raft, so I suppose they either run a health check every second, which can lose the writes that land between the crash and the failover, or route a request through Raft for every transaction, which creates a huge bottleneck in their design.
- Better scalability: because there are no special nodes, there is no bottleneck. I suppose that with SeaweedFS all requests have to pass through the master; we have no such limitation.
As a conclusion, we chose a radically different design for Garage. We plan to do a more in-depth comparison in the future, but even today I can say that, although we implement the same API, our radically different designs lead to radically different properties and trade-offs.
lxpz|4 years ago
To me, the two key differentiators of Garage over its competitors are as follows:
- Garage contains an evolved metadata system based on CRDTs and consistent hashing inspired by Dynamo, solidly grounded in distributed systems theory. This allows us to be very efficient: we don't use Raft or any other consensus algorithm between nodes, and we don't rely on an external service for metadata storage (Postgres, Cassandra, whatever), so we don't pay an additional communication penalty.
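The reason CRDTs let you skip consensus is that their merge operation is commutative, associative, and idempotent, so replicas converge no matter the order in which updates arrive. Here is a minimal last-writer-wins register in that spirit; it is an illustrative sketch, not Garage's actual (more elaborate) metadata CRDTs:

```python
# Minimal last-writer-wins (LWW) register: a simple CRDT where replicas
# converge by keeping the value with the highest (timestamp, node_id).
# Illustrative only; field names and types are assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class LWW:
    timestamp: int   # logical clock of the write
    node_id: str     # tie-breaker so merge is deterministic
    value: str

def merge(a: LWW, b: LWW) -> LWW:
    # Commutative, associative, idempotent: replicas can merge in any
    # order, any number of times, and still agree on the same winner.
    return max(a, b, key=lambda r: (r.timestamp, r.node_id))

x = LWW(1, "node-a", "v1")
y = LWW(2, "node-b", "v2")
print(merge(x, y))  # replicas agree regardless of merge order
```

Since merging never needs a quorum or a leader, nodes can accept writes independently and reconcile later, which is exactly what you want over high-latency home links.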
- Garage was designed from the start to be multi-datacenter aware, again helped by insights from distributed systems theory. In practice we explicitly chose not to implement erasure coding; instead we spread three full copies of the data over different zones, so that overall availability is maintained with no performance degradation when a full zone goes down, and data locality is preserved at every location for faster access (in the case of a system with three zones, our ideal deployment scenario).
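The zone-spreading idea above can be sketched as follows: pick one node per zone for each of the three copies, so losing an entire zone still leaves two full replicas. This is a hypothetical illustration, not Garage's real layout algorithm, and the zone/node names are invented:

```python
# Sketch of zone-aware replica placement: one replica node per zone,
# chosen deterministically by hashing the object key within each zone.
# (Hypothetical; not Garage's actual layout computation.)
import hashlib

def pick_zones(key: str, zones: dict[str, list[str]], copies: int = 3) -> list[str]:
    chosen = []
    for zone, nodes in sorted(zones.items())[:copies]:
        # Within each zone, hash the key to a node deterministically.
        h = int(hashlib.sha256(f"{zone}/{key}".encode()).hexdigest(), 16)
        chosen.append(nodes[h % len(nodes)])
    return chosen

zones = {
    "home-paris": ["p1", "p2"],
    "home-lyon": ["l1"],
    "home-brussels": ["b1", "b2"],
}
print(pick_zones("photos/cat.jpg", zones))  # one node per zone
```

With full copies rather than erasure-coded shards, any single zone can serve a read locally, which keeps latency low even when the other homes are far away or offline.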
adrn10|4 years ago