rostayob | 5 months ago

I'm not fully up to date, since we looked into this a few years ago. At the time, the CERN deployments of Ceph were cited as particularly large examples, and they topped out at ~30PB.

Also note that when I say "single deployment" I mean that the full storage capacity is not subdivided in any way (i.e. there are no "zones" or "realms" or similar concepts). We wanted this to be the case after experiencing situations where we had significant overhead due to having to rebalance different storage buckets (albeit with a different piece of software, not Ceph).

If there are EB-scale Ceph deployments I'd love to hear more about them.

mrngm|5 months ago

Ceph has had opt-in telemetry for a couple of years. This dashboard[0] panel suggests there are about 4-5 clusters (that send telemetry) within the 32-64 PiB range.

It would be really interesting to see larger clusters join in on their telemetry as well.

[0] https://telemetry-public.ceph.com/d/ZFYuv1qWz/telemetry?orgI...

mgrandl|5 months ago

There are much larger Ceph clusters, but they are enterprise-owned and not really talked about publicly. Sadly, I can’t share what I personally worked on.

rostayob|5 months ago

The question is whether there are single Ceph deployments that large. I believe Hetzner uses Ceph for its cloud offering, and that's probably very large, but I'd imagine that no single tenant is storing hundreds of PBs in it, so it's very easy to shard across many Ceph instances. In our use case we have a single tenant that stores 100s of PBs (and soon EBs).