
(no title)

tetron | 1 year ago

I was curious how they get such performance with a FUSE-based design. It seems they sort of cheat: FUSE is used to manage metadata, but to get high performance you have to link in the C++ client library and do all your reads and writes through that. So it isn't general purpose; you have to modify your application to take advantage of it. Still, that's a clever trick, and it makes me wonder if there's an LD_PRELOAD strategy that could generalize.


grohan|1 year ago

They appear to have Python bindings which seems reasonable from an API / usability perspective? https://github.com/deepseek-ai/smallpond

In terms of fast FUSE, which was also my first question, it appears to be `io_uring` + FUSE :)

https://github.com/deepseek-ai/3FS/blob/main/src/lib/api/Usr...

amelius|1 year ago

Why is FUSE that much slower than providing your own read/write functions? I get that it has to go through the kernel, but the operations are on entire blocks and network should be the bottleneck by far (and disk/main memory should be a bottleneck if the data is local).

vlovich123|1 year ago

You have to bounce through the kernel and back out to user space. The number of syscalls is quite high. In many cases this is mitigated somewhat by the page cache making reads cheaper, but using the page cache is explicitly a non-goal here.

I believe there's work to minimize this using io_uring so that you can talk to the FUSE driver without the kernel being in the middle, but that work wasn't ready last time I checked.

For what it's worth, at Palm we had a similar problem because our applications were stored compressed but exposed through FUSE uncompressed. Instead of using O_DIRECT, I just did an fadvise to drop the cache after a read. It didn't give as high throughput, but it was the least risky change to get the same effect.
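
A minimal sketch of that fadvise approach, assuming Linux and Python's `os.posix_fadvise`. The function name `read_and_drop_cache` is made up for illustration, and this is not the original Palm code, just one way the idea could look:

```python
import os

def read_and_drop_cache(path, chunk_size=1 << 20):
    """Read a whole file with ordinary buffered I/O, then hint the kernel
    to drop the pages it cached for that file.

    Hypothetical helper for illustration only.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        chunks = []
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
        # offset=0, length=0 covers the whole file. The call is only a hint,
        # so a failure (or a platform without posix_fadvise) is ignored.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            except OSError:
                pass
        return b"".join(chunks)
    finally:
        os.close(fd)
```

Unlike O_DIRECT, this keeps normal buffered reads with no alignment requirements, and simply asks the kernel not to keep the data around afterwards. The kernel is free to ignore the hint.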