DefineOutside|6 months ago
https://github.com/UltraVanilla/paper-zstd/blob/main/patches...
From the author of this patch on Discord: compression level 9 isn't practical for a real production server (it's too slow), but it does show the effectiveness of zstd with a shared dictionary.
So you start off with a 755.2 MiB world (in this test, a section of an existing DEFLATE-compressed world that has been lived in for a while). Recreating its region files compacts it down to 695.1 MiB.
You set region-file-compression=lz4 and run --recreateRegionFiles, and it turns into a 998.9 MiB world. That makes sense: worse compression ratios in exchange for less CPU, which is what Mojang documented in the changelog. Neat, but I'm confused about what the benefit is, since I/O is increasingly the more constrained resource nowadays. This is just a brief detour from what I'm really trying to test.
You set region-file-compression=none and it turns into a 3583.0 MiB world. The largest region file in this sample was 57 MiB.
Now, you take this world and compress each of the region files individually using zstd -9, so that the region files become .mca.zst files. You get a world that is 390.2 MiB.
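The per-file compression step can be reproduced with the stock zstd CLI. A minimal sketch, assuming a standard world/region directory layout (the dummy .mca file below is a stand-in for real chunk data):

```shell
# Stand-in region directory with one dummy .mca file (real worlds have many)
mkdir -p world/region
head -c 65536 /dev/urandom > world/region/r.0.0.mca

# Compress every region file individually at level 9, producing .mca.zst
# (zstd keeps the source files by default)
find world/region -name '*.mca' -print0 | xargs -0 zstd -9 -q

ls world/region/*.mca.zst
```

Note that the server can't read .mca.zst files directly; this only measures the achievable on-disk size, which is the point of the test above.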
lordpipe|6 months ago
I don't remember the exact compression ratios for the dictionary solution in that repo, but it wasn't quite as impressive (IIRC around a 5% reduction compared to non-dictionary zstd at the same level). The padding inherent to the region format also takes away a lot of the ratio benefit right off the bat. It may have worked better in conjunction with the PaperMC SectorFile proposal, which has less padding, or by rewriting the storage on top of an LSM-tree library that stores variable-size blobs compactly. I've dropped the dictionary idea for now, but it could still be useful; more research is needed.
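For reference, a shared-dictionary workflow with the stock zstd CLI looks roughly like this. This is a sketch with synthetic sample files standing in for region data; the file names, counts, and dictionary size are made up for illustration:

```shell
# Generate synthetic sample files with shared structure (stand-ins for .mca files)
mkdir -p samples
for i in $(seq 1 64); do
  { seq 1 200; echo "header-$i"; } > "samples/s$i.mca"
done

# Train a shared dictionary across all samples (capped small for this toy data)
zstd --train samples/*.mca -o shared.dict --maxdict=16384 -q

# Compress and decompress with the dictionary; both sides must pass -D
zstd -9 -D shared.dict -q samples/s1.mca -o s1.zst
zstd -d -D shared.dict -q s1.zst -o s1.out
cmp samples/s1.mca s1.out
```

The dictionary pays off most on many small, similar blobs, which is why individual chunks (rather than whole region files) are the natural unit for it.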
masklinn|6 months ago
Might make sense if the region files are on a fast SSD and the server is more CPU-constrained? I assume the server reads from and writes to the region files during activity; a 3.5x increase in I/O throughput at very little CPU cost (both ways) is pretty attractive. IIRC, at lower compression levels, DEFLATE is about an order of magnitude more expensive than LZ4.
zstd --fast is also quite attractive, but I'm always unsure what the level of parallelism is in benchmarks: zstd is multithreaded by default, and benchmarks tend to report wall-clock time rather than CPU seconds.
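One way to control for that is to pin zstd's built-in benchmark mode to a single worker thread, so the reported speed reflects one core rather than wall clock spread across many. A sketch (the test file is arbitrary compressible data):

```shell
# Arbitrary compressible test data
seq 1 100000 > data.txt

# Built-in benchmark, levels 1 through 3, forced single-threaded with -T1
zstd -b1 -e3 -T1 data.txt

# Negative "fast" levels trade ratio for speed; still single-threaded here
zstd --fast=5 -T1 -q data.txt -o data.fast.zst
```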
lordpipe|6 months ago
I wrote that when the feature had just come out. It's now been a while since Minecraft started natively supporting the LZ4 chunk compression option, and it seems safe to say the tradeoff does make sense, even when the CPU is quite powerful: several servers have adopted it and seen decent improvements.