top | item 29537947

greypowerOz | 4 years ago

>"Also, consider that if you can only read at 100 MB/s off a mechanical drive but your CPU can decompress data at ~500 MB/s then the mechanical drive is able to provide 5x the throughput you'd otherwise expect thanks to compression."

I'd not really thought of that aspect before... My old brain is hard-coded to save CPU cycles... Time to change my ways :)
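
The arithmetic behind the quote is just a min() over the pipeline stages, with the compression ratio multiplying the disk stage. A sketch with the quote's numbers (the function name and model are mine, for illustration):

```python
def effective_throughput(disk_mb_s, ratio, decompress_mb_s):
    """Uncompressed MB/s delivered by a read-then-decompress pipeline."""
    # The drive reads compressed bytes, so it effectively supplies
    # disk_mb_s * ratio of uncompressed data; the decompressor caps it.
    return min(disk_mb_s * ratio, decompress_mb_s)

# 100 MB/s mechanical drive, 5x ratio, 500 MB/s decompressor:
print(effective_throughput(100, 5, 500))   # 500 -- 5x the raw drive speed
# A 5 GB/s NVMe drive with the same decompressor flips the equation:
print(effective_throughput(5000, 5, 500))  # 500 -- decompression is now the cap
```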

kzrdude|4 years ago

This is an older chart of ZSTD, but it is an inspiring visualisation: compression ratio "times" transmission bandwidth => effective transmission speed.

http://fastcompression.blogspot.com/2015/01/zstd-stronger-co...

Taken from the fastcompression blog - where one could follow ZSTD's author since before ZSTD was even conceived.

"Conveniently" enough the author of the blog has written both ZSTD and LZ4, which top the chart for their respective link speed domains. (2015 data - things have improved in both ZSTD and others since then.)

flohofwoe|4 years ago

This was already a common technique back in the '90s for loading asset data in games, and it was even more important then than it is today with fast SSDs. Instead of many small files, the asset data is loaded from a single big compressed archive file. This works around terrible small-file performance (mainly in Windows file systems), and it speeds up loading because reading the compressed data and decompressing it is much faster than reading the uncompressed data directly from a slow disc.
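
A minimal sketch of the pack-file pattern, with Python's zipfile standing in for whatever archive format a given engine actually uses:

```python
import io
import zipfile

# Many small "asset" files (placeholder contents for illustration).
assets = {f"sprites/frame_{i}.dat": bytes(1000) for i in range(100)}

# Offline build step: pack everything into one compressed archive.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    for name, data in assets.items():
        zf.writestr(name, data)

# Load time: one archive open and decompression, instead of one file
# open (and its metadata overhead) per asset.
with zipfile.ZipFile(buf) as zf:
    loaded = {name: zf.read(name) for name in zf.namelist()}

assert loaded == assets
```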

Someone|4 years ago

A variation on this is also still extremely common in the case where the 'file system' is a web server and the program accessing it is a web browser (https://css-tricks.com/css-sprites/)

In that case, there typically isn't additional explicit compression (1). The main gain is in decreasing the number of HTTP requests.

(1) The image itself may have inherent compression, which may improve when combining images with similar content, and the web server may be configured to use compression; but the first typically isn't a big win, and the second is independent of this strategy.

magicalhippo|4 years ago

Everything old is new again, so maybe you're just in time. Considering we have NVMe drives which can read and write at 4-5 GB/s [1], that whole equation changes again...

[1]: https://www.anandtech.com/bench/SSD21/3017

1_player|4 years ago

zstd at compression level 1 can do ~2 GB/s per core, and as processors gain more and more cores, compressing data by default becomes a valid proposition.

In fact, if you install Fedora 35 on btrfs, zstd:1 is enabled by default, using fs-level heuristics to decide when and when not to compress. That reduces write amplification on SSDs and gains some space for free with negligible performance impact, which is nice.

My 8GB ~/src directory on encrypted btrfs on NVMe uses 6GB on disk and I can easily saturate the link while reading from it. Computers are plenty fast.
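
A rough feel for the "fast level" trade-off, using stdlib zlib at level 1 as a stand-in (zstd isn't in the Python standard library, and btrfs's zstd:1 is far faster than zlib; this only illustrates the ratio-vs-effort shape):

```python
import zlib

# Source trees compress well: repetitive, text-heavy data.
data = b'fn main() { println!("hello"); }\n' * 10_000

fast = zlib.compress(data, level=1)   # fast level: less CPU, looser ratio
tight = zlib.compress(data, level=9)  # slow level: more CPU, tighter ratio

print(len(data), len(fast), len(tight))
# Both shrink the data dramatically; a filesystem default only needs a
# "good enough" ratio at very high speed, hence a level-1 default.
```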

mdp2021|4 years ago

There still exist network connections at 80 Kb/s, and they still serve data repositories.

bob1029|4 years ago

This idea is very powerful. Creating small compressed batches of transactions also lets you write to disk in terms of "transactions per block I/O". When dealing with storage media that wears out at block granularity, this can save substantial device lifetime.
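
A sketch of the batching idea, greedily packing transactions until the compressed batch no longer fits in one block (zlib as a generic stand-in codec; block size, record format, and function names are all hypothetical):

```python
import json
import zlib

BLOCK = 4096  # hypothetical device block size in bytes

def pack_block(transactions):
    """Greedily pack transactions into one compressed block-sized write.

    Returns (compressed_block, count) -- count being the "transactions
    per block I/O" figure. Sketch only: a real log format would add
    framing, checksums, and padding out to the block boundary.
    """
    packed = []
    for tx in transactions:
        candidate = zlib.compress(json.dumps(packed + [tx]).encode(), level=1)
        if len(candidate) > BLOCK:
            break
        packed.append(tx)
    # Recompressing the accepted list reproduces the last fitting candidate.
    return zlib.compress(json.dumps(packed).encode(), level=1), len(packed)

txs = [{"id": i, "op": "credit", "amount": 100 + i} for i in range(500)]
block, n = pack_block(txs)
print(n, "transactions fit in one", BLOCK, "byte block write")
```

Because the batch is compressed, each block write amortizes its wear over many transactions instead of one.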