The Wikipedia data dumps [0] are multistream bz2. This makes them relatively easy to partially ingest, and I'm happy to be able to remove the C dependency from the Rust code I have that deals with said dumps.
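The partial-ingest trick works because a multistream archive is just independently compressed bz2 streams concatenated together: given a byte offset (Wikipedia ships an index file mapping articles to offsets), you can decompress one stream without touching the rest. A minimal sketch using Python's stdlib `bz2` (the commenter's Rust code isn't shown; the two "article" strings and the offset lookup are illustrative stand-ins):

```python
import bz2

# Build a toy multistream bz2 blob: two independently compressed
# streams concatenated, mimicking Wikipedia's multistream dumps.
stream_a = bz2.compress(b"article one")
stream_b = bz2.compress(b"article two")
blob = stream_a + stream_b

# Given a stream's byte offset (in practice looked up from the
# dump's index file), decompress just that stream in isolation --
# no need to read the archive from the start.
offset = len(stream_a)  # hypothetical index lookup
decomp = bz2.BZ2Decompressor()
article = decomp.decompress(blob[offset:])
print(article)  # b'article two'
```

The decompressor stops at the end of its stream (`decomp.eof` becomes true), so trailing streams in the slice are simply ignored.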
bzip2 is still pretty good if you want to optimize for:
- better compression ratio than gzip
- faster compression than many better-than-gzip competitors
- lower CPU/RAM usage for the same compression ratio/time
This is a niche, but it does crop up sometimes. The downside to bzip2 is that it is slow to decompress, but for write-heavy workloads, that doesn't matter too much.
So? If I need to consume a resource compressed using bz2, I'm not just going to sit around and wait for them to use zstd. I'm going to break out bz2. If I can use a modern rewrite that's faster, I'll take every advantage I can get.
[0]: https://meta.wikimedia.org/wiki/Data_dump_torrents#English_W...
Philpax|8 months ago