top | item 45210353

Show HN: FSP2 Tested on excerpt "Romeo and Juliet" impressive compresion results

2 points| Forgret | 5 months ago

Hi HN, I want to share my updated FSP (Find Similar Patterns) v2 text compression algorithm. I tested it on a non-trivial excerpt from Romeo and Juliet, and it achieved impressive results: original size 437 bytes, compressed size 358 bytes, compression ratio 1.22. Unlike traditional methods like LZMA or Huffman, FSP v2 searches for repeating 3–5 character patterns, storing references (REF) alongside literal characters (LITERAL). This allows it to compress real-world text, maintain lossless decompression, and achieve compressed sizes smaller than the original. The algorithm works on any byte stream or text and can scale to larger files, potentially outperforming classical compression on texts with repetitive or near-repetitive patterns. Code and implementation details are available upon request.

GitHub: https://github.com/Ferki-git-creator/fsp

Website(more info): https://ferki-git-creator.github.io/fsp/

If I made a mistake somewhere, please tell me.

4 comments

southwindcg|5 months ago

Before we say too much about the performance of your algorithm, encoding/decoding speed and memory use must be considered, especially with very large inputs.

Note that on this particular small sample of text, Zstandard `zstd -13` compresses it to 288 bytes, and with default settings, 292 bytes. Brotli using default settings compresses it to 236 bytes.

Forgret|5 months ago

Thanks, you’re absolutely right — performance needs to be tested on large inputs with proper speed and memory profiling. I’ll run FSP on bigger datasets and compare it directly with zstd, brotli, gzip, etc. If needed, I’ll improve the algorithm to reduce overhead and make it scale better. This was just an early proof-of-concept, but I agree the next step is serious benchmarking.