Show HN: Steganography in natural language using LLM logit-rank steering
2 points | shevis | 1 month ago | github.com
> subtext-codec is a proof-of-concept codec that hides arbitrary binary data inside seemingly normal LLM-generated text. It steers a language model's next-token choices using the rank of each token in the model's logit distribution. With the same model, tokenizer, prefix, and parameters, the process is fully reversible -- enabling text that reads naturally while secretly encoding bytes.
Basically, it exploits the fact that an LLM's next-token logits are deterministic given the same model, prompt, and decoding parameters: the sender encodes bits by selecting tokens at specific ranks in the logit distribution, and the receiver recovers those bits by recomputing the ranks, yielding innocuous-looking stegotext that is hard to detect.
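A minimal sketch of the rank-based idea (not the actual subtext-codec implementation): a deterministic hash-based toy "model" stands in for an LLM, the encoder hides two bits per token by picking the token at that rank, and the decoder recomputes the same ranking to recover the bits. All names here (`VOCAB`, `scores`, `ranked`) are hypothetical.

```python
import hashlib

VOCAB = [f"tok{i}" for i in range(16)]
BITS_PER_TOKEN = 2  # hide 2 bits per token, so ranks 0-3 are used

def scores(prefix):
    """Deterministic stand-in for LLM logits: hash of prefix+token."""
    return {t: hashlib.sha256((prefix + t).encode()).digest()[0]
            for t in VOCAB}

def ranked(prefix):
    """Vocabulary sorted by descending score (ties broken by token)."""
    s = scores(prefix)
    return sorted(VOCAB, key=lambda t: (-s[t], t))

def encode(bits, prefix=""):
    """Append the token at rank = next 2 bits of the payload."""
    text = prefix
    for i in range(0, len(bits), BITS_PER_TOKEN):
        rank = int(bits[i:i + BITS_PER_TOKEN], 2)
        text += ranked(text)[rank] + " "
    return text

def decode(text, prefix=""):
    """Recompute each token's rank with the same model to get bits."""
    bits, cur = "", prefix
    for tok in text[len(prefix):].split():
        rank = ranked(cur).index(tok)
        bits += format(rank, f"0{BITS_PER_TOKEN}b")
        cur += tok + " "
    return bits

payload = "0110101100011110"
stegotext = encode(payload)
assert decode(stegotext) == payload
```

With a real LLM, `scores` would be replaced by a forward pass over the same model and tokenizer on both sides, and low ranks would be favored so the output still reads naturally; the toy hash model only demonstrates the reversibility property.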
anonymoushn | 1 month ago