I built a CLI tool in Rust that intercepts OpenAI’s streaming API and transforms every other token in real time. You can reverse, uppercase, mock, or add noise, all live, as the model streams.
Why?
> Most LLM work assumes prompt in, full response out.
> But what happens when you break the stream mid-flight?
This tool lets you:
- Intervene at the token level while the model responds
- Study how LLMs degrade semantically with corrupted output
- Do real-time interpretability research (token dependency, causal flow)
- Play with creative transformations in generative workflows
Tech:
- Written in Rust
- Streams directly from OpenAI’s chat API
- Fully async, low latency, on the order of 10k tokens/sec
- Works with any OpenAI model (e.g. GPT-3.5, GPT-4)
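The core transform can be sketched in a few lines of Rust. This is a minimal, hedged illustration, not the tool's actual code: the function name is mine, a slice stands in for the live SSE stream, and I assume odd-indexed tokens are the ones transformed.

```rust
// Sketch: apply a transform to every other token of a stream.
// In the real tool the tokens arrive asynchronously from OpenAI's
// chat API; here a slice stands in for that stream.
fn transform_every_other(tokens: &[&str], f: impl Fn(&str) -> String) -> Vec<String> {
    tokens
        .iter()
        .enumerate()
        .map(|(i, t)| {
            if i % 2 == 1 {
                f(t) // odd-indexed token: transformed (assumed convention)
            } else {
                t.to_string() // even-indexed token: passed through unchanged
            }
        })
        .collect()
}

fn main() {
    // Example: reverse every other token as it "streams" in.
    let stream = ["the", "quick", "brown", "fox"];
    let out = transform_every_other(&stream, |t| t.chars().rev().collect());
    println!("{}", out.join(" ")); // the kciuq brown xof
}
```

The same shape covers the other modes: swap the closure for `to_uppercase`, a mocking caser, or a noise injector.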
Example:
Use cases include:
- Token-level adversarial testing
- Semantic robustness analysis
- Experimental prompt steering
- Human-AI token collaboration
Install:
```bash
git clone https://github.com/yourusername/every-other-token-rust.git
cd every-other-token-rust
cargo build --release
```