(no title)
sarkarsh | 18 hours ago
esafak's cache economics point is underrated. With prompt caching, verbose context that gets reused is basically free. If compression breaks cache continuity you might save tokens while spending more money.
The deeper issue is that most MCP tools do SELECT * when they should return summaries with drill-down. That's a protocol design problem, not a compression problem.
qeternity|18 hours ago
But it's not. It might be discounted cost-wise, however it will still degrade attention and make generation slower/more computationally expensive even if you have a long prefix you can reuse during prefill.