top | item 46739209


westoncb | 1 month ago

Interesting that compaction is done using an encrypted message that "preserves the model's latent understanding of the original conversation":

> Since then, the Responses API has evolved to support a special /responses/compact endpoint that performs compaction more efficiently. It returns a list of items that can be used in place of the previous input to continue the conversation while freeing up the context window. This list includes a special type=compaction item with an opaque encrypted_content item that preserves the model’s latent understanding of the original conversation. Now, Codex automatically uses this endpoint to compact the conversation when the auto_compact_limit is exceeded.
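The client-side flow the quote describes can be sketched roughly as below. This is a simulation, not the real SDK: `compact()` stands in for a POST to /responses/compact, and `AUTO_COMPACT_LIMIT` here counts items rather than tokens; only the endpoint path, the `type="compaction"` item, and the `encrypted_content` field come from the post.

```python
# Hypothetical sketch of automatic compaction. compact() is a stub for
# the real POST /responses/compact call described in the post.

AUTO_COMPACT_LIMIT = 4  # stand-in for a token budget; here we count items


def compact(items):
    """Stub for the endpoint: the real API returns a list of items,
    including a type="compaction" item whose encrypted_content opaquely
    preserves the model's latent understanding of the conversation."""
    return [{"type": "compaction", "encrypted_content": "<opaque-blob>"}]


def append_turn(history, turn):
    history.append(turn)
    if len(history) > AUTO_COMPACT_LIMIT:
        # Over the limit: replace the long history with the compacted
        # items and continue the conversation from there.
        history = compact(history)
    return history


history = []
for i in range(6):
    history = append_turn(history, {"type": "message", "content": f"turn {i}"})
# history now starts with a compaction item followed by the latest turn.
```

The point of the design is that the replacement items are opaque: the client never needs to understand what the compacted state contains, only to pass it back as input.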


icelancer|1 month ago

Their compaction endpoint is far and away the best in the industry. Claude's has to be dead last.

nubg|1 month ago

Help me understand: how is a compaction endpoint not just a prompt + JSON dump of the message history? I'd understand if the prompt were the secret sauce, but you make it sound like there's more to a compaction system than just a clever prompt?

swalsh|1 month ago

Is it possible to use the compaction endpoint independently? I have my own agent loop I maintain for my domain-specific use case. We built a compaction system, but I imagine this performs better.

__jl__|1 month ago

Yes, you can, and I really like it as a feature. But it ties you to OpenAI…

westoncb|1 month ago

I would guess you can if you're using their Responses api for inference within your agent.
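If you are already on the Responses API, a direct call might be shaped like the sketch below. Only the endpoint path comes from the post; the request field names (e.g. `"input"`) are assumptions, so check the API reference before relying on them. The sketch only builds the request rather than sending it.

```python
# Sketch of calling the compaction endpoint directly over HTTP.
# The path /responses/compact is from the post; the body's field
# names are assumptions for illustration.
import json


def build_compact_request(api_key, input_items):
    return {
        "url": "https://api.openai.com/v1/responses/compact",
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        # "input" mirrors the Responses API's input list; assumed here.
        "body": json.dumps({"input": input_items}),
    }


req = build_compact_request("sk-...", [{"type": "message", "content": "hi"}])
```

The returned items would then be substituted for the previous input on your next inference call, which is also why (as noted below) this only helps if your loop is running against OpenAI's models.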

jswny|1 month ago

How does this work for other models that aren’t OpenAI models?

westoncb|1 month ago

It wouldn’t work for other models: if the state is encoded in a latent representation specific to OpenAI’s own models, no other model can decode it.