posnet | 7 months ago
Especially with Gemini Pro and long-form textual references, providing many documents in a single context window gives worse answers than having it summarize the documents first, asking the question against the summaries only, and then providing the full text of individual sub-documents on request (RAG-style or just a simple agent loop).
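A rough sketch of that summarize-first flow, in case it helps. It is not tied to any particular SDK: `llm` is a hypothetical callable that takes a prompt string and returns the model's reply, and the `READ <name>` convention for requesting a full document is an assumption made up for this illustration.

```python
from typing import Callable, Dict

# Hypothetical interface: prompt in, model reply out. Swap in a real client.
LLM = Callable[[str], str]


def summarize_all(docs: Dict[str, str], llm: LLM) -> Dict[str, str]:
    """Summarize each document in its own call, so no single call sees every full text."""
    return {
        name: llm(f"Summarize this document in a few sentences:\n\n{text}")
        for name, text in docs.items()
    }


def answer(question: str, docs: Dict[str, str], llm: LLM, max_rounds: int = 3) -> str:
    """Answer from summaries first; load a full document only when the model asks for it."""
    summaries = summarize_all(docs, llm)
    context = "\n".join(f"[{name}] {summary}" for name, summary in summaries.items())
    loaded = set()

    for _ in range(max_rounds):
        prompt = (
            "Answer the question using the context below. If you need the full "
            "text of a document, reply with exactly 'READ <name>'.\n\n"
            f"{context}\n\nQuestion: {question}"
        )
        reply = llm(prompt).strip()
        if not reply.startswith("READ "):
            return reply  # the model answered directly

        name = reply[len("READ "):].strip()
        if name not in docs or name in loaded:
            break  # unknown or repeated request: stop looping
        loaded.add(name)
        context += f"\n\n[full text of {name}]\n{docs[name]}"

    # Fallback: ask for a final answer with whatever context has been gathered.
    return llm(f"Answer using only this context.\n\n{context}\n\nQuestion: {question}").strip()
```

The point of the design is that documents the model never asks for stay out of the context entirely, so only their short summaries compete for attention.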
Similarly, I've personally noticed that Claude Code with Opus or Sonnet gets worse the more compactions happen. It's unclear to me whether the summary itself gets worse or the context window ends up holding a higher percentage of less relevant data, but even clearing the context and asking it to re-read the relevant files (even if they were mentioned and summarized in the compaction) gives better results.
zwaps|7 months ago
Long story short: Context engineering is still king, RAG is not dead
deadbabe|7 months ago
LLMs will need RAG one way or another. You can hide it from the user, but it still must be there.
tvshtr|7 months ago
risyachka|7 months ago
Xmd5a|7 months ago
Inviz|7 months ago
darepublic|7 months ago
irskep|7 months ago
The thing that would signal context rot is when you approach the auto-compact threshold. Am I thinking about this right?
0x457|7 months ago
bayesianbot|7 months ago
OccamsMirror|7 months ago
gonzric1|7 months ago
tough|7 months ago