sdesol | 5 months ago
Given how fast inference has become, and given the context window sizes most SOTA models now support, I think summarizing and having the LLM decide what is relevant is not that fragile at all for most use cases. This is what I do with my analyzers, which I talk about at https://github.com/gitsense/chat/blob/main/packages/chat/wid...
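The summarize-then-select pattern described above can be sketched roughly like this. This is a minimal illustration, not the linked analyzers' actual code: `llm()` is a hypothetical stand-in for a real model call, and `select_relevant` uses a naive keyword match where a real system would ask the model to rank the summaries against the query.

```python
# Sketch of the summarize-then-select pattern: summarize each document,
# then decide which summaries are relevant to a query.
# llm() is a hypothetical placeholder, NOT a real API.

def llm(prompt: str) -> str:
    """Stand-in for a real LLM call; here it just truncates the prompt."""
    return prompt[:80]

def summarize(doc: str) -> str:
    # In practice this would be a model-generated summary.
    return llm(f"Summarize: {doc}")

def select_relevant(summaries: dict[str, str], query: str) -> list[str]:
    # In practice the LLM would judge relevance; a keyword match
    # stands in for that step here.
    return [name for name, s in summaries.items() if query.lower() in s.lower()]

docs = {
    "auth.py": "Handles login, logout, and session tokens.",
    "billing.py": "Computes invoices and payment retries.",
}
summaries = {name: summarize(text) for name, text in docs.items()}
print(select_relevant(summaries, "login"))  # only auth.py mentions login
```

The point of the pattern is that only the selected documents' full text is fed into the final prompt, keeping the context small even when the corpus is large.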
adastra22 | 5 months ago
sdesol | 5 months ago
If you take the post-analysis process into consideration, which is what inference is trying to solve, is it an order of magnitude slower?
9rx | 5 months ago