Seems reductive. Some applications require longer context; others need fast tokens/s. Consider it a multidimensional Pareto frontier you can optimize along.
It's not just that some applications absolutely require it; many hugely benefit from more context. A large part of LLM engineering for real-world problems revolves around structuring the context and selectively providing the information needed while filtering out the irrelevant. If you can just dump data in without preprocessing, it saves a huge amount of development time.
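A minimal sketch of the "selectively providing the information needed" step mentioned above: score candidate chunks against the query and greedily pack the best ones into a budget, rather than dumping everything into the prompt. All names here are hypothetical, and the naive keyword-overlap scoring is purely illustrative (real systems would use embeddings or a reranker).

```python
def score(query: str, chunk: str) -> float:
    """Crude relevance score: fraction of query words present in the chunk."""
    q = set(query.lower().split())
    c = set(chunk.lower().split())
    return len(q & c) / len(q) if q else 0.0

def select_context(query: str, chunks: list[str], budget_words: int) -> list[str]:
    """Greedily pack the highest-scoring relevant chunks into a word budget."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    picked, used = [], 0
    for ch in ranked:
        n = len(ch.split())
        if score(query, ch) > 0 and used + n <= budget_words:
            picked.append(ch)
            used += n
    return picked
```

With a huge context window, the filtering step above can often be skipped entirely; with a small one, some version of it becomes the bulk of the engineering work.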
Depending on the application, I think “without preprocessing” is a huge assumption here.
LLMs typically do a terrible job of weighting poor-quality context against high-quality context, and filling an XL context with unstructured junk while expecting the model to sort it out for you is unlikely to end well.
In my own experience you quickly run into jarring tangents or “ghosts” of unrelated ideas that start to shape the main thread of consciousness and resist steering attempts.
sigmoid10|3 months ago
cronin101|3 months ago