top | item 36556748

Ask HN: What's the practical use of larger context LLMs?

3 points | stevemadere | 2 years ago

I see that lots of folks are working on building LLMs that can handle more context without breaking the bank on GPU.

Is there a real practical reason for that right now or is it just something that everybody agrees is obvious without economic justification?

5 comments


ftxbro|2 years ago

So we have had LLMs with small contexts, like one or two words or a dozen letters, for a long time, ever since Laplace or Shannon or Markov. They were called Markov chains. No one really guessed this (although it was known to be theoretically possible in the sense of AI-completeness), but it turns out that longer contexts, even in practice, unlock so many cognitive capabilities bordering on superhuman. If context length is the main difference between the Markov chains we have been using for autocomplete for decades and the models that will beat you at the GREs or the bar exams or every AP test, then it is natural to be curious what happens when the context gets even longer.

stevemadere|2 years ago

No specific practical problems, though? This looks to me a lot like "it's amazing, we want more amazing" rather than "if we had it, we could solve this specific practical problem that people have long wanted to solve without ever considering an LLM as a possible solution."

seanthemon|2 years ago

Longer context means more memory: effectively, a longer history the LLM remembers. One issue I'm having: function calling, say, works wonderfully, but the context window is tight even with 16k tokens. With a bigger context, the sky is the limit.
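A rough sketch of the budget squeeze described above: fixed overhead (system prompt, function schemas, space reserved for the reply) eats into a 16k window before any conversation history fits. The function names, the 4-characters-per-token heuristic, and the reserved-reply size are my assumptions for illustration, not from the thread; real counts require the model's actual tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token
    (a common heuristic for English with GPT-style tokenizers)."""
    return max(1, len(text) // 4)


def remaining_budget(window: int, system_prompt: str,
                     function_schemas: list[str],
                     reserved_for_reply: int = 1024) -> int:
    """Tokens left for conversation history after fixed overhead."""
    used = estimate_tokens(system_prompt)
    used += sum(estimate_tokens(s) for s in function_schemas)
    return window - used - reserved_for_reply


# Hypothetical setup: twenty function schemas of ~1k characters each
# quickly consume a large share of a 16k-token window.
schemas = ["x" * 1000] * 20          # placeholder schema text
left = remaining_budget(16384, "You are a helpful assistant.", schemas)
print(left)                          # tokens left for actual history
```

With ~5,000 tokens of schema overhead plus the reserved reply, only about two-thirds of the window remains for code or conversation, which is the tightness the comment describes.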

stevemadere|2 years ago

So, specifically, you are saying that an LLM coding assistant currently gets confused when working on a large source file, but if it had room for more context, you could get better help writing code because it would understand the entire module. Correct?