top | item 41601085

plewd | 1 year ago

Is LeCun's Law even a thing? Searching for it doesn't yield many results, except for an HN comment where it has a different definition. I guess it could be from some obscure paper, but with how poorly it's documented it seems weird to bring it up in this context.


YeGoblynQueenne | 1 year ago

I think the OP may be referring to this slide that Yann LeCun has presented on several occasions:

https://youtu.be/MiqLoAZFRSE?si=tIQ_ya2tiMCymiAh&t=901

To quote from the slide:

  * Probability e that any produced token takes us outside the set of correct answers
  * Probability that answer of length n is correct
  * P(correct) = (1-e)^n
  * This diverges exponentially
  * It's not fixable (without a major redesign)
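
Plugging small numbers into the slide's formula shows how quickly independent per-token errors compound (a sketch of the formula as quoted, assuming e is a fixed, independent per-token error probability):

```python
# Sketch of the slide's formula: P(correct) = (1 - e)^n,
# assuming each token independently has error probability e.
def p_correct(e: float, n: int) -> float:
    return (1 - e) ** n

# Even a small per-token error rate compounds quickly over long outputs.
print(p_correct(0.01, 100))   # ~0.366
print(p_correct(0.01, 1000))  # ~4e-5
```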

atq2119 | 1 year ago

Doesn't that argument make the fundamentally incorrect assumption that the space of produced output sequences has pockets where all output sequences with a certain prefix are incorrect?

Design your output space in such a way that every prefix has a correct completion, and this simplistic argument no longer applies. Humans do this in practice by saying "hold on, I was wrong, here's what's right".

Of course, there's still a question of whether you can get the probability mass of correct outputs large enough.
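
A toy two-state model (my own sketch, not from the comment) illustrates the point: if a wrong prefix can be recovered from with some per-step probability r, P(correct) settles at a constant instead of decaying to zero:

```python
# Toy two-state sketch (hypothetical model): at each token the output is either
# on a recoverable track or a wrong one. With probability e you fall off the
# track; with correction probability r ("hold on, I was wrong") you get back on.
def p_correct_with_correction(e: float, r: float, n: int) -> float:
    p_ok = 1.0  # probability of being on the correct track after each token
    for _ in range(n):
        p_ok = p_ok * (1 - e) + (1 - p_ok) * r
    return p_ok

print(p_correct_with_correction(0.01, 0.0, 1000))  # no correction: ~(1-e)^n, near 0
print(p_correct_with_correction(0.01, 0.2, 1000))  # with correction: ~r/(r+e) ~ 0.95
```

With r = 0 this reduces to the slide's (1-e)^n; any nonzero correction probability gives a nonzero steady state r/(r+e).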

sharemywin | 1 year ago

Wouldn't this apply to all prediction machines that make errors?

Humans make bad predictions all the time, but we still seem to manage to do some cool stuff here and there.

Part of an agent's architecture will be for it to minimize e and then ground the prediction loop against a reality check.

Making LLMs bigger gets you a lower e with scale of data and compute, but you will still need it to check against reality. Test-time compute will also play a role, as it can run through multiple scenarios and "search" for an answer.
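
The "search plus reality check" idea can be sketched as best-of-k sampling, assuming (hypothetically) a reliable verifier that can tell a correct candidate from a wrong one:

```python
# Sketch of why test-time "search" helps, assuming a reality check that can
# verify candidate answers: sample k independent attempts, each correct with
# probability p, and keep any one that passes the check.
def p_success_best_of_k(p: float, k: int) -> float:
    return 1 - (1 - p) ** k

print(p_success_best_of_k(0.3, 1))   # 0.3
print(p_success_best_of_k(0.3, 10))  # ~0.97
```

The failure probability shrinks exponentially in the number of verified attempts, the mirror image of the slide's compounding-error argument.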

roboboffin | 1 year ago

Is this similar to the effect I have seen where two different LLMs talking to each other tend to descend into nonsense? A single error in one LLM's output then pushes the other LLM out of distribution.

A kind of oscillatory effect, where the train of tokens moves further and further out of the distribution of correct tokens.

hackerlight | 1 year ago

It's quite fitting that the topic of this thread is self-correction. Self-correction is a trivial existence proof that refutes what LeCun is saying, because all the LLM has to say is "I made a mistake, let me start again".

ziofill | 1 year ago

Doesn’t this assume that the probability of a correct answer is iid? It can’t be that simple.

littlestymaar | 1 year ago

> * P(correct) = (1-e)^n
> * This diverges exponentially

I don't get it: 1-e is between 0 and 1, so (1-e)^n converges to zero. Also, a probability cannot diverge, since it's bounded by 1!

I think the argument is that the error probability 1 - (1-e)^n converges to 1, which is what the law is about.
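
A quick numeric check of that reading, assuming e is a small per-token error probability as on the slide:

```python
# With 0 < e < 1, P(correct) = (1-e)^n -> 0, so the error probability
# 1 - (1-e)^n -> 1. "Diverges" on the slide is loose phrasing for this
# exponential decay, not divergence in the strict sense.
e, n = 0.05, 200
p_correct = (1 - e) ** n
p_error = 1 - p_correct
print(p_correct)  # ~3.5e-5
print(p_error)    # ~0.99997
```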

slashdave | 1 year ago

Simplistic, since it assumes probabilities are uncorrelated, when they clearly aren't. Also, there are many ways of writing the correct solution to a problem (you do not need to replicate an exact sequence of tokens).
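
A toy extreme of the correlation point (my own sketch, not from the comment): if per-token errors were perfectly correlated, driven by one shared "bad sample" event, P(correct) would be 1-e for any length n rather than (1-e)^n:

```python
import random

# Independent errors: each of n tokens fails independently with probability e.
def sample_correct_independent(e, n):
    return all(random.random() > e for _ in range(n))

# Perfectly correlated errors: one shared event decides every token at once.
def sample_correct_correlated(e, n):
    return random.random() > e

random.seed(0)
trials = 20000
e, n = 0.01, 200
ind = sum(sample_correct_independent(e, n) for _ in range(trials)) / trials
cor = sum(sample_correct_correlated(e, n) for _ in range(trials)) / trials
print(ind)  # ~ (1-e)^n, about 0.13
print(cor)  # ~ 1-e, about 0.99
```

Real token-level errors sit somewhere between these extremes, which is exactly why the (1-e)^n formula is only an upper bound on how badly things compound.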

vjerancrnjak | 1 year ago

“Label bias” or “observation bias” is a phenomenon where going outside the learned path leaves little room for error correction. LeCun talks about the lack of joint learning in LLMs.

whimsicalism | 1 year ago

It’s a thing in that he said it, but it’s not an actual law, and it has several obvious logical flaws. It applies just as well to human utterances.