The fundamental idea that modern LLMs can only ever remix, even if it's technically true (doubt), only says to me that all knowledge is only ever a remix, perhaps even mathematically so. Anyone who still insists these are statistical parrots or whatever is just going to regret that position in the future.
omnicognate|1 month ago
The subsequent argument that "LLMs only remix" => "all knowledge is a remix" seems absurd, and I'm surprised to have seen it now more than once here. Humanity didn't get from discovering fire to launching the JWST solely by remixing existing knowledge.
ramraj07|1 month ago
Even acknowledging it is interpolation, models can extrapolate slightly without making things up, within the range where the model still applies. Who's to say what this range is for an LLM operating in thousand-dimensional space? As far as I can tell, the main limiters on LLM creativity are guardrails we put in place for safety and usefulness.
And what exactly is your proof that human ingenuity is not just pattern matching? I'm sure a hypothesis can be put forward that fire was discovered by adding up all the facts people of that time knew and stumbling on something that put it all together. Sounds like knowledge remix + slight extrapolation to me.
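To make the interpolation-vs-extrapolation distinction concrete, here's a toy sketch (my own illustration, nothing to do with any actual LLM): fit a straight line to samples of y = x^2 on [0, 1], then query inside and outside the training range. Inside, the linear fit is a decent local approximation; far outside, its error blows up. That's the sense in which "slight" extrapolation can work while large extrapolation fails.

```python
# Training data: samples of the true function y = x^2 on [0, 1].
xs = [i / 10 for i in range(11)]
ys = [x * x for x in xs]

# Ordinary least-squares fit of a straight line y = slope*x + intercept.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx

def predict(x):
    return slope * x + intercept

# In-range query (interpolation): the error is small.
print(abs(predict(0.5) - 0.25))
# Out-of-range query (extrapolation to x = 3): the error is large.
print(abs(predict(3.0) - 9.0))
```

Of course, whether an LLM's "range where the model still applies" is narrow or vast in its thousand-dimensional space is exactly the open question.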
mrbungie|1 month ago
You know this is a false dichotomy, right? You can consider LLMs statistical parrots and at the same time take advantage of them.
theshrike79|1 month ago
Not every solution needs to be unique; in many cases, "remixing" existing solutions in a unique way is better and faster.
NitpickLawyer|1 month ago
In a stage interview (a bit after the "Sparks of AGI in GPT-4" paper came out) he made 3 statements:
a) LLMs can't do math. They can trick us with poems and subjective prose, but at objective math they fail.
b) they can't plan
c) by the nature of their autoregressive architecture, errors compound. So a wrong token will make their output irreversibly wrong and spiral out of control.
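For what it's worth, argument (c) is usually stated as a simple independence calculation: if each generated token is wrong with probability eps, the chance the whole sequence stays error-free is (1 - eps)^n, which decays exponentially with length. A quick sketch (the independence assumption is exactly the part that turned out not to hold in practice):

```python
# The compounding-error argument in one line: assuming each token
# errs independently with probability eps, the probability that an
# n-token output contains no errors decays exponentially in n.
def p_all_correct(eps: float, n: int) -> float:
    return (1 - eps) ** n

print(p_all_correct(0.01, 100))   # roughly 0.37 at 100 tokens
print(p_all_correct(0.01, 1000))  # near zero at 1000 tokens
```

The real systems dodge this because errors aren't independent and models can self-correct mid-generation, which is presumably why the prediction didn't pan out.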
I think we can safely say that all of these turned out to be wrong. It's very possible that he meant something more abstract and technical at its core, but in real life all of these things were overcome. So, not a luddite, but also not a seer.
CuriouslyC|1 month ago
This is orthogonal to the issue of whether all ideas are essentially "remixes." For the record I agree that they are.