top | item 42985515

jbay808 | 1 year ago

I was interested in this question, so I trained NanoGPT from scratch to sort lists of random numbers. It didn't take long to succeed with arbitrary reliability, even given only an infinitesimal fraction of the space of random and sorted lists as training data. Since I could check the correctness of a sort mechanically, I could be certain that I wasn't projecting my own beliefs onto its response, or reading more into the output than was actually there.

That settled this question for me.
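A rough sketch of the setup described above, in Python. The list length, value range, and "src -> tgt" text format are assumptions for illustration, not details from the comment; the key point is that correctness of any sampled completion is checkable exactly, with no interpretation required.

```python
import random

def make_example(n=8, lo=0, hi=99, rng=random):
    """One training example: a random list and its ascending sort."""
    xs = [rng.randint(lo, hi) for _ in range(n)]
    src = " ".join(map(str, xs))
    tgt = " ".join(map(str, sorted(xs)))
    return f"{src} -> {tgt}\n"

def is_correct(line):
    """Mechanical check of a model output: same multiset, ascending order."""
    src, tgt = line.strip().split(" -> ")
    xs = list(map(int, src.split()))
    ys = list(map(int, tgt.split()))
    return ys == sorted(xs)

# A corpus like this could be fed to a character-level NanoGPT run; at
# evaluation time, is_correct() verifies completions on held-out lists.
corpus = [make_example() for _ in range(10_000)]
assert all(is_correct(line) for line in corpus)
```

Because `is_correct` is a hard yes/no test, there is no room for the evaluator's bias to creep into the grading.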


dartos|1 year ago

I don’t really understand what you’re testing for?

Language, as a problem, doesn’t have a discrete solution like the question of whether a list is sorted or not.

Seems weird to compare one to the other, unless I’m misunderstanding something.

What’s more, the entire notion of a sorted list was provided to the LLM by how you organized your training data.

I don’t know the details of your experiment, but did you note whether the lists were sorted ascending or descending?

Did you compare which kind of sorting was most common in the output and in the training set?

Your bias might have snuck in without you knowing.

jbay808|1 year ago

> I don’t really understand what you’re testing for?

For this hypothesis: The intelligence illusion is in the mind of the user and not in the LLM itself.

And yes, the notion was provided by the training data. It indeed had to learn that notion from the data, rather than parrot memorized lists or excerpts from the training set, because the problem space is too vast and the training set too small to brute force it.
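The "too vast to brute force" point is easy to check with back-of-the-envelope arithmetic. The list length, value range, and corpus size below are illustrative assumptions, not figures from the comment:

```python
# Hypothetical setup: lists of 8 values drawn from 0..99.
list_len = 8
vocab = 100

n_inputs = vocab ** list_len  # 100^8 = 10^16 possible input lists
train_size = 10_000           # a plausible small training corpus

fraction_seen = train_size / n_inputs
print(f"{n_inputs:.0e} possible lists; training covers {fraction_seen:.0e}")
# The model sees only ~1e-12 of the input space, so correct sorting of
# held-out lists cannot be explained by memorized training examples.
```

Even much shorter lists or smaller vocabularies leave the training set as a vanishing fraction of the input space.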

The output lists were sorted in ascending order, the same way that I generated them for the training data. The sortedness is directly verifiable without me reading between the lines to infer something that isn't really there.

IshKebab|1 year ago

A large number of commenters are under the illusion that LLMs are "just" stochastic parrots and can't generalise to inputs not seen in their training data. He was demonstrating that this isn't the case.

tossandthrow|1 year ago

The commenter is merely showing that LLMs are able to approximate arbitrary functions, with sorting as one example.

It is nothing new and has been well established in the literature since the 90s.

The shared article really is not worth the read and mostly reveals an author who does not know what he writes about.

manmal|1 year ago

Have you considered that the nature of numeric characters is just so predictable that they can be sorted without actually understanding their numerical value?
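One way to read this objection: digit strings can be ordered character by character, without any notion of numeric value. But lexicographic and numeric order genuinely diverge once numbers have unequal digit counts, which gives a way to tell the two apart. A quick illustration (not taken from the thread's experiment):

```python
xs = ["9", "10", "100", "2"]

lexicographic = sorted(xs)        # character-by-character comparison
numeric = sorted(xs, key=int)     # true numerical value

print(lexicographic)  # ['10', '100', '2', '9']
print(numeric)        # ['2', '9', '10', '100']
# A model exploiting only the surface predictability of digit characters
# would drift toward the lexicographic order on mixed-length inputs.
```

So if the training lists mixed one-, two-, and three-digit numbers, correct ascending output would rule out a purely character-level shortcut.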

jbay808|1 year ago

Can you say more precisely what you mean?