top | item 38389729


wbhart | 2 years ago

How on earth could you evaluate the scaling path with so little information? That's my point. You can't possibly know that a technology can solve a given kind of problem if, so far, it can only solve a completely different kind of problem which is largely unrelated!

Saying that performance on grade-school problems is predictive of performance on complex reasoning tasks (including theorem proving) is like claiming that a new kind of mechanical engine with 90% efficiency can be scaled 10x.

These kinds of scaling claims drive investment, I get it. But to someone who understands (and is actually working on) the actual problem that needs solving, this kind of claim is perfectly transparent!


uoaei | 2 years ago

Any claim of objective, quantitative measurement of "scaling" in LLMs is voodoo snake oil when it's backed by benchmarks consisting of "which questions does it answer correctly". Any machine learning PhD will admit this, albeit only in a quiet corner of a noisy bar, after a few more drinks than is advisable, when they're earning money from companies that claim scaling wins on such benchmarks.

dwaltrip | 2 years ago

For the current generative AI wave, this is how I understand it:

1. The scaling path is decreased val/test loss during training.

2. We have seen multiple times that large decreases in this loss result in very impressive improvements in model capability across a diverse set of tasks (e.g. gpt-1 through gpt-4, and many other examples).

3. By now, there is tons of robust data demonstrating really nice relationships between model size, quantity of data, length of training, quality of data, etc., and decreased loss. Evidence keeps building that most multi-billion-param LLMs are probably undertrained, perhaps significantly so.

4. Ergo, we should expect continued capability improvement with continued scaling. Make a bigger model, get more data, get higher data quality, and/or train for longer and we will see improved capabilities. The graphs demand that it is so.
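The relationships in point 3 are usually expressed as power laws in model size and data. A minimal sketch of one such fit, using the Chinchilla-style form (coefficients as reported by Hoffmann et al.; they are fits to one training setup, so treat them purely as illustrative):

```python
# Chinchilla-style scaling law: predicted loss as a function of
# parameter count N and training tokens D. Coefficients are the
# published fits from Hoffmann et al. (2022) -- illustrative only.
E = 1.69          # irreducible loss floor
A, ALPHA = 406.4, 0.34   # model-size term
B, BETA = 410.7, 0.28    # data term

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Loss falls as a power law in both model size and data quantity."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Scaling up either axis lowers the predicted loss:
small = predicted_loss(1e9, 20e9)      # ~1B params, 20B tokens
large = predicted_loss(70e9, 1.4e12)   # ~70B params, 1.4T tokens
assert large < small
```

The shape of the curve is the point: each term decays smoothly toward the floor E, which is why extrapolating "the lines" has so far kept paying off, and also why returns diminish as either term approaches zero.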

---

This is the fundamental scaling hypothesis that labs like OpenAI and Anthropic have been operating off of for the past 5+ years. They looked at the early versions of the curves mentioned above, extended the lines, and said, "Huh... These lines are so sharp. Why wouldn't it keep going? It seems like it would."

And they were right. The scaling curves may break at some point. But they don't show indications of that yet.

Lastly, all of this is largely just taking existing model architectures and scaling up. Neural nets are a very young technology. There will be better architectures in the future.

jacquesm | 2 years ago

We're at the point now where the harder problem is obtaining the high quality data you need for the initial training in sufficient quantities.

neilk | 2 years ago

I didn’t say “certain success”, I said “interesting”

OOPMan | 2 years ago

Honestly, OpenAI seem more like a cult than a company to me.

The hyperbole that surrounds them fits the mould nicely.

hutzlibu | 2 years ago

They did build the most advanced LLM tool, though.