charleshn | 6 months ago

> We cannot add more compute to a given compute budget C without increasing data D to maintain the relationship.

> We must either (1) discover new architectures with different scaling laws, and/or (2) compute new synthetic data that can contribute to learning (akin to dreams).

Of course we can; this is a non-issue.

See e.g. AlphaZero [0], which is 8 years old at this point, and any modern RL training using synthetic data, e.g. DeepSeek-R1-Zero [1].

[0] https://en.m.wikipedia.org/wiki/AlphaZero

[1] https://arxiv.org/abs/2501.12948

jeremyjh|6 months ago

AlphaZero trained itself through chess games that it played against itself. Chess positions have something very close to an objective truth about their evaluation: the rules are clear and bounded, and winning is measurable. How do you achieve this for a language model?

Yes, distillation is a thing but that is more about compression and filtering. Distillation does not produce new data in the same way that chess games produce new positions.

charleshn|6 months ago

You can have a look at the DeepSeek paper, in particular section "2.2 DeepSeek-R1-Zero: Reinforcement Learning on the Base Model".

But generally the idea is that you need some notion of reward, verifiers, etc.

Works really well for maths, algorithms, and many things actually.
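To make the "reward, verifiers" idea concrete, here is a minimal sketch of a rule-based reward for maths problems. It is not the DeepSeek implementation: the function names, the `\boxed{...}` answer convention, and the exact-match comparison are all assumptions chosen for illustration.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text inside the last \\boxed{...}, or None if there is none.

    Assumption: the model is asked to put its final answer in \\boxed{...}.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, reference: str) -> float:
    """Hypothetical reward: 1.0 if the final answer matches the reference, else 0.0."""
    answer = extract_final_answer(completion)
    if answer is None:
        # No parseable answer, so nothing can be verified.
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0
```

The point is that the reward comes from a cheap automatic check rather than from human-written text, so the model can generate as many attempts as compute allows and still get a training signal.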

See also this very short essay/introduction: https://www.jasonwei.net/blog/asymmetry-of-verification-and-...

That's why we have IMO gold level models now, and I'm pretty confident we'll have superhuman models for mathematics, algorithms, etc. before long.

Now, domains which are very hard to verify (think e.g. theoretical physics) are another story.

voxic11|6 months ago

Synthetic data is already widely used for training in the programming and mathematics domains, where automated verification is possible. Here is an example of an open source, verified, synthetic reasoning dataset: https://www.primeintellect.ai/blog/synthetic-1
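As a rough sketch of what "automated verification" can mean for programming data: generate many candidate solutions, run each against known test cases, and keep only the ones that pass. This is not the pipeline behind the linked dataset; the helper names and the toy candidates are hypothetical, and a real system would sandbox the execution of model-written code.

```python
from typing import Callable, Iterable, List, Tuple

Candidate = Tuple[str, Callable[[int], int]]
TestCase = Tuple[int, int]  # (input, expected output)


def passes_tests(fn: Callable[[int], int], tests: List[TestCase]) -> bool:
    """Return True only if fn gives the expected output on every test case."""
    for arg, expected in tests:
        try:
            if fn(arg) != expected:
                return False
        except Exception:
            # A crashing candidate counts as a failed verification.
            return False
    return True


def filter_verified(candidates: Iterable[Candidate], tests: Iterable[TestCase]) -> List[str]:
    """Keep the names of the candidates that pass all tests."""
    test_list = list(tests)
    return [name for name, fn in candidates if passes_tests(fn, test_list)]


# Toy stand-ins for model-generated solutions to "square the input".
candidates = [
    ("good", lambda x: x * x),
    ("bad", lambda x: x + x),
    ("crash", lambda x: 1 // 0),
]
verified = filter_verified(candidates, [(2, 4), (3, 9)])
```

Only the surviving samples would go into the training set, which is why this approach works where checking is cheap.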

scotty79|6 months ago

Simple, you just need to turn language into a game.

You make models talk to each other, create puzzles for each other to solve, ask each other to make cases and evaluate how well they were made.

Will some of it look like the ramblings of pre-scientific philosophers (or modern ones, because philosophy never progressed after science left it in the dust)?

Sure! But human culture was once there too. And we pulled ourselves out of this nonsense by the bootstraps. We didn't need to be exposed to 3 alien internets with higher truth.

It's really a miracle that AIs got as much as they did from the purely human-generated, mostly garbage text we cared to write down.

cs702|6 months ago

> Of course we can, ... synthetic data ...

That's option (2) in the parent comment: synthetic data.