top | item 42869501

(no title)

riantogo | 1 year ago

Why would it cast any doubt? If you can use o1 output to build a better R1. Then use R1 output to build a better X1... then a better X2.. XN, that just shows a method to create better systems for a fraction of the cost from where we stand. If it was that obvious OpenAI should have themselves done. But the disruptors did it. It hindsight it might sound obvious, but that is true for all innovations. It is all good stuff.

discuss

Imnimo|1 year ago

I think it would cast doubt on the narrative "you could have trained o1 with much less compute, and r1 is proof of that", if it turned out that in order to train r1 in the first place, you had to have access to bunch of outputs from o1. In other words, you had to do the really expensive o1 training in the first place.

(with the caveat that all we have right now are accusations that DeepSeek made use of OpenAI data - it might just as well turn out that DeepSeek really did work independently, and you really could have gotten o1-like performance with much less compute)

deepGem|1 year ago

From the R1 paper

In this study, we demonstrate that reasoning capabilities can be significantly improved through large-scale reinforcement learning (RL), even without using supervised fine-tuning (SFT) as a cold start. Furthermore, performance can be further enhanced with the inclusion of a small amount of cold-start data

Is this cold start data what OpenAI is claiming their output ? If so what's the big deal ?

manquer|1 year ago

> you had to do the really expensive o1 training in the first place

It is no better for OpenAI in this scenario either, any competitor can easily copy their expensive training without spending the same, i.e. there is a second mover advantage and no economic incentive to be the first one.

To put it another way, the $500 Billion Stargate investment will be worth just $5Billion once the models become available for consumption, because it only will take that much to replicate the same outcomes with new techniques even if the cold start needed o1 output for RL.

MrLeap|1 year ago

o1 wouldn't exist without the combined compute of every mind that led to the training data they used in the first place. How many h100 equivalents are the rolling continuum of all of human history?

vkou|1 year ago

If OpenAi had to account for the cost of producing all the copyrighted material they trained their LLM on, their system would be worth negative trillions of dollars.

Let's just assume that the cost of training can be externalized to other people for free.

hmottestad|1 year ago

At the pace that DeepSeek is developing we should expect them to surpass OpenAI in not that long.

The big question really is, are we doing it wrong, could we have created o1 for a fraction of the price. Will o4 cost less to train than o1 did?

The second question is naturally. If we create a smarter LLM, can we use it to create another LLM that is even smarter?

It would have been fantastic if DeepSeek could have come out with an o3 competitor before o3 even became publicly available. That way we would have known for sure that we’re doing it wrong. Cause then either we could have used o1 to train a better AI or we could have just trained in a smarter and cheaper way.

cherry_tree|1 year ago

> I think it would cast doubt on the narrative "you could have trained o1 with much less compute, and r1 is proof of that"

Whether or not you could have, you can now.

SpaceManNabs|1 year ago

My question is if deepseek r1 is just a distilled o1, i wonder if you can build a fine tuned r1 through distillation without having to fine tune o1.

zombiwoof|1 year ago

Exactly. They piggybacked of lots of compute and used less. There still is a total sum of a massive amount of compute

philipwhiuk|1 year ago

You mean to create an apple pie from scratch you first have to invent the universe?

rockemsockem|1 year ago

I think the prevailing narrative ATM is that DeepSeek's own innovation was done in isolation and they surpassed OpenAI. Even though in the paper they give a lot of credit to Llama for their techniques. The idea that they used o1's outputs for their distillation further shows that models like o1 are necessary.

All of this should have been clear anyway from the start, but that's the Internet for you.

joe_the_user|1 year ago

The idea that they used o1's outputs for their distillation further shows that models like o1 are necessary.

Hmm, I think the narrative of the rise of LLMs is that once the output of humans has been distilled by the model, the human isn't necessary.

As far as I know, DeepSeek adds only a little to the transformers model while o1/o3 added a special "reasoning component" - if DeepSeek is as good as o1/o3, even taking data from it, then it seems the reasoning component isn't needed.

aprilthird2021|1 year ago

> the prevailing narrative ATM is that DeepSeek's own innovation was done in isolation and they surpassed OpenAI

I did not think this, nor did I think this was what others assumed. The narrative, I thought, was that there is little point in paying OpenAI for LLM usage when a much cheaper, similar / better version can be made and used for a fraction of the cost (whether it's on the back of existing LLM research doesn't factor in)

hmmm-i-wonder|1 year ago

>shows that models like o1 are necessary.

But HOW they are necessary is the change. They went from building blocks to stepping stones. From a business standpoint that's very damaging to OAI and other players.

KingOfCoders|1 year ago

OpenAI couldn't do it, when the high cost of training and access to GPUs is their competitive advance against startups, they can't admit that it does not exist.

patcon|1 year ago

Are we it rediscovering the evolutionary benefit of progeny (from an information theoretic lens)?

And is this related to the lottery ticket hypothesis?

https://arxiv.org/pdf/1803.03635.pdf

herodoturtle|1 year ago

Thanks for the insightful comment.

I have a question (disclaimer: reinforcement learning noob here):

Is there a risk of broken telephone with this?

Kinda like repeatedly compressing an already compressed image eventually leads to a fuzzy blur.

If that is the case then I’m curious how this is monitored and / or mitigated.

ospray|1 year ago

They did do that themselves it's called o3.

RHSman2|1 year ago

When will over training happen on the melange of models at scale? And will AGI only ever be an extension of this concept?

That is where artificial intelligence is going. Copy things from other things. Will there be a AI Eureka moment where it deviates and knows where and why the reason it is wrong?

indymike|1 year ago

Bad things happen in tech when you don't do the disrupting yourself.

anothernewdude|1 year ago

If they're training R1 on o1 output on the benchmarks - then I don't trust those benchmarks results for R1. It means the model is liable to be brittle, and they need to prove otherwise.

dontreact|1 year ago

Is there any evidence R1 is better than O1?

It seems like if they in fact distilled then what we have found is that you can create a worse copy of the model for ~5m dollars in compute by training on its outputs.

iforgot22|1 year ago

"Then use R1 output to build a better X1" is the part I'm not sure about. Is X1 going to actually be better than R1?

qwertox|1 year ago

They're standing on the shoulders of giants, not only in terms of re-using expensive computing power almost for free by using the outputs of expensive models. It's a bit of a tradition in that country, also in manufacturing.

unreal37|1 year ago

I thought OpenAI GPT took Wikipedia and the content of every book as inputs to train their models?

Everyone is standing on the shoulders of giants.

bigfudge|1 year ago

How do you think manufacturing in the US got started? Everyone is on someone’s shoulders.

dartos|1 year ago

What does “better” really even mean here?

Better benchmark scores can be cooked

Sophira|1 year ago

Honestly, it's kind of silly that this technology is in the hands of companies whose only aim is to make money, IMO.

lenerdenator|1 year ago

Well, originally, OpenAI wasn't supposed to be that kind of organization.

But if you leave someone in the tech industry of SV/SF long enough, they'll start to get high on their own supply and think they're entitled to insane amounts of value, so...

goatlover|1 year ago

It's because they're the ones who could raise the money to make those models. Academics don't have access to that kind of compute. But the free models exist.

gmd63|1 year ago

Why not just copy and paste the model and change the name? That's an even more efficient form of distillation.

wgjordan|1 year ago

Even assuming the model was somehow publicly available in a form that could be directly copied, that would be a more blatant form of copyright infringement. Distillation launders copyrighted material in a way that OpenAI specifically has argued falls under fair use.