just-ok|1 year ago

It’s not better than o1. And given that OpenAI is on the verge of releasing o3, has some “o4” in the pipeline, and DeepSeek could only build this because of o1, I don’t think there’s as much competition as people seem to imply.

I’m excited to see models become open, but given the curve of progress we’ve seen, even being “a little” behind is a gap that grows exponentially every day.

crocowhile|1 year ago

When the price difference is so high and the performance so close, of course you have a major issue with competition. Not to mention the fact that this is fully open source.

Most importantly, this is a signal: OpenAI and Meta are trying to build a moat using massive hardware investments. DeepSeek took the opposite direction, and not only does it show that hardware is no moat, it basically makes a mockery of their multibillion-dollar claims. This is massive. If only investors had the brains it takes, we would pop this bubble already.

diego_sandoval|1 year ago

Why should the bubble pop when we just got the proof that these models can be much more efficient than we thought?

I mean, sure, no one is going to have a monopoly, and we're going to see a race to the bottom in prices, but on the other hand, the AI revolution is going to come much sooner than expected, and it's going to be in everyone's pocket this year. Isn't that a bullish signal for the economy?

riffraff|1 year ago

But it took the DeepSeek team a few weeks to replicate something at least close to o1.

If people can replicate 90% of your product in 6 weeks, you have competition.

chii|1 year ago

Not only did it take just a few weeks, but, more importantly, it was cheap.

The moat for these big models was always expected to be the capital expenditure for training, costing billions. It's why companies like OpenAI are spending massively on compute: it builds a bigger moat (or tries to, at least).

If it can be shown, as it seems to have been, that you can use smarts to make use of compute more efficiently and cheaply yet achieve similar (or even better) results, then the hardware moat buoyed by capital is no more.

I'm actually glad, though. An open-sourced version of these weights should ideally spur the kind of innovation that Stable Diffusion did when its weights were released.

nialv7|1 year ago

o1-preview was released Sep 12, 2024. So the DeepSeek team probably had a couple of months.

Mond_|1 year ago

> DeepSeek could only build this because of o1, I don’t think there’s as much competition as people seem to imply

And this is based on what exactly? OpenAI hides the reasoning steps, so training a model on o1 is very likely much more expensive (and much less useful) than just training it directly on a cheaper model.

karmasimida|1 year ago

Because literally before o1, no one was doing CoT-style test-time scaling. It's a new paradigm. The talking point back then was that LLMs had hit the wall.

R1's biggest contribution, IMO, is R1-Zero: I'm fully sold that they don't need o1's output to be this good. But yeah, o1 is still the herald.
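
For context, a minimal sketch of one common form of CoT-style test-time scaling, self-consistency: sample several reasoning chains and majority-vote over the final answers. The `generate` function is a hypothetical stand-in for any LLM sampling call, not a real API, and the `####` answer delimiter is just an assumed GSM8K-style convention:

    # Self-consistency: spend more compute at inference time by sampling
    # several reasoning chains and majority-voting on the final answers.
    from collections import Counter

    def generate(prompt: str, temperature: float) -> str:
        """Hypothetical LLM call; returns 'reasoning ... #### answer'."""
        raise NotImplementedError  # plug in a real model client here

    def extract_answer(completion: str) -> str:
        # Assumes each sampled chain ends with '#### <final answer>'.
        return completion.rsplit("####", 1)[-1].strip()

    def self_consistency(prompt: str, n_samples: int = 16) -> str:
        cot_prompt = prompt + "\nLet's think step by step."
        answers = [extract_answer(generate(cot_prompt, temperature=0.7))
                   for _ in range(n_samples)]
        # n_samples is the test-time scaling knob: more samples means more
        # inference compute and (typically) higher accuracy.
        return Counter(answers).most_common(1)[0][0]

The point is that accuracy becomes a function of inference-time compute (n_samples here), not just training compute; o1-style models internalize the same idea by generating one long hidden reasoning chain instead of voting over visible ones.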

acchow|1 year ago

> even being “a little” behind is a gap that grows exponentially every day

This theory has yet to be demonstrated. So far, it seems open source just stays consistently about 6-10 months behind.

resters|1 year ago

> It’s not better than o1.

I thought that too before I used it to do real work.

havkom|1 year ago

Yes. It shines with real problems.