
ozgune | 8 months ago

I think that's unlikely.

DeepSeek-R1 0528 performs almost as well as o3 on AI quality benchmarks. So either OpenAI didn't restrict access, DeepSeek wasn't using OpenAI's output, or using OpenAI's output doesn't have a material impact on DeepSeek's performance.

https://artificialanalysis.ai/?models=gpt-4-1%2Co4-mini%2Co3...

astar1 | 8 months ago

Almost as well as o3? Kind of like Gemini 2.5? I dug deeper, and surprise surprise: https://techcrunch.com/2025/06/03/deepseek-may-have-used-goo...

I am not at all surprised; the CCP views the AI race as absolutely critical to its own survival...

orbital-decay | 8 months ago

Not everything that's written is worth reading, let alone drawing conclusions from. That benchmark produces a different tree each time the author runs it, which should tell you something about its reliability. It also groups grok-3-beta with gpt-4.5-preview in the GPT family, making the former appear to be trained on the latter, which doesn't make sense if you check the release dates. And it previously classified gpt-4.5-preview in a completely different branch from 4o (which does make some sense, but now the result is different).

EQBench, another "slop benchmark" from the same author, is equally dubious, as is most of his work, e.g. the antislop sampler, which tries to solve an NLP task programmatically.

Art9681 | 8 months ago

The benchmarks are not reflective of real-world use cases. This is why OpenAI dominates B2B. As a business, it's in your best interest to save money without sacrificing quality.

"Follow the money."

Businesses are pouring money into the OpenAI API. That's your biggest clue.