Can someone share a YouTube video comparing DeepSeek with other models? I glanced through the comments and I'm seeing lots of opinions, but no (easy) evidence. I would like to see a level of thoroughness that I could not manage myself. I'm not naysaying one model over another; I just want good old-fashioned elbow grease and the scientific method, explained for the layperson. I appreciate the help.

shihab|1 year ago

Here [1] is the leaderboard from Chatbot Arena, where users vote on the outputs of two anonymous models. DeepSeek R1 needs more data points, but it has already climbed to No. 1 in the Style Control ranking, which is pretty impressive.

Link [2] has the results on more standard LLM benchmarks. They conveniently placed the results on the first page of the paper.

[1] https://lmarena.ai/?leaderboard

[2] https://arxiv.org/pdf/2501.12948 (PDF)