coder543 | 9 days ago
Artificial Analysis isn't perfect, but it is an independent third party that actually runs the benchmarks itself, and it uses a wide range of them. It is a better automated litmus test than any other I've been able to find in years of watching the development of LLMs.
And the gap has been rapidly shrinking: https://www.youtube.com/watch?v=0NBILspM4c4&t=642s
zozbot234 | 9 days ago
lancebeet | 9 days ago
coder543 | 9 days ago
As I said, I have been following this stuff closely for many years now. My opinion is not informed just by looking at a single chart, but by a lot of experience. The chart is less fishy than blanket statements about the closed models somehow being way better than the benchmarks show.