CephalopodMD | 3 months ago

What I'm getting from this thread is that people have their own private benchmarks. It's almost a cottage industry. Maybe someone should crowdsource those benchmarks, keep the test contents completely secret, and build a new public benchmark out of people's private AGI tests. All they should release for a given model is the final average score.
