lend000 | 21 hours ago
I like this benchmark that pits models against one another in competitive environments, which seems like it can't really be gamed: https://gertlabs.com
Aurornis | 11 hours ago
That’s exactly what I said, though. The headline we’re commenting under claims they’re Sonnet 4.5 level but they’re not.
I don’t disagree that they’re powerful for open models. I’m pointing out that anyone reading these headlines who expects a cheap or local Sonnet 4.5 is going to discover that it’s not true.