top | item 47134154

(no title)

siva7 | 5 days ago

This is probably the greatest one-time AI "benchmark" ever made. The foundation companies have been gaming traditional benchmarks for years, so no one can really map those numbers onto real-world experience. The car wash test, on the other hand, tells me what kind of intelligence I can expect.

XCSme|5 days ago

I also don't trust the maxbenched results.

I am thus making my own benchmarks: https://aibenchy.com

Otterly99|4 days ago

Maybe I am missing something obvious on the website, but where is the documentation? Where do you explain what each number means, or at least give a short overview of what the models are being tested on?

andai|5 days ago

In your benchmark, GPT 5 Nano is basically tied with Opus?

vasco|5 days ago

For me it's interesting because no normal person I know would ever inject "because it's better for the environment" into anything so small-scale, so not only does it show they suck, it shows how easy it is to inject side-ideology into simple exchanges.

3rodents|5 days ago

You don’t know enough people, then. There are a lot of environmentally conscious people who would absolutely first think “because it is close we should walk” and then follow up with the logical conclusion that you can’t walk to wash your car. Many people communicate by sharing their thinking process, I can think of many people who would share their ideology as it pertains to a question like this. A pragmatic environmentalist (hopefully that is all of them) would know that their ideology isn’t consequential but could certainly mention it. After all, you may need to drive your car to the car wash to wash it, but do you need to wash it? Are the chemicals used by the car wash harmful? Are there better ways to keep a car maintained?

xyproto|5 days ago

Referring to "the normal people you know" is purely anecdotal evidence and can't be used to infer anything at all about "side-ideology". Perhaps you only know people who don't care about the environment?