

holbrad | 9 days ago

You are misreading the benchmark.

https://artificialanalysis.ai/#aa-omniscience-hallucination-...

If you look at the results, 3.0 hallucinates an awful lot when it's wrong.

It's just not wrong that often.

(And it looks like 3.1 does better on both fronts.)
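
To make the distinction concrete, here is a rough sketch in Python. It assumes the hallucination rate is computed as incorrect answers divided by all non-correct responses (incorrect plus abstained), which is my reading of the benchmark's definition, not something confirmed here. The function name and the counts for `model_a` and `model_b` are made up for illustration and are not benchmark results.

```python
def rates(correct, incorrect, abstained):
    """Return (accuracy, hallucination_rate, wrong_share) for one model.

    Assumed definition: hallucination_rate = incorrect / (incorrect + abstained),
    i.e. how often the model answers wrongly instead of declining, given that
    it did not get the question right.
    """
    total = correct + incorrect + abstained
    not_correct = incorrect + abstained
    accuracy = correct / total
    hallucination_rate = incorrect / not_correct if not_correct else 0.0
    wrong_share = incorrect / total  # share of ALL questions answered wrongly
    return accuracy, hallucination_rate, wrong_share


# Hypothetical counts out of 100 questions each (illustrative only).
model_a = rates(correct=90, incorrect=9, abstained=1)    # rarely wrong, rarely abstains
model_b = rates(correct=50, incorrect=20, abstained=30)  # often wrong, often abstains

for name, (acc, hall, wrong) in (("A", model_a), ("B", model_b)):
    print(f"model {name}: accuracy={acc:.2f} "
          f"hallucination_rate={hall:.2f} wrong_share={wrong:.2f}")
```

With these made-up counts, model A has the higher hallucination rate but still gives fewer wrong answers overall, because it has far fewer misses to begin with. That is the "hallucinates a lot when it's wrong, but isn't wrong that often" pattern.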
