pants2|9 days ago
Strange that you say that, because the general consensus (and my experience) seems to be the opposite. The AA-Omniscience Hallucination Rate Benchmark also puts 3.0 Pro among the higher-hallucinating models. 3.1 seems to be a noticeable improvement, though.
maxwellcoffee|9 days ago
Gemini 3.1 takes the top spot, followed by 3.0 and then Opus 4.6 max.
holbrad|9 days ago
Gemini 3.0 gets a very high score because it's very often correct, but it does not have a low hallucination rate.
https://artificialanalysis.ai/#aa-omniscience-hallucination-...
It looks like 3.1 is a big improvement in this regard; it hallucinates a lot less.
fnord123|9 days ago
As the sibling comment says, the AA-Omniscience Hallucination Rate Benchmark ranks Gemini 3.0 as the best performer aside from the Gemini 3.1 preview.
https://artificialanalysis.ai/evaluations/omniscience
holbrad|9 days ago
https://artificialanalysis.ai/#aa-omniscience-hallucination-...
If you look at the results, 3.0 hallucinates an awful lot when it's wrong.
It's just not wrong that often.
(And it looks like 3.1 does better on both fronts)
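To make the distinction concrete, here's a rough sketch of how a model can score high on accuracy and still have a high hallucination rate. It assumes the hallucination rate is the share of non-correct responses that are wrong answers rather than abstentions; that's my reading of the benchmark, not an official formula, and it ignores partial answers. The two models and all the counts are made up for illustration and are not benchmark results.

```python
def accuracy(correct, incorrect, abstained):
    """Fraction of all questions answered correctly."""
    total = correct + incorrect + abstained
    return correct / total


def hallucination_rate(incorrect, abstained):
    """Of the questions NOT answered correctly, the fraction where the
    model gave a wrong answer instead of abstaining (assumed definition)."""
    not_correct = incorrect + abstained
    return incorrect / not_correct if not_correct else 0.0


# Hypothetical counts out of 100 questions each (not real benchmark data).
model_a = {"correct": 55, "incorrect": 40, "abstained": 5}   # knows a lot, rarely abstains
model_b = {"correct": 40, "incorrect": 15, "abstained": 45}  # knows less, abstains often

for name, m in (("A", model_a), ("B", model_b)):
    acc = accuracy(m["correct"], m["incorrect"], m["abstained"])
    hall = hallucination_rate(m["incorrect"], m["abstained"])
    print(f"model {name}: accuracy={acc:.2f} hallucination_rate={hall:.2f}")
# model A: accuracy=0.55 hallucination_rate=0.89
# model B: accuracy=0.40 hallucination_rate=0.25
```

Under these assumptions, model A is right more often overall, but when it doesn't know the answer it almost always guesses, which is why both "very often correct" and "hallucinates a lot" can be true of the same model.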
tempestn|9 days ago