ofermend | 3 months ago

Can't wait to try Opus 4.5

We just evaluated it on Vectara's grounded hallucination leaderboard: it scores a 10.9% hallucination rate, better than Gemini-3, GPT-5.1-high, or Grok-4.

https://github.com/vectara/hallucination-leaderboard
