(no title)
nopinsight|4 months ago
https://github.com/vectara/hallucination-leaderboard
If the figures on this leaderboard are to be trusted, many frontier and near-frontier models already hallucinate less than the median white-collar worker.
Note: The leaderboard doesn't cover tool calling, to be clear.
whatever1|4 months ago
So the min, max, and median are all at 0.
nopinsight|4 months ago
Note that people who write academic papers are quite far from the median white-collar worker.