Even with the axis starting at 30%, the MMLU graph is wrong. All four bars are drawn at the wrong height. Even their own 73.7% is not at the right height, and the Mixtral 71.4% bar sits below the 70% mark on the axis.
This is really the kind of marketing trick that makes me avoid a provider / publisher. I can't build trust this way.
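For anyone who wants to check the bar heights themselves, here is a rough sketch of the arithmetic, not a reconstruction of the actual chart. It assumes a linear axis starting at 30% and uses only the two scores quoted above; `bar_height` and the dictionary labels are names made up for illustration.

```python
def bar_height(value: float, axis_min: float) -> float:
    """Height of a bar, in axis units, on a linear axis that starts at axis_min."""
    if value < axis_min:
        raise ValueError("value lies below the start of the axis")
    return value - axis_min


AXIS_MIN = 30.0  # axis start mentioned in the comment (assumed linear)

# Only the two scores quoted in the comment; labels are illustrative.
scores = {"own model": 73.7, "Mixtral": 71.4}

# Height at which the 70% tick mark should sit.
tick_70 = bar_height(70.0, AXIS_MIN)

heights = {name: bar_height(value, AXIS_MIN) for name, value in scores.items()}
for name, height in heights.items():
    print(f"{name}: {height:.1f} axis units, above the 70% tick: {height > tick_70}")

# How much taller the 73.7% bar should be than the 71.4% bar.
ratio = heights["own model"] / heights["Mixtral"]
print(f"expected height ratio: {ratio:.3f}")
```

On a correctly drawn chart both bars end above the 70% tick, and the 73.7% bar is only about 6% taller than the 71.4% one, even with the truncated axis.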
tylermw|1 year ago
familiartime|1 year ago
I take issue with their choice of bar ordering - they placed the lowest-performing model directly next to theirs to make the gap as visible as possible, and shoved the second-best model (Grok-1) as far from theirs as possible. Seems intentional to me. The more marketing tricks you pile up in a dataviz, the less trust I place in your product for sure.
pandastronaut|1 year ago
radicality|1 year ago
occamrazor|1 year ago
nerpderp82|1 year ago
I can't find the exact section, but at the end of one of the videos at https://www.youtube.com/@aiexplained-official/videos he does a deep dive into the questions and answers in MMLU. There are so many typos, omissions, and errors in the questions and the answers that it should no longer be used.
This is it, with the correct time offset into the video: https://www.reddit.com/r/OpenAI/comments/18i02oe/mmlu_is_not...
The original, longer complaint against MMLU: https://www.youtube.com/watch?v=hVade_8H8mE
dskhudia|1 year ago
tartrate|1 year ago