ezyang | 10 months ago
Lmarena isn't that useful anymore lol

int_19h | 10 months ago
I actually agree with that, but it's generally better than other scores. Also, the quote is like a year old at this point.

In practice you have to evaluate the models yourself for any non-trivial task.