(no title)
mmiyer
|
1 year ago
I guess it's because it has the highest score of all models in instruction following, 20 points higher then Opus, which compensates for shortcomings elsewhere (e.g. in language), and which wouldn't necessarily translate to human evaluation of usefulness.
simonw|1 year ago