igorzij | 1 year ago

Why is Claude 3.5 Sonnet missing from the benchmark? Even if the real reason is different and completely legitimate, or perhaps purely random, it comes across as "Claude does better than our new model, so we omitted it because we wanted the tallest bars on the chart to be ours". And as soon as readers think that, they may start to question everything else in your work, which is genuinely awesome!

faangguyindia | 1 year ago

It's damn slow and overkill for such a task.