top | item 47135056

bakugo | 7 days ago

The article claims that every Claude model other than Opus 4.6 reliably fails. This is not true: Sonnet 3.5 answers correctly around half of the time, even though it's such an old model that it's no longer available on the main API.
