sxg|12 days ago
How can you determine whether it's as good as Opus 4.5 within minutes of release? The quantitative metrics don't seem to mean much anymore. Noticing qualitative differences seems like it would take dozens of conversations and perhaps days to weeks of use before you can reliably determine the model's quality.
johntarter|12 days ago