(no title)
mxwsn|9 months ago
Gemini has already beaten it, but using a different and notably more helpful harness. The creator has said they think harness design is the most important factor right now, and that the results don't mean much for comparing Claude to Gemini.
throwaway314155|9 months ago
silvr|9 months ago
Claude got stuck reasoning its way through one of the more complex puzzle areas. Gemini took a while on it too, but made it through. I don't think that difference can be fully attributed to the harnesses.
Obviously, the best thing to do would be to run a side-by-side (SxS) of the two models in the same harness. Maybe that will happen?
samrus|9 months ago
11101010001100|9 months ago