top | item 46819854

onnimonni | 1 month ago

Would someone know if their eval tests are open source and where I could find them? Seems useful for iterating on Claude Code behaviour.

JamesSwift|1 month ago

I was also looking for specific info on the evals, because I wanted to see whether they separately confirmed that shoving the skills into the main context didn't degrade the non-skills evals. That's the other benefit of skills, besides the ability to do the thing: they don't pollute the main context window with unnecessary information.