top | item 47109958

(no title)

vntok | 7 days ago

Reproducing experimental results across models and vendors is trivial and cheap nowadays.

BoredPositron | 7 days ago

Not if Anthropic goes further in obfuscating the output of Claude Code.

vntok | 7 days ago

Why would you test implementation details? Test what's delivered, not how it's delivered. The thinking portion, synthesized or not, is merely an implementation detail.

The resulting artefact, that's what is worth testing.