Kim_Bruning|3 days ago
I made a cursed CPU in the game 'Turing Complete', and had an older version of Claude build me an assembler for it.
Good luck finding THAT in the training data. :-P
(just to be sure, I then had it write actual programs in that new assembly language)
withinboredom|3 days ago
Claude 4.5: not overfitted too much -- does the right thing 6/10 times.
Claude 4.6: overfitted -- does the right thing 2/10 times.
OpenAI 5.3: overfitted -- does the right thing 3/10 times.
These aren't perfect benchmarks, but they let me know how much babysitting I need to do.
My point being that older Claude models weren't overfitted nearly as much, so I'm confirming what you're saying.
Kim_Bruning|3 days ago
At any rate, with an assembler, you end up with a lot of random letter-salad mnemonics with odd use cases, so those are very likely to tokenize in interesting ways at the very least.