top | item 45646795

(no title)

Thanks so much for this challenge! I just ran the experiment with i = 1 + i and you're absolutely right - it breaks my theoretical framework (same semantic information, but much higher perplexity).

While setting this up, I realized I hadn't used chat templates in my original measurements (rookie mistake with an Instruct model!). Re-running with proper methodology completely flips the results - the terse version actually wins.

I'll add a correction note to the article once AWS/Medium comes back online, and will write a proper follow-up with all the corrected experiments. Your comment literally made the research better - thank you!

discuss

No comments yet.