(no title)
gabiteodoru | 4 months ago
While setting this up, I realized I hadn't used chat templates in my original measurements (rookie mistake with an Instruct model!). Re-running with proper methodology completely flips the results - the terse version actually wins.
I'll add a correction note to the article once AWS/Medium comes back online, and will write a proper follow-up with all the corrected experiments. Your comment literally made the research better - thank you!
No comments yet.