(no title)
hkbuilds | 23 hours ago
One thing I've found: even with XML tags, you still need to validate and parse defensively. Models will occasionally nest tags wrong, omit closing tags, or hallucinate new tag names. Having a fallback parser that extracts content even from malformed XML has saved me more than once.
The real win is that XML tags give you a natural way to do few-shot prompting with structure. You can show the model exactly what shape the output should take, and it follows remarkably well.
docjay|18 hours ago
That’s the sign that your prompt isn’t aligned and you’ve introduced perplexity. If you look carefully at the responses you’ll usually be able to see the off-by-one errors before they’re apparent with full on hallucinations. It’ll be things like going from having quotes around filenames to not having them, or switching to single quote, or outputting literal “\n”, or “<br>”, etc. Those are your warning signs to stop before it runs a destructive command because of a “typo.”
My system prompt is just a list of 10 functions with no usage explanations or examples, 304 tokens total, and it’ll go all the way to the 200k limit and never get them wrong. That took ~1,000 iterations of name, position, punctuation, etc., for Opus 4.6 (~200 for Opus 4.5 until they nerfed it February 12th). Once you get it right though it’s truly a different experience.