
galsapir | 19 days ago

I've been writing about how I use AI tools as a researcher working in health AI, specifically the tension between leveraging them and staying engaged enough to catch when they're wrong. This post is about one version of that problem: the models have gotten good enough that my default is to trust the output, and the threshold for "worth checking" keeps drifting upward.

So I built a simple Claude Code skill that sends high-stakes work to a different model family for a second opinion: one call, not a multi-agent debate. The honest result: the first real test (reviewing an architecture spec) scored maybe 6/10. It caught one genuine security finding but missed the deeper domain questions entirely.

That gap maps onto something I keep running into in evals: tools can check structural form (missing error handling, security anti-patterns) but struggle with essence (does this actually work the way the spec assumes? are the clinical guardrails robust?). Still, it's worth it as a lightweight intervention against the drift toward not checking at all. The skill is open source if anyone wants to try it or improve it.
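The post doesn't include code, but the one-call cross-family pattern it describes can be sketched roughly like this. This is a minimal illustration, not the actual skill: it assumes an OpenAI-compatible chat-completions endpoint, and the prompt text, function names, and model name are all placeholders.

```python
import json
import urllib.request

# Illustrative review prompt; the real skill's prompt is in its repo.
REVIEW_PROMPT = (
    "You are a second reviewer from a different model family than the "
    "author. Critique the document below for correctness, security "
    "issues, and unstated assumptions. Be specific.\n\n{doc}"
)

def build_review_payload(doc: str, model: str = "gpt-4o") -> dict:
    """Build a one-shot request body (OpenAI-style chat format assumed)."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": REVIEW_PROMPT.format(doc=doc)},
        ],
    }

def second_opinion(doc: str, api_url: str, api_key: str) -> str:
    """One call out to the other model family; no multi-agent debate."""
    req = urllib.request.Request(
        api_url,
        data=json.dumps(build_review_payload(doc)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The design choice worth noting is the single round trip: the second model sees only the artifact, not the first model's reasoning, which keeps the critique independent and cheap.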

