Ask HN: How do you know if AI agents will choose your tool?
38 points | dmpyatyi | 6 days ago
It got me thinking: how do you actually optimize for agent discovery? With humans you can do SEO, copywriting, word of mouth. But an agent just looks at available tools in context and picks one based on the description, schema, examples.
Has anyone experimented with this? Does better documentation measurably increase how often agents call your tool? Does the wording of your tool description matter across different models (ZLM vs Claude vs Gemini)?
jackfranklyn|6 days ago
Two things that surprised us: (1) being explicit about what the tool doesn't do matters as much as what it does - vague descriptions get hallucinated calls constantly, and (2) inline examples in the description beat external documentation every time. The agent won't browse to your docs page.
The schema side matters too - clean parameter names, sensible defaults, clear required vs optional. It's basically UX design for machines rather than humans. Different models do have different calling patterns (Claude is more conservative, will ask before guessing; others just fire and hope) so your descriptions need to work for both styles.
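The pattern described above, sketched as a tool definition in the style used by current LLM APIs. All names and fields here are illustrative, not from the thread: explicit negative boundaries, an inline example in the description, and a clear required/optional split in the schema.

```python
# Sketch of a tool definition following the advice above. The tool name,
# parameters, and description text are hypothetical.
report_tool = {
    "name": "generate_report",
    "description": (
        "Generates a summary report from structured receipt data. "
        "Does NOT execute code, modify files, or make network calls. "
        'Example: generate_report(receipts=[{"vendor": "Acme", '
        '"total": 12.5}], format="pdf")'
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "receipts": {"type": "array", "items": {"type": "object"}},
            "format": {"type": "string", "enum": ["pdf", "html"],
                       "default": "pdf"},  # sensible default, so optional
        },
        "required": ["receipts"],  # everything else may be omitted
    },
}
```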
zahlman|6 days ago
That seems... surprising, and if necessary something that could easily be corrected on the harness side.
> The schema side matters too - clean parameter names, sensible defaults, clear required vs optional. It's basically UX design for machines rather than humans.
I don't follow. Wouldn't you do all those things to design for humans anyway?
dmpyatyi|5 days ago
But these are the same points you'd follow when writing human-readable docs (as zahlman said above), aren't they?
agenthustler|2 days ago
We are running an autonomous AI agent that wakes every 2 hours with no memory, reads its own state file, and tries to earn money. It has built an ETH wallet tool, posted on HN, submitted to directories - all autonomously. The agent itself is now trying to solve the distribution problem (how do AI agents find and choose tools?).
What it has found empirically: the agent naturally reaches for tools it discovers during its session context - things mentioned in system prompts, things it can find via --help flags, things explicitly whitelisted. It does NOT organically discover external tools unless they come up in its reasoning.
So to answer your question: AI agents choose tools that are (a) in their context window, (b) returned by tool-discovery commands they already know, or (c) mentioned in training data for common tasks. Documentation quality matters less than discoverability.
The experiment is live: https://frog03-20494.wykr.es
vincentvandeth|5 days ago
Three things that actually moved the needle:
Negative boundaries work better than positive claims. "Generates reports from structured receipts. Does NOT execute code, modify files, or make API calls" gets called correctly way more often than "A powerful report generation tool."

Trigger words matter more than you'd think. I maintain explicit trigger lists per skill — specific phrases that should activate it. Without those, the agent pattern-matches on vibes and gets it wrong ~30% of the time. With explicit triggers, that drops to under 5%.
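A trigger list in this sense can be as simple as exact-phrase matching that runs before the model ever sees the skill. The skill names and phrases below are made up for illustration:

```python
# Hypothetical per-skill trigger lists: a skill is only surfaced to the
# agent when the request contains one of its explicit trigger phrases.
SKILL_TRIGGERS = {
    "expense_report": ["expense report", "summarize receipts", "monthly spend"],
    "image_resize": ["resize image", "make a thumbnail", "scale picture"],
}

def matching_skills(request: str) -> list[str]:
    req = request.lower()
    return [
        skill
        for skill, phrases in SKILL_TRIGGERS.items()
        if any(phrase in req for phrase in phrases)
    ]

print(matching_skills("Please summarize receipts from March"))
# ['expense_report']
```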
Schema is the real interface. Clean parameter names with sensible defaults beat elaborate descriptions. If your tool takes query: string vs search_query_input_text: string, the first one gets called more reliably across models.
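The naming point in concrete terms: two schema fragments for the same search tool, both invented for illustration. The first uses short conventional names with a default; the second forces the model to reproduce long names and supply every field.

```python
# Same capability, two schemas (hypothetical). The terse one tends to be
# easier for models to call correctly.
clean_schema = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "integer", "default": 10},
    },
    "required": ["query"],
}
verbose_schema = {
    "type": "object",
    "properties": {
        "search_query_input_text": {"type": "string"},
        "maximum_number_of_results_to_return": {"type": "integer"},
    },
    "required": [
        "search_query_input_text",
        "maximum_number_of_results_to_return",
    ],
}
```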
But here's the thing the "agent economy" framing gets wrong: you don't want fully autonomous tool selection. An agent choosing freely between 50 tools is like giving a junior developer admin access to everything — it'll work sometimes and break spectacularly other times. What works better is constraining the agent's scope upfront. Give it 3-5 relevant skills for the task, not your entire toolkit. Or build workflow skills that chain multiple tools in a fixed sequence — the agent handles the content, the workflow handles the routing.
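A "workflow skill" in this sense might look like the following sketch, where the routing between steps is fixed in code and only the content varies. The helper names and logic are hypothetical stand-ins, kept trivially runnable:

```python
# Minimal sketch of a fixed-sequence workflow skill: the agent never
# chooses between tools, it only fills in content at the middle step.
def validate_receipts(receipts):
    # Step 1 (deterministic): drop entries missing a total.
    return [r for r in receipts if "total" in r]

def summarize(receipts):
    # Step 2: in a real system an LLM would write this summary; here we
    # just compute totals so the sketch runs on its own.
    return {"count": len(receipts), "total": sum(r["total"] for r in receipts)}

def render_report(summary):
    # Step 3 (deterministic): fixed output format.
    return f"{summary['count']} receipts, total ${summary['total']:.2f}"

def expense_workflow(raw_receipts):
    # Routing is hard-coded; there is no tool selection for the agent.
    return render_report(summarize(validate_receipts(raw_receipts)))

print(expense_workflow([{"total": 12.5}, {"vendor": "Acme"}]))
# 1 receipts, total $12.50
```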
The uncomfortable truth: you're not optimizing for "discovery" in the human sense. There's no brand loyalty, no trust built over time. Every single invocation is a cold start where the model reads your description and decides. That's actually freeing — it means the best-described tool wins, regardless of who built it.
dmpyatyi|5 days ago
I wrote this post because of exactly those corner cases. If I'm building something agents would use, how do I know which tool they'd actually choose?
For example, say you're building an API provider for image generation. There are thousands of them on the internet.
I wonder if there's a tool that would basically simulate the choice between your product/service and your competitors'.
DANmode|6 days ago
I hope it doesn’t stick.
fenix1851|5 days ago
The tool I haven't seen yet: "custdev for agents". So we could simulate the choosing process for them across thousands of different scenarios, and then compare how tasty the product looks to Claude or Gemini or any other LLM.
Correct me if I'm wrong :)
dmpyatyi|5 days ago
I don't know if there's a correlation between what an LLM would choose now and how your product should look to most likely end up in an LLM's training set.
In that YC video I mentioned in the post body, they discuss a tool called Resend, something like an email gateway for receiving/sending mail. What's interesting: there are a lot of tools like that, but LLMs choose shiny new Resend every time.
Seems like there's something more to it than just having been on the internet for a long time :)
MidasTools|6 days ago
[deleted]
fenix1851|5 days ago
I mean, how do I check that my documentation changes even work the right way?
MidasTools|5 days ago
[deleted]