top | item 43246509

(no title)

yompal | 1 year ago

LLMs do well with outcome-described tools and APIs are written as resource-based atomic actions. By describing an API as a collection of outcomes, LLMs don't need to re-reason each time an action needs to be taken.

Also, when an OpenAPI spec gets sufficiently big, you face a need-in-the-haystack problem https://arxiv.org/abs/2407.01437.

discuss

alooPotato|11 months ago

This was insightful. The re-reasoning part makes sense. So basically, MCP should be a dumbed down version of your API that accomplishes a few tasks really well. It ha to be a subset of what your API could do because if it wasn't, it either end up being just as generic as your API or the combinatorial explosion of possible use cases would be too large.

thomasfromcdnjs|1 year ago

Does anyone have any pro tips for large tool collections? (mine are getting fat)

Plan on doing a two layered system mentioned earlier, where the first layer of tool calls is as slim as they can be, then a second layer for more in depth tool documentation.

And/or chunking tools and creating embeddings and also using RAG.

yompal|1 year ago

Funnily enough, a search tool to solve this problem was our product going into YC. Now it’s a part of what we do with wild-card.ai and agents.json. I’d love to extend the tool search functionality for all the tools in your belt

It took us a decently long time to get the search quality good. Just a heads up in case you want to implement this yourself

ahamilton454|1 year ago

I can agree this is a huge problem with large APIs, we are doing it with twilios api and it’s rough

paradite|1 year ago

Thinking from the retrieval perspective, would it make sense to have two layers?

First layer just describes on high level, the tools available and what they do, and make the model pick or route the request (via system prompt, or small model).

Second layer implements the actual function calling or OpenAPI, which then would give the model more details on the params and structures of the request.