Thinking from the retrieval perspective, would it make sense to have two layers?
The first layer just describes, at a high level, the tools available and what they do, and has the model pick or route the request (via the system prompt, or a small model).
The second layer implements the actual function calling or OpenAPI spec, which then gives the model more details on the parameters and structure of the request.
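The two-layer idea above could be sketched roughly like this. All the names here (`route`, `pick_tool`, the example tools) are made up for illustration; in practice the layer-1 picker would be a cheap model call, and layer 2 would feed the chosen schema into a real function-calling API.

```python
# Hypothetical sketch: layer 1 routes using only one-line tool summaries;
# layer 2 hands back the full parameter schema for the chosen tool.

TOOLS = {
    "get_weather": {
        "summary": "Look up the current weather for a city.",
        "schema": {  # full parameter spec, only shown to layer 2
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "units": {"type": "string", "enum": ["C", "F"]},
            },
            "required": ["city"],
        },
    },
    "search_docs": {
        "summary": "Full-text search over internal documentation.",
        "schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}

def route(request: str, pick_tool) -> dict:
    """Layer 1: choose a tool from short summaries, then attach its schema."""
    catalog = "\n".join(f"- {name}: {t['summary']}" for name, t in TOOLS.items())
    name = pick_tool(request, catalog)  # a small/cheap model call in practice
    # Layer 2: only now expose the detailed schema for argument filling.
    return {"tool": name, "schema": TOOLS[name]["schema"]}

# Stub picker standing in for the small routing model.
picked = route(
    "what's the weather in Oslo?",
    lambda req, cat: "get_weather" if "weather" in req else "search_docs",
)
print(picked["tool"])  # get_weather
```

The point of the split is that layer 1's prompt stays small (one line per tool), so the full schemas never compete for context until a tool has already been chosen.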
That approach does a lot better, but LLMs still have a positional bias problem baked into the transformer architecture (https://arxiv.org/html/2406.07791v1). This is where the LLM is biased toward selecting information that appears earlier in the prompt over information that appears later, which is unfortunate for tool-selection accuracy.
Since 2 steps are required anyway, you might as well use a dedicated semantic search for tools, like in agents.json.
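Replacing layer 1 with retrieval might look something like the toy sketch below. The bag-of-words cosine here is just a stand-in for a real embedding model, and the tool names/descriptions are invented; the shape of the idea is: embed tool descriptions once, embed the request, return the top-k nearest tools, and only those get full schemas in the prompt.

```python
# Toy semantic tool retrieval: index tool descriptions, rank by cosine
# similarity to the request. A real setup would use a proper embedding
# model instead of word-count vectors.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in "embedding": lowercase word counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

TOOL_DESCRIPTIONS = {
    "get_weather": "look up the current weather forecast for a city",
    "search_docs": "full text search over internal documentation",
    "send_email": "send an email message to a recipient",
}

# Built once, offline.
INDEX = {name: embed(desc) for name, desc in TOOL_DESCRIPTIONS.items()}

def top_tools(request: str, k: int = 2) -> list[str]:
    q = embed(request)
    ranked = sorted(INDEX, key=lambda n: cosine(q, INDEX[n]), reverse=True)
    return ranked[:k]

print(top_tools("what is the weather forecast in Oslo"))
```

This also sidesteps the positional-bias issue: the retriever ranks tools by similarity rather than by where they happen to sit in a long prompt.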
paradite|1 year ago
yompal|1 year ago