farkin88 | 7 months ago
What’s still missing is structured error semantics. Right now, when a tool explodes, the LLM only sees a raw stack trace, so it can’t decide whether to retry or bail. Today I saw another Show HN post on MCP (https://news.ycombinator.com/item?id=44778968) about a package that converts exceptions into a small JSON envelope (error, details, isRetryable, etc.) that agents can interpret. Dropping that into mcp-use’s client loop would give you observability plus smarter retries almost for free.
I might be missing some important nuances here between client side and server side MCP error handling, which I’ll leave to the authors to discuss. Just thought there might be a good open source collab opportunity here.
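To make the envelope idea concrete, here is a minimal sketch in Python. It is not the linked package's API or mcp-use's client loop: `to_envelope`, `call_with_retries`, and the `TRANSIENT` tuple are hypothetical names, and the rule for what counts as retryable is an assumption made for illustration. Only the field names (error, details, isRetryable) come from the description above.

```python
import time

# Assumption: treating these built-in exception types as transient is a
# choice made for this sketch, not a rule taken from any MCP library.
TRANSIENT = (TimeoutError, ConnectionError)


def to_envelope(exc):
    """Convert an exception into a small JSON-serializable envelope."""
    return {
        "error": type(exc).__name__,
        "details": str(exc),
        "isRetryable": isinstance(exc, TRANSIENT),
    }


def call_with_retries(tool, max_attempts=3, delay=0.0):
    """Call a zero-argument tool (hypothetical); retry only if the envelope allows it.

    Assumes max_attempts >= 1.
    """
    envelope = None
    for attempt in range(1, max_attempts + 1):
        try:
            return {"ok": True, "result": tool(), "attempts": attempt}
        except Exception as exc:
            envelope = to_envelope(exc)
            # Stop on non-retryable errors or when attempts are exhausted.
            if not envelope["isRetryable"] or attempt == max_attempts:
                break
            time.sleep(delay)
    # The caller (or the LLM) sees structured fields instead of a stack trace.
    return {"ok": False, "attempts": attempt, **envelope}
```

With this shape, a tool that times out is retried up to `max_attempts` times, while one that raises `ValueError` fails after a single attempt.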
MutedEstate45|7 months ago
The interesting part is dual error representation: devs need full stack traces for debugging, but LLMs need clean structured metadata for retry decisions. Plus you want error telemetry flowing to observability systems.
Question for the mcp-use authors: How do you handle error propagation in multi-server setups? If Server A times out but Server B is healthy, does the agent know which server failed and whether to retry or switch servers? And are you capturing error patterns to identify flaky servers or consistently failing tool calls? The retry orchestration across multiple MCP servers seems like a really interesting problem.
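A rough sketch of the dual representation, assuming Python and the standard `logging` module: the full trace goes to a developer-facing log, and the LLM gets only clean metadata that also names the failing server. `report_failure`, the logger name, and the `server`/`tool` fields are hypothetical, not part of mcp-use.

```python
import logging
import traceback

logger = logging.getLogger("mcp_errors")  # hypothetical logger name


def report_failure(server_name, tool_name, exc):
    """Log the full trace for developers; return clean metadata for the LLM."""
    # Developer-facing: the complete stack trace goes to the log, where an
    # observability pipeline could also pick it up.
    trace = "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__)
    )
    logger.error("tool %s on server %s failed\n%s", tool_name, server_name, trace)

    # LLM-facing: no trace, just which server and tool failed and why.
    return {
        "server": server_name,
        "tool": tool_name,
        "error": type(exc).__name__,
        "details": str(exc),
    }
```

Including the server name in the LLM-facing part is what would let an agent tell "Server A timed out" apart from "Server B is healthy".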
pzullo|7 months ago
Regarding the multi-server failure case: without the server manager, if one of the servers dies today, the agent just keeps going. I do not think this was a particularly thought-through decision; the client should probably error out or let the agent know that the server is dead. With the server manager, the agent will try to connect to the dead server, get an error, and possibly retry, but if the server stays unreachable, the agent will eventually bail. It is indeed an interesting problem. How do you see the responsibility split here?
Regarding the flakiness: that is the ultimate dream, but it requires some more work. I think the client is in a privileged position to monitor this, and we will do it for sure. I think this is going to be great feedback for companies building servers. Happy to coordinate on ideas for how to do this best.
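One way the client-side monitoring could start is with per-server counters. This is a sketch under stated assumptions, not something mcp-use does today: the `ServerHealth` class, its method names, and the default thresholds are all hypothetical.

```python
from collections import defaultdict


class ServerHealth:
    """Hypothetical per-server call/failure counters kept by the client."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.failures = defaultdict(int)

    def record(self, server, ok):
        """Record one tool call against `server` and whether it succeeded."""
        self.calls[server] += 1
        if not ok:
            self.failures[server] += 1

    def failure_rate(self, server):
        """Fraction of recorded calls that failed (0.0 if none recorded)."""
        total = self.calls.get(server, 0)
        return self.failures.get(server, 0) / total if total else 0.0

    def flaky_servers(self, threshold=0.5, min_calls=3):
        """Servers with enough calls whose failure rate meets the threshold."""
        # Assumption: both defaults are arbitrary values picked for the sketch.
        return sorted(
            s for s, n in self.calls.items()
            if n >= min_calls and self.failure_rate(s) >= threshold
        )
```

The client would call `record(server, ok)` after every tool call; the resulting rates are the kind of feedback that could be shared with server authors.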
pzullo|7 months ago
The catch with the package above is that it is server-side, and I think that is where it belongs. The server is the one that knows (and has the responsibility to know) whether a tool call is retryable, and it is the one that has information about the error, and rightly so.
I can see, though, how some sort of improved error formatting could be introduced in the client as well. It shouldn't contain any logic about the error, though; it should just format the error in the way that LLMs understand best.
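A client-side formatter of that kind might look like the sketch below. It assumes the server already sent an envelope with the fields mentioned earlier in the thread (error, details, isRetryable); the function name and output layout are hypothetical. It only renders what it receives and makes no retry decision of its own.

```python
def format_error_for_llm(server_name, envelope):
    """Render a server-provided error envelope as plain text for the LLM.

    Pure formatting: the values are passed through untouched, and fields
    the server did not send are simply omitted.
    """
    lines = [f"Tool call on server '{server_name}' failed."]
    for key in ("error", "details", "isRetryable"):
        if key in envelope:
            lines.append(f"{key}: {envelope[key]}")
    return "\n".join(lines)
```

Because the function never inspects the exception type or sets `isRetryable` itself, the server remains the single source of truth about the error.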