ondrsh|11 months ago

It's much simpler: MCP allows tools to be added at runtime instead of design-time. That's it. And because this can happen at runtime, the user (NOT the developer) can add arbitrary functionality to the LLM application (while the application is running — hence, runtime). One could make the argument that LLM applications with MCP support are conceptually similar to browsers — both let users connect to arbitrary MCP/HTTP servers at runtime.
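To make the design-time vs. runtime distinction concrete, here is a minimal sketch. It is not MCP itself, and `ToolRegistry` and the tool names are hypothetical, invented only for illustration. The point is that the set of callable tools is ordinary mutable state that can grow while the application is running.

```python
from typing import Callable, Dict, List


class ToolRegistry:
    """Hypothetical holder for the tools an LLM application can currently call."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def add(self, name: str, fn: Callable[..., str]) -> None:
        # Can be called at any time, including while the app is running.
        self._tools[name] = fn

    def names(self) -> List[str]:
        return sorted(self._tools)

    def call(self, name: str, **kwargs: str) -> str:
        return self._tools[name](**kwargs)


registry = ToolRegistry()

# Design-time: the developer ships the application with a fixed tool.
registry.add("echo", lambda text: text)

# Runtime: the user connects something the developer never knew about.
registry.add("shout", lambda text: text.upper() + "!")
```

With design-time tools only, the first `add` call would be the end of the story. The second call stands in for what a user-added server contributes later.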

But the comparison with HTTP is not a very good one, because MCP is stateful and complex. MCP is actually much more similar to FTP than it is to HTTP.

I wrote 2 short blog posts about this in case anyone is curious: https://www.ondr.sh/blog/thoughts-on-mcp

imtringued|11 months ago

The spec and server docs also contain a helpful explanation:

https://spec.modelcontextprotocol.io/specification/2024-11-0...

https://modelcontextprotocol.io/sdk/java/mcp-server

Also, btw, how long until people rediscover HATEOAS, something which inherently relies on a generalised artificial intelligence to be useful in the first place?

ondrsh|11 months ago

Exactly. An AI-web based on the principles of HATEOAS is the next step, where instead of links, we would have function calls.

As you said, HATEOAS requires a generic client that can understand anything at runtime — a client with general intelligence. Until recently, humans were the only ones fulfilling that requirement. And because we suck at reading JSON, HATEOAS had to use HTML. Now that we have strong AI, we can drop the Hypermedia from 'H'ATEOAS and use JSON instead.

I wrote about that exact thing in Part 2: https://www.ondr.sh/blog/ai-web

phillipcarter|11 months ago

Yeah, maybe it's because I spent too much time working on another open standard (OTel), but this seems pretty obvious (and much simpler -- for now).

MCP standardizes how LLMs can call tools at runtime, and how tools can call LLMs at runtime. It's great!

ImPostingOnHN|11 months ago

It sounds like pushing the logic of API calling into one of the many "MCP servers", with the user still needing to go through the manual step of creating accounts on third-party services, generating a bunch of different tokens, and dealing with them all.

In essence it seems like an additional shim that removes all the security of API tokens while still leaving the user to deal with them.

Side note, has Tron taught us nothing about avoiding AI MCPs?

PeterBrink|11 months ago

Hey ondrsh, I read your blog post and thought it was very interesting. However, I did have a follow-up question:

In your post you say "The key insight is: Because this can happen at runtime, the user (NOT the developer) can add arbitrary functionality to the application (while the application is running — hence, runtime). And because this also works remotely, it could finally enable standardized b2ai software!"

That makes sense, but my question is: how would the user actually do that? As far as I understand, they would have to somehow pass in either a script to spin up their own server locally (unlikely for your everyday user), or a url to access some live MCP server. This means that the host they are using needs an input on the frontend specifically for this, where the user can input a url for the service they want their LLM to be able to talk to. This then gets passed to the client, the client calls the server, the server returns the list of available tools, and the client passes those tools to the LLM to be used.

This is very cool and all, but it just seems like anyone who has minimal tech skills would not have the patience to go and find the MCP server URL of their favourite app and then paste it into their chatbot or whatever they're using.

Let me know if I have misunderstood anything, and thanks in advance!

ondrsh|11 months ago

Your understanding is on point.

> As far as I understand, they would have to somehow pass in either a script to spin up their own server locally (unlikely for your everyday user), or a url to access some live MCP server. This means that the host they are using needs an input on the frontend specifically for this, where the user can input a url for the service they want their LLM to be able to talk to. This then gets passed to the client, the client calls the server, the server returns the list of available tools, and the client passes those tools to the LLM to be used.

This is precisely how it would work. Currently, I'm not sure how many host applications (if any) actually feature a URL input field to add remote servers, since most servers are local-only for now. This situation might change once authentication is introduced in the next protocol version. However, as you pointed out, even if such a URL field existed, the discovery problem remains.
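The flow quoted above (user supplies a URL, the client asks the server for its tools, the host hands them to the LLM) can be sketched roughly as follows. This is a simplified illustration under several assumptions: the server is an in-memory stand-in rather than a real network endpoint, the URL and the `get_forecast` tool are made up, the initialization handshake a real session starts with is skipped, and the shape passed to the LLM is hypothetical. Only the JSON-RPC framing and the `tools/list` method name are taken from the protocol.

```python
import json
from typing import Callable, Dict, List


def fake_weather_server(request: str) -> str:
    """Hypothetical in-memory server that answers JSON-RPC-style requests."""
    req = json.loads(request)
    if req["method"] == "tools/list":
        result = {
            "tools": [
                {
                    "name": "get_forecast",
                    "description": "Return the forecast for a city",
                    "inputSchema": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                    },
                }
            ]
        }
        return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})
    error = {"code": -32601, "message": "Method not found"}
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "error": error})


# Assumption: a lookup table replaces the real transport to a remote server.
SERVERS: Dict[str, Callable[[str], str]] = {
    "https://example.com/mcp": fake_weather_server,
}


def list_tools(url: str) -> List[dict]:
    """Client step: ask the server at `url` which tools it offers."""
    request = json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
    response = json.loads(SERVERS[url](request))
    return response["result"]["tools"]


def tools_for_llm(url: str) -> List[dict]:
    """Host step: reshape the tools for the LLM API (target format is assumed)."""
    return [
        {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["inputSchema"],
        }
        for tool in list_tools(url)
    ]


# The URL the user would paste into the host application's input field.
user_supplied_url = "https://example.com/mcp"
llm_tools = tools_for_llm(user_supplied_url)
```

Nothing in the host code mentions weather or forecasts. Everything it knows about the tool arrives from the server after the user supplies the URL.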

But discovery should be an easy fix, in my opinion. Crawlers or registries (think Google for web or Archie for FTP) will likely emerge, so host applications could integrate these external registries and provide simple one-click installs. Apparently, Anthropic is already working on a registry API to simplify exactly this process. Ideally, host applications would automatically detect when helpful tools are available for a given task and prompt users to enable them.

The problem with local-only servers is that they're hard to distribute (just as local HTTP servers are) and that sandboxing is an issue. One workaround is using WASM for server development, which is what mcp.run is doing (https://docs.mcp.run/mcp-clients/intro), but of course this breaks the seamless compatibility.

mountainriver|11 months ago

What does it actually offer over OpenAPI though? If I feed an OpenAPI spec to an LLM, it can use it as a tool.

ondrsh|11 months ago

It seems like you're describing a scenario where you know at design-time which tools will be included. In that case the benefit of using MCP is less clear.

While you usually get tools that work out of the box with MCP (and thus avoid the hassle of prompting + testing to get working tool code), integrating external APIs manually often results in higher accuracy and performance, as you're not limited by the abstractions imposed by MCP.
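For comparison, the design-time scenario from the question might look like the sketch below: the developer already has an OpenAPI document and converts its operations into tool definitions before shipping. The spec fragment, the `getUser` operation, and the output format are all hypothetical, and real OpenAPI documents have many more features (request bodies, `$ref`, and so on) that this ignores.

```python
from typing import List

# Hypothetical, hand-written OpenAPI fragment known to the developer up front.
SPEC = {
    "paths": {
        "/users/{id}": {
            "get": {
                "operationId": "getUser",
                "summary": "Fetch a user by id",
                "parameters": [
                    {
                        "name": "id",
                        "in": "path",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
            }
        }
    }
}


def openapi_to_tools(spec: dict) -> List[dict]:
    """Turn each operation into a tool definition (output format is assumed)."""
    tools = []
    for path, methods in spec["paths"].items():
        for method, op in methods.items():
            params = op.get("parameters", [])
            tools.append(
                {
                    "name": op["operationId"],
                    "description": op.get("summary", ""),
                    "http": {"method": method.upper(), "path": path},
                    "parameters": {
                        "type": "object",
                        "properties": {p["name"]: p["schema"] for p in params},
                        "required": [p["name"] for p in params if p.get("required")],
                    },
                }
            )
    return tools


# Computed once, at build or startup time, by the developer.
TOOLS = openapi_to_tools(SPEC)
```

Here the tool list is fixed by whoever wrote `SPEC` into the application, which is the case where the benefit of MCP is less clear.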