It's much simpler: MCP allows tools to be added at runtime instead of design-time. That's it. And because this can happen at runtime, the user (NOT the developer) can add arbitrary functionality to the LLM application (while the application is running — hence, runtime). One could make the argument that LLM applications with MCP support are conceptually similar to browsers — both let users connect to arbitrary MCP/HTTP servers at runtime.
But the comparison with HTTP is not a very good one, because MCP is stateful and complex. MCP is actually much more similar to FTP than it is to HTTP.
Also, btw, how long until people rediscover HATEOAS, something which inherently relies on a generalised artificial intelligence to be useful in the first place?
Hey ondrsh, I read your blog post and thought it was very interesting, however I did have a follow-up question:
In your post you say "The key insight is: Because this can happen at runtime, the user (NOT the developer) can add arbitrary functionality to the application (while the application is running — hence, runtime). And because this also works remotely, it could finally enable standardized b2ai software!"
That makes sense, but my question is: how would the user actually do that? As far as I understand, they would have to somehow pass in either a script to spin up their own server locally (unlikely for your everyday user), or a url to access some live MCP server. This means that the host they are using needs an input on the frontend specifically for this, where the user can input a url for the service they want their LLM to be able to talk to. This then gets passed to the client, the client calls the server, the server returns the list of available tools, and the client passes those tools to the LLM to be used.
This is very cool and all, but it just seems like anyone who has minimal tech skills would not have the patience to go and find the MCP server url of their favourite app and then paste it into their chatbot or whatever they're using.
Let me know if I have misunderstood anything, and thanks in advance!
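For what it's worth, the handshake described in the question can be sketched in a few lines. Everything here is illustrative (fake server objects and made-up names, not the real MCP SDK):

```python
# Sketch of the flow: the user supplies a server URL in the host UI,
# the client fetches that server's tool list, and the tools are handed
# to the LLM as part of its context. Names are illustrative only.

def fetch_tool_list(server):
    """Stand-in for the client's tools/list request to an MCP server."""
    return server["tools"]

def build_llm_context(user_supplied_servers):
    tools = []
    for server in user_supplied_servers:
        tools.extend(fetch_tool_list(server))
    return {"available_tools": tools}

# The user pastes a URL; the host wires it up at runtime:
weather_server = {
    "url": "https://example.com/mcp",
    "tools": [{"name": "get_forecast", "description": "Get a weather forecast"}],
}
context = build_llm_context([weather_server])
print([t["name"] for t in context["available_tools"]])
```

The UX question stands, though: someone still has to paste that URL (or a host vendor has to ship a catalog so users never see one).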
The most important thing for developers to understand when it comes to MCP: MCP is a protocol for dynamically loading additional capabilities into an AI application, e.g. Claude Desktop, Cursor, Highlight.ai, etc.
MCP is not something most people need to bother with unless you are building an application that needs extension, or you are trying to extend an application (like those I listed above). Under the hood, MCP is just an interface into the tools API.
I saw you were downvoted and could not understand why, so I'm both upvoting and replying. This is all correct. MCP is, realistically speaking, the extension API for Claude Desktop and Cursor. It's really cool if you do want to extend those apps, but that's all it's for. The article in this case is really confusing and unnecessary.
1) Ok, so you are reinventing SOAP or WSDL or whatever... did that ever go well? How and why is this different from every prior attempt to create the one true API layer?
2) Is this meaningfully different from just having every API provide a JavaScript SDK to access it, and then having the model write code? That's how humans solve this stuff.
3) If the AI is actually as smart at doing tasks like writing clients for APIs as people like to claim, why does it need this to be made machine readable in the first place?
1) Valid point, this could have been WSDL/Swagger. But the MCP spec supports spinning up local applications and communicating via stdio, which OpenAPI cannot do.
2 + 3) Having a few commands that the AI knows it should call, confidently and without security concerns, is better than giving the AI permission to do everything under the sun and telling it to code a program to do so.
The prompt for the latter is also much more complex and does not work as predictably.
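That difference can be sketched as an allowlist: the model can only press buttons the integrator chose to expose. The tool names here are made up:

```python
# Illustrative only: the model may call a fixed, audited set of tools,
# rather than being told to write and run arbitrary API code.

ALLOWED_TOOLS = {
    "list_tickets": lambda: ["TICKET-1", "TICKET-2"],
    "close_ticket": lambda ticket_id: f"closed {ticket_id}",
}

def call_tool(name, *args):
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool {name!r} is not exposed to the model")
    return ALLOWED_TOOLS[name](*args)

print(call_tool("list_tickets"))       # a known, predictable action
try:
    call_tool("drop_database")         # everything else is refused
except PermissionError as exc:
    print(exc)
```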
I wouldn't call it another form of API. It's more like an SDK. If you were accessing a REST API from Android, iOS, Windows, Mac, Firefox, they'd be mostly the same. But an SDK for Android and an SDK for iOS has been built for the platform. Often the SDK encapsulates the official API.
That's a direct answer for (2) too - instead of writing a JS SDK or Swift SDK or whatever, it's an AI SDK and shared across Claude, OpenAI, Groq, and so on.
(3) is exactly related to this. The AI has been trained to run MCPs, viewing them as big labeled buttons in their "mind".
I think you got the questions spot on and the answers right there as well.
MCP tends to be much simpler and less powerful than an API that you'd actually try to develop against. The LLMs need the most simplified access patterns possible.
Most people hate SOAP and WSDL. You can argue most web APIs are reinventing them in the sense that you could reimplement them with WSDL, to get worse versions of them.
What the article doesn't say (well, there's a lot it doesn't say) is that this protocol was created by Anthropic but is being adopted more widely.
MCP reminds me of a new platform opportunity akin to the Apple App Store.
It's rapidly adopted, with offerings from GitHub, Stripe, Slack, Google Maps, AirTable, etc. Many more non-official integrations are already out there. I expect this will only gain adoption over the coming year.
Yes. The article comes across as a response from an LLM chat. I think that it's OK to write blog posts with AI assistance, and I like the logical and simple writing style that these models output.
But with MCP there's not a whole lot of information out there for LLMs to digest and so perhaps for that reason the article is not particularly insightful.
Not only that, but the whole section on "Consider these scenarios" simply described the same thing as an API each time, but added words like "smoothly" and "richer" to make it sound different.
I honestly think most of the article was written by an LLM.
MCP strikes me as roughly equivalent to HTML. Headline features like dynamic “tool” discovery (and more!) are solved well with HTML.
MCP is probably easier for clients to implement but suffers from poor standardization, immaturity and non-human readability. It clearly scratches an itch but I think it’s a local-minimum that requires a tremendous amount of work to implement.
To be honest I don't understand why this is needed. All the leading AI models can already write code that interfaces perfectly with well-known APIs, and for the niche-APIs I can supply the docs of that API and the model will understand.
So all that's needed are API docs. Or what am I missing?
The success rate of this is impractically low. APIs are dirty, inconsistent things. Real-world connection to obscure APIs is a matter of hard sleuthing. Docs are wrong, endpoints are broken, auth is a nightmare. These APIs need to be massaged in advance and given a sanity-wrapper if you want any semblance of reliable success when a model calls them.
Writing code for interfaces is an extra "cognitive layer" for the AI, just like it would be for a human.
Let's say you want to add or delete Jira tickets. An MCP is like a big labeled button for the AI to do this, and it doesn't come with the token cost of reading an API or the possibility of making a mistake while accessing it.
Sorry, but I'm extremely annoyed with this idiotic take that many people seem to have. Is it really that easy to prompt AI to write code and call an API predictably?
The value of MCP then depends on its adoption. If I need to write an MCP adapter for everything, its value is little. If everyone (API owners, OSes, clouds, ...) puts in the work to offer an MCP-compatible interface, it's valuable.
In a world where I need to build my own X-to-USB dongle for every device myself, I wouldn't use USB, to stay with the article's analogy.
The MCP protocol is very similar to the Language Server Protocol (LSP) in design. LSP has the same requests, responses, and notifications setup. The initialization, where the server announces its capabilities, is also the same.
Normally, when running LSP against a remote server, you would use a persistent (web)socket instead of individual API requests. This helps with the parsing overhead and provides faster responses for small requests. Requests also have cancellation tokens, which make it possible to cancel a request when it becomes unnecessary.
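The shared JSON-RPC shape is easy to show: requests carry an id and expect a response, while notifications have no id and expect none. The `initialize` method name and protocol version below follow the MCP spec; everything else is a toy:

```python
import itertools

_ids = itertools.count(1)

def request(method, params):
    # Requests have an id; the peer must answer with the same id.
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}

def notification(method, params=None):
    # Notifications have no id; no response (and no cancellation) is expected.
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg

init = request("initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "demo-client", "version": "0.1"},
})
done = notification("notifications/initialized")
print("id" in init, "id" in done)
```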
I'd like to recommend another protocol—ANP (AgentNetworkProtocol).
While similar to MCP, ANP is significantly different. ANP is specifically designed for agents, addressing communication issues encountered by intelligent agents. It enables identity authentication and collaboration between any two agents.
Key differences include:
- ANP uses a P2P architecture, whereas MCP follows a client-server model.
- ANP relies on W3C DID for decentralized identity authentication, while MCP utilizes OAuth.
- ANP organizes information using Semantic Web and Linked Data principles, whereas MCP employs JSON-RPC.
MCP might excel at providing additional information and tools to models and connecting models to the existing web. In contrast, ANP is particularly effective for collaboration and communication between agents.
I built https://skeet.build where anyone can try out MCP for Cursor and dev tools without a lot of setup.
Mostly for workflows I like:
- start a PR with a summary of what I just did
- slack or comment to linear/Jira with a summary of what I pushed
- pull this issue from sentry and fix it
- pull this linear issue and do a first pass
- pull in this Notion doc with a PRD then create an API reference for it based on this codebase, then create a new Notion page with the reference
MCP tools are what the LLM uses and initiates.
MCP prompts are user-initiated workflows.
MCP resources are the data that the APIs provide and the structure of that data (because porting APIs to MCPs is not as straightforward).
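That three-way split can be sketched as a toy model (real servers expose these via tools/list, prompts/list, and resources/list; the example entries are made up):

```python
# Toy sketch of the three MCP server primitives described above.
server = {
    # tools: actions the LLM initiates on its own
    "tools": [{"name": "create_pr", "description": "Open a pull request"}],
    # prompts: user-initiated workflow templates
    "prompts": [{"name": "summarize_push", "description": "Summarize what I just pushed"}],
    # resources: data (and its structure) the server can hand to the model
    "resources": [{"uri": "notion://prd/123", "name": "PRD", "mimeType": "text/markdown"}],
}

def list_capability(kind):
    return [item["name"] for item in server[kind]]

print(list_capability("tools"), list_capability("prompts"))
```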
Anyways please give me feedback!
It seems like MCP is a pretty cool protocol, but has anyone seen any actually useful integrations?
I've played a lot with the FileSystem MCP server but couldn't get it to do something useful that I can't already do faster on my own. For instance, asking it how many files have the word "main" in them: it returns 267, but in reality there are 12k.
Looks promising, but I am still looking for useful ways to integrate it into my workflow.
In Cursor, for example, it gives the agent the ability to connect to the browser to gather console logs and network logs and take screenshots. The agent will often invoke the tools automatically when it is debugging or verifying its work, without being explicitly prompted to do so.
It's a bit of a setup process, as it requires a browser extension on top of the MCP configuration.
So, now when Roo Code does tasks for me, it takes notes and searches memory.
It’s good as a means to get a quick POC running, for dev oriented use cases.
I have seen very few implementations that use anything but the tools capabilities though.
The complete lack of auth consideration and the weird orchestration (really the “client” manages its own “server” processes), make me doubt it’s going to get serious adoption in prod. It’s not something I’d have a lot of confidence in supporting for non dev users.
Most LLMs, including Claude, struggle to use the @modelcontextprotocol/server-filesystem server - it's way too complex (and tools like Goose wrap ripgrep in an MCP to handle this). A simple MCP server built with the SDKs can easily be less than 20 lines of code and still be useful.
I wrote mcp-hfspace to let you connect to Hugging Face Spaces; that opens up a lot of image generation, vision, audio transcription and other services that can be integrated quickly and easily into your host app.
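The "under 20 lines" point is plausible. Here is a stdlib sketch of the message handling for a one-tool server; this is not the official MCP SDK, and a real server would also handle the initialize handshake and errors:

```python
import json

# Tiny JSON-RPC dispatcher for a single hypothetical tool.
TOOLS = {"word_count": lambda args: {"count": len(args["text"].split())}}

def handle(raw):
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result = {"tools": [{"name": name} for name in TOOLS]}
    elif req["method"] == "tools/call":
        result = TOOLS[req["params"]["name"]](req["params"]["arguments"])
    else:
        result = {}
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

# Wiring it to stdio is one loop:  for line in sys.stdin: print(handle(line))
msg = '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"word_count","arguments":{"text":"model context protocol"}}}'
print(handle(msg))
```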
i use it to give claude internet access. it's pretty helpful! i run searxng locally, and have an mcp server that calls it. repo: https://github.com/aeon-seraph/searxng-mcp
If the goal is to build AI agents, I'd think the maximum utility from them would come from building agents that can use the UI that humans use. Or at the very least, the lesser available API. MCP is yet another interface to services when there are already at least two out there available to use.
One thing I like about MCP's decision _not_ to just use HTTP APIs is that it makes it possible to build and distribute tools that people can use without having to set up a server for them. This goes a long way for enabling open-source and locally-running tools.
I don't really get how MCP is materially different from a standardized API interface. It will be interesting to see if the other big model providers decide to standardize on MCP or push their own competing standards.
So how does MCP differ from a regular library like aisuite or vercel ai sdk?
Regular SDK lib:
- Integration effort: just like MCP
- Real-time communication: sure
- Dynamic discovery: obviously, just call refresh or whatever
- Scalability: infinite, it is a library
- Security & control: just like MCP
I suspect that an MCP is just a rebranded API. We have also seen these sorts of extensibility mechanisms before. Browser extensions, Object Linking and Embedding, Dynamic Data Exchange, and Visual Studio Code extensions are all examples of having a standard API which allows lots of different things to plug into it.
Tangential, but it's a shame people often miss the opportunity to align JSON-RPC with programming-language semantics: 1) name methods so they can be methods in a programming language (no "/" in names, like MCP does); 2) use an arity-based calling convention (no named params; use an array instead, because that's what your programming language does when you define functions or methods; if you want named params, just use an object as the first argument).
Following those two principles means your implementation ends up as a simple class, with simple methods and simple params – possibly using decorators to expose it as RPC and perform runtime type assertion for params (exposing RPC, server side) and for the result (consuming RPC, client side). Consuming JSON-RPC now looks like using any ordinary library/package that happens to have async methods. This is important: there is no special dialect of communication, it's all ordinary semantics everybody is already used to. Your code on the client and server side doesn't jump between mapping to/from the language and JSON-RPC; a lot of complexity collapses, and the code looks minimal, small, and natural to read.
Notifications also map naturally to well established pattern (ie. event emitter in nodejs).
And yes, that's my main criticism of MCP – you're making a standard for communication meant to be used from different languages, so why add this silly, unnecessary complexity by using "/" in method names? It frankly feels like an amateur mistake by somebody who thinks it should be a bit like REST, where the method is a URL path.
Another tangent – this declaration of available endpoints is unnecessarily complicated. You could just use the URL scheme: file://.. to start a process on that executable with stdin/stdout as communication channels (this idea is great, btw, good job!), ws:// or wss:// for websocket comms to an existing service, and http:// or https:// for JSON-RPC over HTTP (no notifications).
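The two principles above can be sketched directly: method names that are plain identifiers map straight onto a class, and an array of params applied positionally makes a JSON-RPC call look like an ordinary method call. Toy code, not a full JSON-RPC implementation:

```python
import json

class Calculator:
    # Plain identifiers, so "method" maps straight onto getattr().
    def add(self, a, b):
        return a + b

    def scale(self, factor, values):
        return [v * factor for v in values]

def dispatch(obj, raw):
    req = json.loads(raw)
    method = getattr(obj, req["method"])   # "add", never "tools/add"
    result = method(*req["params"])        # arity-based: params is an array
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

calc = Calculator()
print(dispatch(calc, '{"jsonrpc":"2.0","id":1,"method":"add","params":[2,3]}'))
```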
ondrsh | 1 year ago
I wrote 2 short blog posts about this in case anyone is curious: https://www.ondr.sh/blog/thoughts-on-mcp
imtringued | 1 year ago
https://spec.modelcontextprotocol.io/specification/2024-11-0...
https://modelcontextprotocol.io/sdk/java/mcp-server
phillipcarter | 1 year ago
MCP standardizes how LLMs can call tools at runtime, and how tools can call LLMs at runtime. It's great!
campbel | 1 year ago
If you are building your own applications, you can simply use the "Tools APIs" provided by the LLM directly (e.g. https://platform.openai.com/docs/assistants/tools).
gsibble | 1 year ago
MCP is not all it's cracked up to be.
no_wizard | 1 year ago
If it was truly intelligent it could reason about things like API specifications without any precursors or shared structure, but it can’t.
Are LLMs powerful? Yes. Is current “AI” simply a re-brand of machine learning? IMO, also yes
punkpeye | 1 year ago
So if you are here for MCP, I will use the opportunity to share what I've been working on the last few months.
I've hand-curated hundreds of MCP servers, which people can access and browse via https://glama.ai/mcp/servers, and made those servers available via API: https://glama.ai/mcp/reference
The API lets you search for MCP servers, identify their capabilities via API attributes, and even access user-hosted MCP servers.
You can also try these servers using an inspector (available under every server) or in the chat (https://glama.ai/chat).
This is all part of a bigger ambition to create an all-encompassing platform for authoring, discovering, and hosting MCP servers.
I am also the author of https://github.com/punkpeye/fastmcp framework and several other supporting open-source tools, like https://github.com/punkpeye/mcp-proxy
If you are also interested in MCP and want to chat about the future of this technology, drop me a message.
fallinditch | 1 year ago
Thank you HN for bringing the insights!
norsak | 1 year ago
Appreciate the feedback - brb I'll update the post to include this!
shan-chang | 1 year ago
Here is a detailed comparison of ANP and MCP (including the GitHub repository): https://github.com/agent-network-protocol/AgentNetworkProtoc...
risyachka | 1 year ago
I truly don't get it.
emrah | 1 year ago
Ok but why would every app and website implement this new protocol for the benefit of LLMs/agents?