

Nice framing for PMs, but technically it is way too rosy. MCP is real, but the ecosystem is still full of low-utility services and security issues, so “skills as plug-ins” is not production-ready. A2A protocols were only just announced this year (Google, etc.), and actual inter-agent interoperability is still research-grade, with debugging across agents being a nightmare. Orchestration layers (skills, workflows, multi-agent) look clean in diagrams but turn into brittle state machines under load. LLM “confidence scores” are basically uncalibrated logits dressed up as probabilities: nothing guarantees that a reported 0.9 means the model is right about 90% of the time.

In short: nice industry roadmap, but we are nowhere near robust, trustworthy multi-agent systems yet.
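
To make the calibration point concrete, here is a minimal sketch of one common way to check it: push logits through a softmax, then compute expected calibration error (ECE) by comparing stated confidence to observed accuracy per bin. The function names, the 10-bin choice, and the toy data are all assumptions for illustration, not anything from the article.

```python
import math

def softmax(logits):
    """Turn raw logits into numbers that sum to 1 (not necessarily calibrated)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy in each bin."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in the last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / n) * abs(avg_conf - accuracy)
    return ece

# Hypothetical toy data: the model claims 0.9 every time but is right half the time.
claimed = [0.9] * 10
was_right = [True] * 5 + [False] * 5
print(expected_calibration_error(claimed, was_right))  # about 0.4
```

An ECE near zero would mean the scores behave like probabilities; a large gap like the one in the toy data is the "dressed up" case.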


gabriel666smith|5 months ago

The idea of giving an LLM with a tool any kind of control over an actual user's account remains (though you put this more elegantly) batshit insane to me.

Even assuming you've correctly auth'd the user contacting you (big assumption!), allowing that user to very literally prompt a 'semi-confident thing with tools' - however many layers of abstraction away the tool is - feels very, very far away from a real-world, sensible implementation right now.

Just shoot the tool prompts over to a human operator, if it's so necessary! Sense-check!
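
Roughly what I mean, as a sketch only: the model proposes a tool call, and nothing runs until a human approves it. Every name here (`ToolCall`, `ApprovalGate`, the `approve` callback, the `refund` tool) is hypothetical and not tied to any real agent framework.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List

@dataclass
class ToolCall:
    """A tool invocation proposed by the model (hypothetical shape)."""
    name: str
    args: Dict[str, Any]

@dataclass
class ApprovalGate:
    """Runs a proposed tool call only after a human reviewer approves it."""
    tools: Dict[str, Callable[..., Any]]
    approve: Callable[[ToolCall], bool]  # e.g. a prompt in an operator console
    audit_log: List[str] = field(default_factory=list)

    def execute(self, call: ToolCall) -> Any:
        if call.name not in self.tools:
            # Never run something the model invented.
            self.audit_log.append(f"rejected unknown tool: {call.name}")
            raise KeyError(call.name)
        if not self.approve(call):
            self.audit_log.append(f"denied: {call.name}")
            return None
        self.audit_log.append(f"approved: {call.name}")
        return self.tools[call.name](**call.args)

# Usage with a stand-in reviewer that only lets small refunds through.
def refund(account_id: str, amount: float) -> str:
    return f"refunded {amount} to {account_id}"

gate = ApprovalGate(
    tools={"refund": refund},
    approve=lambda call: call.args.get("amount", 0) <= 50,
)
print(gate.execute(ToolCall("refund", {"account_id": "a1", "amount": 20.0})))
print(gate.execute(ToolCall("refund", {"account_id": "a1", "amount": 5000.0})))
print(gate.audit_log)
```

In a real deployment the `approve` callback would block on an actual operator rather than a lambda; the point is only that the model never holds the trigger itself.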