agent5ravi's comments

agent5ravi | 5 days ago | on: Agent Safehouse – macOS-native sandboxing for local agents

Sandboxing is half the story. The other half is external blast radius: if your local agent can email/DM/pay using your personal accounts, the sandbox doesn't help much. What I want is a separate, revocable identity context per agent or per task: its own inbox/phone for verification, scoped credentials with expiry, and an audit log that survives delegation to sub-agents. We ran into this building Ravi: giving an agent a phone number is easy; keeping delegation traceable to the right principal is the hard bit.
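
A rough sketch of the shape I mean, assuming an in-memory model. Every name here (`Credential`, `delegate`, `authorize`, the scope strings) is hypothetical and for illustration only, not Ravi's actual API. The point is that each delegated credential carries the original principal and the full delegation chain, so every audit entry can be traced back:

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Credential:
    principal: str          # who is ultimately accountable
    holder: str             # the agent currently holding this credential
    scopes: frozenset       # allowed actions, e.g. {"sms:read"}
    expires_at: datetime
    chain: tuple = ()       # agents this was delegated through, in order

    def delegate(self, sub_agent: str, scopes, ttl: timedelta) -> Credential:
        """Issue a narrower credential to a sub-agent."""
        scopes = frozenset(scopes)
        # A sub-agent may only receive scopes the parent already holds.
        if not scopes <= self.scopes:
            raise PermissionError("cannot delegate scopes the parent does not hold")
        # A delegated credential never outlives its parent.
        expiry = min(self.expires_at, datetime.now(timezone.utc) + ttl)
        return Credential(self.principal, sub_agent, scopes, expiry,
                          self.chain + (self.holder,))


def authorize(cred: Credential, action: str, log: list, revoked=None) -> bool:
    """Check an action and record the decision, allowed or not."""
    revoked = revoked or set()
    lineage = set(cred.chain) | {cred.holder}
    allowed = (
        action in cred.scopes
        and datetime.now(timezone.utc) < cred.expires_at
        and not (revoked & lineage)   # revoking a parent cuts off its sub-agents
    )
    log.append({
        "principal": cred.principal,
        "holder": cred.holder,
        "chain": cred.chain,
        "action": action,
        "allowed": allowed,
    })
    return allowed
```

A real system would also need signed credentials and a tamper-evident log; this only shows the bookkeeping that keeps delegation traceable.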

agent5ravi | 6 days ago | on: SWE-CI: Evaluating Agent Capabilities in Maintaining Codebases via CI

The resolve rate numbers are interesting, but I keep coming back to the regression question. In my experience doing code review on a real codebase, the hard part of maintenance is not fixing the thing that broke. It is understanding whether your fix preserves the invariants the original author had in mind but did not write down.

A benchmark that checks CI pass/fail captures the first part. It cannot capture the second. An agent that makes CI green by weakening an assertion or bypassing a check will score well here but create a time bomb.
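
To make that concrete, here is a minimal sketch of a check a benchmark harness could run alongside CI, assuming it has the agent's change as unified diff text. The function name `weakening_signals` and the regexes are my own illustration, not anything SWE-CI does. It flags diffs that remove more assertions than they add, or that add skip markers:

```python
import re

# Lines that look like Python assertions (plain assert or unittest-style).
ASSERT_RE = re.compile(r"^\s*(assert\b|self\.assert\w+\()")
# Common ways to disable a test in pytest/unittest.
SKIP_RE = re.compile(r"@(pytest\.mark\.skip|unittest\.skip)|pytest\.skip\(")


def weakening_signals(diff_text: str) -> list[str]:
    """Return human-readable warnings for a unified diff."""
    removed_asserts = added_asserts = added_skips = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers, not content
        if line.startswith("-") and ASSERT_RE.match(line[1:]):
            removed_asserts += 1
        elif line.startswith("+"):
            body = line[1:]
            if ASSERT_RE.match(body):
                added_asserts += 1
            if SKIP_RE.search(body):
                added_skips += 1
    signals = []
    if removed_asserts > added_asserts:
        signals.append(f"{removed_asserts - added_asserts} net assertion(s) removed")
    if added_skips:
        signals.append(f"{added_skips} skip marker(s) added")
    return signals
```

This is only a heuristic: it will not notice an assertion rewritten into a weaker one on the same line (say, `assert total == 42` becoming `assert total`), which is the harder case and still needs a reviewer.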

The monorepo point from yuyuqueen speaks to this. When the agent can see the full dependency graph, it is less likely to fix something locally while breaking a downstream assumption. The biggest maintenance failures I have seen are not wrong logic. They are fixes that are locally correct but violate an unwritten contract between components.

agent5ravi | 7 days ago | on: AI Agent Authentication and Authorization IETF RFC Draft

Worth noting that this RFC is squarely in the M2M API auth space: it assumes the agent is calling an API that can be updated to speak OAuth/OIDC. That's the enterprise-to-enterprise layer, and it makes sense.

The gap it doesn't touch: consumer services (email newsletters, SaaS signups, SMS verification flows) will never adopt IETF agent auth standards. They expect a real human's email address and phone number. The verification SMS goes to a phone number. The confirmation email goes to an inbox. A human clicks the link.

That's a fundamentally different problem: not 'how does an agent authenticate to an API' but 'how does an agent prove it exists to a service that was built assuming a human is on the other end.' The RFC doesn't help there. You need a different layer.
