(no title)
ryanrasti | 22 days ago
I'm working on a systems-security approach (object-capabilities, deterministic policy) - where you can have strong guarantees on a policy like "don't send out sensitive information".
Would love to chat with anyone who wants to use agents but who (rightly) refuses to compromise on security.
rellfy|22 days ago
I can only think of two ways to address it:
1. Gate all sensitive operations (i.e. all external data flows) through a manual confirmation system, such as an OTP code that the human operator needs to manually approve every time, and also review the content being sent out. Cons: decision fatigue over time, can only feasibly be used if the agent only communicates externally infrequently or if the decision is easy to make by reading the data flowing out (wouldn't work if you need to review a 20-page PDF every time).
2. Design around the lethal trifecta: your agent can only have 2 legs instead of all 3. I believe this is the most robust approach for all use cases that support it. For example, agents that are privately accessed, and can work with private data and untrusted content but cannot externally communicate.
I'd be interested to know if you have reached similar conclusions or have a different approach to it?
ryanrasti|22 days ago
The third path: fine-grained object-capabilities and attenuation based on data provenance. More simply, the legs narrow based on what the agent has done (e.g., read of sensitive data or untrusted data)
Example: agent reads an email from alice@external.com. After that, it can only send replies to the thread (alice). It still has external communication, but scope is constrained to ensure it doesn't leak sensitive information.
The basic idea is applying systems security principles (object-capabilities and IFC) to agents. There's a lot more to it -- and it doesn't solve every problem -- but it gets us a lot closer.
Happy to share more details if you're interested.
trenchgun|22 days ago
eek2121|22 days ago
Realistically though, these agents are going to need access to at least SOME of your data in order to work.
veganmosfet|22 days ago
sumitkumar|22 days ago