(no title)
mlyle | 21 hours ago
An agent signing a covenant doesn't do anything. You're not going to enforce a contract against it, and there's not some kind of non-repudiation problem to solve.
Enforcing behavioral covenants or boundaries is inherent to how you make things safe. But how do you really do it for anything that matters? How do you make sure that an agent isn't discriminating based on race or other factors?
The whole reason you're using an LLM is because you're doing something either:
A) at very low scale, in which case it's hard to capture sufficient covenants cost-efficiently,
or B) with very great complexity, where the behavior you want is hard to encapsulate in code -- in which case meaningful enforcement of the resulting complex covenants is hard.
Indeed, if you could just write code to do it, you'd just write code to do it.
I'm glad you're interested in these issues and playing with them. I'll leave you with one last thought: 134 KSLOC is a bug, not a feature. Some software systems need to be huge, but for software systems that need to be trusted, small, auditable, and understandable to humans (and agents) is the key thing you're looking for. Could you build some kind of small, trustable core that solves a simple problem in an understandable way?
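A minimal sketch of what such a core might look like, assuming a deny-by-default action gate (the names `Action`, `gate`, and the allowlist entries are illustrative, not from any real framework):

```python
# Hypothetical "small trustable core": a few dozen lines that gate an
# agent's proposed actions against an explicit allowlist, so the part
# that must be trusted is auditable by a human in one sitting.

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str        # e.g. "read_file", "send_email"
    target: str      # e.g. a path or an address

# Everything not listed here is denied by default.
ALLOWED = {
    ("read_file", "/data/public"),
    ("send_email", "support@example.com"),
}

def gate(action: Action) -> bool:
    """Permit only actions that exactly match the allowlist."""
    return (action.name, action.target) in ALLOWED
```

The LLM can propose anything it likes; only gated actions ever execute. The point is that the enforcement surface is small enough to read, not that this particular policy is sufficient.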
nobulexdev | 21 hours ago
mlyle | 21 hours ago
Surely it's just the enforcement, and maybe the measuring of sentinel events -- how far does it wander off course?
How is cryptography an important part of this, given that we're talking about a layer that sits on top of an LLM without an adversary in-between?
I know you mention non-repudiation, but ... there's no real non-repudiation in this environment.
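The point can be made concrete with a toy sketch, assuming the common pattern of an agent HMAC-signing its own log entries (the key name and log strings are hypothetical):

```python
# Why self-signing gives no non-repudiation when the same process holds
# the key: it can rewrite an entry and re-sign it at will, and both
# versions verify equally well.

import hashlib
import hmac

key = b"agent-held-key"  # lives inside the very process being "audited"

def sign(entry: bytes) -> bytes:
    return hmac.new(key, entry, hashlib.sha256).digest()

original = b"action: send_email to=support@example.com"
tag = sign(original)

# After tampering, the agent simply signs the new entry with the same key.
tampered = b"action: send_email to=nobody@example.com"
new_tag = sign(tampered)

# Each signature proves key possession, not that a covenant was honored,
# because no independent party holds the key or witnesses the entries.
assert hmac.compare_digest(sign(original), tag)
assert hmac.compare_digest(sign(tampered), new_tag)
```

Non-repudiation only means something when a verifier outside the signer's trust domain can distinguish the two histories; with an LLM layer and no adversary in between, there is no such verifier.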