niyikiza | 3 days ago | on: Agent Safehouse – macOS-native sandboxing for local agents
niyikiza's comments
niyikiza | 16 days ago | on: NIST Seeking Public Comment on AI Agent Security (Deadline: March 9, 2026)
(Disclaimer: working on this problem at tenuo.ai)
niyikiza | 21 days ago | on: OpenClaw is dangerous
niyikiza | 1 month ago | on: AI is killing B2B SaaS
niyikiza | 1 month ago | on: Kimi K2.5 Technical Report [pdf]
niyikiza | 1 month ago | on: The Hallucination Defense
niyikiza | 1 month ago | on: The Hallucination Defense
At execution time, the "verifier" checks the warrant: valid signatures, attenuation (scope only narrows through delegation), TTL (authority is task-scoped), and that the action fits the constraints. Only then does the call proceed.
This is sometimes called the P/Q model: the non-deterministic layer proposes, the deterministic layer decides. The agent can ask for anything. It only gets what's explicitly granted.
If the agent asks for the wrong thing, it fails closed. If an overly broad scope is approved, the receipt makes that approval explicit and reviewable.
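As a rough illustration of that check order, here is a minimal sketch. It is not Tenuo's implementation: the `Warrant` fields, the `invoice.pay` tool name, and the `max_amount` constraint are hypothetical, and an HMAC with a shared demo key stands in for real asymmetric signatures. The attenuation check is left out because it needs a delegation chain.

```python
import hashlib
import hmac
from dataclasses import dataclass, replace

ISSUER_KEY = b"demo-key"  # assumption: shared secret standing in for a signing key


@dataclass(frozen=True)
class Warrant:
    tool: str          # the one tool this warrant covers
    max_amount: int    # hypothetical constraint on the call's arguments
    expires_at: float  # absolute expiry (TTL), seconds since the epoch
    signature: str = ""

    def payload(self) -> bytes:
        return f"{self.tool}|{self.max_amount}|{self.expires_at}".encode()


def sign(w: Warrant) -> Warrant:
    sig = hmac.new(ISSUER_KEY, w.payload(), hashlib.sha256).hexdigest()
    return replace(w, signature=sig)


def decide(w: Warrant, tool: str, amount: int, now: float) -> bool:
    """Deterministic layer: any failed check denies the call (fail closed)."""
    expected = hmac.new(ISSUER_KEY, w.payload(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, w.signature):
        return False  # forged or tampered warrant
    if now >= w.expires_at:
        return False  # TTL elapsed
    if tool != w.tool:
        return False  # asked for something that was never granted
    return amount <= w.max_amount  # action must fit the constraint
```

The proposing side can pass any `tool` and `amount` it likes; only a call that survives every check proceeds.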
niyikiza | 1 month ago | on: The Hallucination Defense
You keep a coarse cap (e.g. email read/write, invoice pay) but each task runs under a narrower, time-boxed warrant derived from that cap. Narrowing happens at the policy/UX layer (human or deterministic rules), not by the LLM. The LLM can request escalation ("need send"), but it only gets it via an explicit approval / rule.
Crypto isn't deciding scope. It's enforcing monotonic attenuation, binding the grant to an agent key, and producing a receipt that the scope was explicitly approved.
For a single-process agent this might be overkill. It matters more when warrants cross trust boundaries: third-party tools, sub-agents in different runtimes, external services. Offline verification means each hop can validate without calling home.
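A sketch of the cap / task-warrant split, under stated assumptions: the scope names, `derive`, `escalate`, and the `approve` callback are all hypothetical, and the signing step is omitted so the policy logic stays visible.

```python
from dataclasses import dataclass

# Assumption: a coarse cap expressed as a set of scope names.
CAP = frozenset({"email.read", "email.send", "invoice.pay"})


@dataclass(frozen=True)
class TaskWarrant:
    scope: frozenset
    ttl_seconds: int


def derive(scope, ttl_seconds):
    """Issue a task warrant; anything outside the cap is rejected."""
    scope = frozenset(scope)
    if not scope <= CAP:
        raise PermissionError(f"outside cap: {sorted(scope - CAP)}")
    return TaskWarrant(scope, ttl_seconds)


def escalate(warrant, requested, approve):
    """The model may ask; only `approve` (a human prompt or a fixed rule) grants.

    On approval a new warrant is derived from the cap. On refusal the
    existing warrant is returned unchanged.
    """
    if requested not in CAP or not approve(requested):
        return warrant
    return derive(warrant.scope | {requested}, warrant.ttl_seconds)
```

For example, `escalate(w, "email.send", approve=ask_human)` only widens the task warrant if `ask_human` returns true, and a request for something outside the cap never reaches the approver at all.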
niyikiza | 1 month ago | on: The Hallucination Defense
But warrants aren't just "more audit data." They're an authorization primitive enforced in the critical path: scope and constraints are checked mechanically before the action executes. The receipt is a byproduct.
Prompt logs tell you what the model claimed it was doing. A warrant is what the human actually authorized, bound to an agent key, verifiable without trusting the agent runtime.
This matters more in multi-agent systems. When Agent A delegates to Agent B, which calls a tool, you want to be able to link that action back to the human who started it. Warrants chain cryptographically. Each hop signs and attenuates. The authorization provenance is in the artifact itself.
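To make the chaining idea concrete, here is a toy version. Everything in it is an assumption for illustration: the `Link` layout, the party names, and especially the HMAC keys, which stand in for per-party asymmetric key pairs (a real verifier would hold only public keys).

```python
import hashlib
import hmac
from dataclasses import dataclass

# Assumption: demo secrets per party, standing in for real signing keys.
KEYS = {"human": b"k-human", "agent-a": b"k-a", "agent-b": b"k-b"}


@dataclass(frozen=True)
class Link:
    issuer: str
    holder: str
    scope: frozenset
    signature: str


def _digest(issuer, holder, scope, parent_sig):
    # Each signature covers the parent's signature, which links the hops.
    msg = "|".join([issuer, holder, ",".join(sorted(scope)), parent_sig]).encode()
    return hmac.new(KEYS[issuer], msg, hashlib.sha256).hexdigest()


def delegate(chain, issuer, holder, scope):
    parent_sig = chain[-1].signature if chain else ""
    scope = frozenset(scope)
    return chain + [Link(issuer, holder, scope, _digest(issuer, holder, scope, parent_sig))]


def verify(chain, root="human"):
    """Walk from the root: issuer continuity, narrowing scope, valid signatures."""
    parent_sig, expected_issuer, allowed = "", root, None
    for link in chain:
        if link.issuer != expected_issuer or link.issuer not in KEYS:
            return False  # authority did not come from the previous holder
        if allowed is not None and not link.scope <= allowed:
            return False  # scope widened somewhere along the chain
        sig = _digest(link.issuer, link.holder, link.scope, parent_sig)
        if not hmac.compare_digest(sig, link.signature):
            return False
        parent_sig, expected_issuer, allowed = link.signature, link.holder, link.scope
    return bool(chain)
```

Note that `delegate` does not stop a misbehaving agent from signing a wider scope. The point of the sketch is that `verify` rejects such a chain at the end, using only the artifact.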
niyikiza | 1 month ago | on: Ask HN: How are you handling non-probabilistic security for LLM agents?
Different angle than policy-as-YAML. We use cryptographic capability tokens (warrants) that travel with the request. The human signs a scoped, time-bound authorization. The tool itself validates the warrant at execution time, rather than a central policy engine.
On your questions:
Canonicalization: The warrant specifies allowed capabilities and constraints (e.g., path: /data/reports/*). The tool checks if the action fits the constraint. No need to normalize LLM output into a canonical representation.
Stateful intent: Warrants attenuate. Authority only shrinks through delegation. You can't escalate from "read DB" to "POST external" unless the original warrant allowed both. A sub-agent can only receive a subset of what its parent had, cryptographically enforced.
Latency: Stateless verification, ~27μs. No control plane calls. The warrant is self-contained: scope, constraints, expiry, holder binding, signature chain. Verification is local.
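For the path constraint in the canonicalization answer, a tool-side check might look roughly like this. It is a sketch using Python's standard `fnmatch` glob semantics, which may differ from the actual constraint language; `path_allowed` is a hypothetical name, and resolving `..` segments first is an added precaution that the answer above does not describe.

```python
import posixpath
from fnmatch import fnmatchcase


def path_allowed(path: str, pattern: str) -> bool:
    """Check a tool argument against a warrant's path constraint."""
    # Resolve "." and ".." so "/data/reports/../secrets" cannot slip through.
    resolved = posixpath.normpath(path)
    # Note: in fnmatch, "*" also matches "/" and therefore nested directories.
    return fnmatchcase(resolved, pattern)


print(path_allowed("/data/reports/q3.csv", "/data/reports/*"))
print(path_allowed("/data/reports/../secrets/key", "/data/reports/*"))
```

The first call is allowed and the second is denied, because the check runs on the tool's concrete argument rather than on anything the model said about its intent.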
The deeper issue with policy engines: they check rules against actions, but they can't verify derivation. When Agent B acts, did its authority actually come from Agent A? Was it attenuated correctly?
Wrote about why capabilities are the only model that survives dynamic delegation: https://niyikiza.com/posts/capability-delegation/
niyikiza | 1 month ago | on: The Hallucination Defense
The LLM can request a narrower scope, but attenuation is monotonic and enforced cryptographically. You can't sign a delegation that exceeds what you were granted. TTL too: the warrant can't outlive its parent.
So yes, key management. But the pathological "Allow: *" has to originate from a human who signed it. That's the receipt you're left holding.
You're poking at the right edges though. UX for scope definition and revocation propagation are what we're working through now. We're building this at tenuo.dev if you want to dig into the spec or poke holes.
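The two attenuation rules (scope can only narrow, and a child cannot outlive its parent) reduce to a pair of comparisons at delegation time. A minimal sketch, assuming a hypothetical `Warrant` with a set-valued scope and an absolute expiry; in the real system the rule is enforced by signatures, which this omits:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Warrant:
    scope: frozenset
    expires_at: float  # absolute expiry, seconds since the epoch


def attenuate(parent: Warrant, scope, expires_at: float) -> Warrant:
    """Derive a child warrant; refuse anything the parent does not cover."""
    scope = frozenset(scope)
    if not scope <= parent.scope:
        raise PermissionError(f"scope exceeds parent: {sorted(scope - parent.scope)}")
    if expires_at > parent.expires_at:
        raise PermissionError("child would outlive its parent")
    return Warrant(scope, expires_at)
```

Under these rules an "Allow: *" child can only exist if some ancestor, ultimately the human-signed root, already carried that scope.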
niyikiza | 1 month ago | on: The Hallucination Defense
niyikiza | 1 month ago | on: The Hallucination Defense
When orchestrators spawn sub-agents that in turn call tools, there's no artifact showing how authority flowed through the chain.
Warrants are a primitive for this: signed authorization that attenuates at each hop. Each delegation is signed, scope can only narrow, and the full chain is verifiable at the end. Doesn't matter how many layers deep.
niyikiza | 1 month ago | on: The Hallucination Defense
niyikiza | 1 month ago | on: The Hallucination Defense
niyikiza | 1 month ago | on: The Hallucination Defense
Yeah, that's exactly the approach I think we should adopt for AI agent tool calls as well: cryptographically signed, task-scoped "warrants" that stay traceable even across multi-agent delegation chains.
niyikiza | 1 month ago | on: The Hallucination Defense
niyikiza | 1 month ago | on: The Hallucination Defense
And when sub-agents or third-party tools are involved, liability gets even murkier. Who's accountable when the action executed three hops away from the human? The article argues for receipts that make "I didn't authorize that" a verifiable claim.
niyikiza | 1 month ago | on: Semantic Attacks: Exploiting What Agents See
Correction: I accidentally submitted the Substack link instead of the full technical write-up. You can read the complete post with all the attack vectors here: https://niyikiza.com/posts/semantic-attacks/
We stumbled on these vectors while building an authorization protocol for agents.
Everyone seems focused on "Prompt Injection" (the brain), but perception integrity seems under-discussed. I look at agents like pilots flying on instruments: if the DOM feeds them false data, no amount of reasoning or prompt engineering can prevent the crash.
This post breaks down the specific ways attackers can compromise those instruments without touching the prompt.
niyikiza | 1 month ago | on: Show HN: Gambit, an open-source agent harness for building reliable AI agents
The gap I keep coming back to is that even at Layer 6, enforcement is probabilistic. You are still negotiating with the model's weights. "Less likely to fail" is great for reliability, but hard to sell on a security questionnaire.
Tenuo operates at the execution boundary. It checks after the model decides and before the tool runs. Even if the model gets tricked (or just hallucinates), the action fails if the cryptographic warrant doesn't allow that specific action.
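To show where such a check sits, here is a generic sketch of a guard at the execution boundary. It is not Tenuo's API: `guarded`, `Denied`, and the `is_allowed` callback are hypothetical names, and the callback stands in for real warrant verification.

```python
import functools


class Denied(Exception):
    """Raised when a tool call is not covered by the current authorization."""


def guarded(tool_name, is_allowed):
    """Wrap a tool so every call is checked after the model decides, before it runs."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(**kwargs):
            if not is_allowed(tool_name, kwargs):
                raise Denied(f"{tool_name} not covered by warrant")
            return fn(**kwargs)
        return inner
    return wrap


def small_payments_only(tool, args):
    # Stand-in policy: a real check would verify a signed warrant instead.
    return tool == "invoice.pay" and "amount" in args and args["amount"] <= 500


@guarded("invoice.pay", small_payments_only)
def pay_invoice(amount):
    return f"paid {amount}"
```

Whatever the model was tricked into requesting, `pay_invoice(amount=900)` raises `Denied` before the tool body runs, while `pay_invoice(amount=100)` goes through.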
Re: Hypercore/P2P, I actually see that as the identity layer we're missing. You need a decentralized root of trust (Provenance) to verify who signed the Warrant (Authorization). Tenuo handles the latter, but it needs something like Hypercore for the former.
Would be curious to see how Gambit's Deck pattern could integrate with warrant-based authorization. Since you already have typed inputs/outputs, mapping those to signed capabilities seems like a natural fit.