
Show HN: Bulwark – Open-source governance layer for AI agents (Rust, MCP-native)

2 points | bpolania | 14 days ago | github.com

Hi HN!

I built Bulwark because I kept running into the same problem: I need to give AI agents access to my GitHub token, my AWS credentials, my database access, etc. They can do anything I can do. And when something goes wrong, there's no audit trail.

Bulwark is a governance proxy that sits between agents and the tools they call. It works as an MCP gateway (for Claude Code, OpenClaw) or as an HTTP forward proxy (for Codex, curl). Every request goes through:

Session validation → Content inspection → Policy evaluation → Credential injection → Audit logging
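The five stages above can be sketched as a chain of fallible transforms. This is an illustrative shape only, not Bulwark's actual API; all names (`govern`, `Verdict`, the `bw_` token prefix) are made up for the sketch:

```rust
// Hypothetical sketch of the five-stage pipeline. Each stage either passes
// the request through (possibly transformed) or rejects it with a reason,
// and both outcomes would be audit-logged.

#[derive(Debug, Clone, PartialEq)]
struct Request {
    session: String, // scoped session token, never the real secret
    tool: String,
    body: String,
}

#[derive(Debug, PartialEq)]
enum Verdict {
    Forwarded(Request),
    Denied(&'static str),
}

fn validate_session(r: Request) -> Result<Request, &'static str> {
    if r.session.starts_with("bw_") { Ok(r) } else { Err("invalid session") }
}

fn inspect_content(r: Request) -> Result<Request, &'static str> {
    // stand-in for the 13 built-in patterns
    if r.body.contains("AKIA") { Err("aws key in request") } else { Ok(r) }
}

fn evaluate_policy(r: Request) -> Result<Request, &'static str> {
    // default deny: only explicitly allowed tool patterns pass
    if r.tool.starts_with("github.") { Ok(r) } else { Err("no matching allow rule") }
}

fn inject_credentials(mut r: Request) -> Result<Request, &'static str> {
    r.session = "real-token".to_string(); // last-mile swap: agent never saw this
    Ok(r)
}

fn govern(r: Request) -> Verdict {
    match validate_session(r)
        .and_then(inspect_content)
        .and_then(evaluate_policy)
        .and_then(inject_credentials)
    {
        Ok(req) => Verdict::Forwarded(req), // audit-log the forward
        Err(why) => Verdict::Denied(why),   // audit-log the denial too
    }
}
```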

The key ideas:

- Agents never see real secrets. They authenticate with a scoped session token. Bulwark injects the real credentials at the last mile, based on tool pattern + scope.

- Policies are YAML. Glob matching, scope-based precedence, hot-reload. Default deny. You can preview the impact of policy changes against real audit history before deploying (bulwark policy test).
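To make the bullet concrete, a policy might look something like this. This is a hypothetical shape, not Bulwark's actual schema; the field names are illustrative:

```yaml
# Illustrative only -- check the repo for the real policy schema.
# Default deny: anything not matched by an allow rule is blocked.
version: 1
default: deny
rules:
  - match: "github.issues.*"        # glob over tool patterns
    scope: "repo:myorg/myrepo"      # narrower scope takes precedence
    action: allow
  - match: "aws.ec2.run_instances"
    action: deny                    # explicit deny for a risky tool
```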

- Tamper-evident audit. Every event is blake3 hash-chained in SQLite. You can reconstruct exactly what an agent did and verify nothing was modified.
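The hash-chain idea works like this: each event's hash covers its payload plus the previous event's hash, so editing any past event breaks every later link. A minimal sketch, using std's `DefaultHasher` as a dependency-free stand-in for blake3 (the function names are mine, not the project's):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Each link commits to the previous hash and the current payload.
fn link(prev: u64, payload: &str) -> u64 {
    let mut h = DefaultHasher::new();
    prev.hash(&mut h);
    payload.hash(&mut h);
    h.finish()
}

// Build the chain over an event log.
fn chain(events: &[&str]) -> Vec<u64> {
    let mut out = Vec::new();
    let mut prev = 0u64;
    for e in events {
        prev = link(prev, e);
        out.push(prev);
    }
    out
}

// Verification recomputes the chain: any edit to a past event changes
// every subsequent hash, so tampering is detectable.
fn verify(events: &[&str], hashes: &[u64]) -> bool {
    chain(events) == hashes
}
```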

- Content inspection. 13 built-in patterns scan both directions for AWS keys, PII, prompt injection. Redaction happens before content reaches the agent.
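For a flavor of what one of those patterns does, here's a toy redaction pass for AWS access key IDs (`AKIA` followed by 16 uppercase alphanumerics). This is my own sketch, not Bulwark's matcher, which ships 13 patterns and scans both directions:

```rust
// Replace anything shaped like an AWS access key ID with a redaction marker.
fn redact_aws_keys(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        // candidate: "AKIA" + 16 uppercase alphanumerics
        if chars[i..].starts_with(&['A', 'K', 'I', 'A'])
            && chars[i + 4..].len() >= 16
            && chars[i + 4..i + 20]
                .iter()
                .all(|c| c.is_ascii_uppercase() || c.is_ascii_digit())
        {
            out.push_str("[REDACTED:aws-access-key]");
            i += 20;
        } else {
            out.push(chars[i]);
            i += 1;
        }
    }
    out
}
```

Note that redaction changes the payload length, which is exactly why Content-Length handling after redaction is one of the subtle failure modes in a proxy like this.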

Technical details: 11 Rust crates, 409 tests, zero clippy warnings. Policy evaluation is sub-millisecond (in-memory, lock-free hot-reload via ArcSwap). Credentials encrypted with age. Built on hyper 1.x, rustls, tokio.
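The hot-reload pattern is worth a sketch: readers take a cheap `Arc` snapshot of the current policy set and evaluate against it with nothing held, while a reloader swaps in a new version underneath. Bulwark uses the `arc_swap` crate for the lock-free swap; a std `RwLock` around the `Arc` is the closest dependency-free stand-in, and the types here are illustrative:

```rust
use std::sync::{Arc, RwLock};

#[derive(Debug)]
struct PolicySet {
    version: u32,
    allow_prefixes: Vec<String>,
}

// Evaluation runs against an immutable snapshot, so a concurrent reload
// never changes the rules out from under an in-flight request.
fn evaluate(p: &PolicySet, tool: &str) -> bool {
    p.allow_prefixes.iter().any(|pre| tool.starts_with(pre.as_str()))
}

// Reader side: clone the Arc (cheap), then drop the lock immediately.
fn snapshot(current: &RwLock<Arc<PolicySet>>) -> Arc<PolicySet> {
    current.read().unwrap().clone()
}

// Reloader side: swap in a fully-built new version atomically.
fn reload(current: &RwLock<Arc<PolicySet>>, next: PolicySet) {
    *current.write().unwrap() = Arc::new(next);
}
```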

Install: brew install bpolania/tap/bulwark

The README has a 5-minute quickstart that connects Claude Code to GitHub through Bulwark. Happy to answer questions about the architecture, threat model, or MCP integration.

8 comments

Aristarkh|14 days ago

Injecting credentials at the last mile is a solid architectural choice for agent security. That said, for long-running autonomous workflows, I worry about the blast radius of "valid" actions occurring in a runaway loop (e.g., spinning up 50 instances sequentially). How does the system handle aggregate containment—do you support circuit breakers or rate limits on top of the policy evaluation? Curious if you're also looking at dynamic risk scoring, where an agent's permissions might degrade automatically if it starts hitting high error rates or unusual patterns.

verdverm|14 days ago

If you can do this with AI so easily, why do I want to use yours instead of the one my AI generates?

bpolania|14 days ago

Fair question. Yes, you can absolutely generate a basic proxy with an LLM; the gap is in the stuff that's hard to get right and boring to maintain. Policy hot-reload without dropping in-flight requests (ArcSwap, not "restart the process"). Tamper-evident audit with blake3 hash chains, not just append-only logs. Credential injection where the agent process literally never sees the secret, not env vars. Content inspection that runs bidirectionally without buffering entire responses into memory. Correct TLS MITM for the HTTP proxy mode with dynamic per-host certs. An LLM will generate something that works for a demo. We wrote 409 tests, including property-based testing with proptest, because the failure modes in a security proxy are subtle: off-by-one errors in glob matching, race conditions in policy reload, Content-Length mismatches after redaction. It's the same reason you use nginx instead of asking your AI to write an HTTP server. The first 80% is easy. The last 20% is where credentials leak.

umairnadeem123|14 days ago

[deleted]

bpolania|14 days ago

Yes! You are right about the three primitives and that's basically Bulwark's core loop.

On idempotency: right now Bulwark observes but doesn't enforce dedupe. Every request gets a unique event ID in the audit log, and you can see retries in the session timeline, but there's no automatic "this looks like the same create_issue call from 2 seconds ago, block it."

It's on the roadmap and I think it needs to be two things: (1) a configurable dedupe window per tool pattern (you want it for create_charge but not for list_issues), and (2) content-aware hashing so it's not just "same tool + same action" but "same tool + same action + same arguments within N seconds."
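The two pieces above could be combined roughly like this. A sketch of the idea only, nothing Bulwark ships today, and every name here (`Deduper`, `check`, `set_window`) is made up:

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Dedupe key = hash(tool + arguments); a call is blocked if the same key
// was already seen inside that tool's configured window. Tools with no
// window configured are never deduped (opt-in per rule, not global).
struct Deduper {
    window_secs: HashMap<String, u64>, // per-tool-pattern dedupe window
    seen: HashMap<u64, u64>,           // content hash -> timestamp of first call
}

impl Deduper {
    fn new() -> Self {
        Deduper { window_secs: HashMap::new(), seen: HashMap::new() }
    }

    fn set_window(&mut self, tool: &str, secs: u64) {
        self.window_secs.insert(tool.to_string(), secs);
    }

    /// Returns true if the call should go through, false if it's an
    /// identical call inside the tool's dedupe window.
    fn check(&mut self, tool: &str, args: &str, now: u64) -> bool {
        let Some(&window) = self.window_secs.get(tool) else {
            return true; // no window configured: never dedupe
        };
        // content-aware hashing: same tool + same args = same key
        let mut h = DefaultHasher::new();
        tool.hash(&mut h);
        args.hash(&mut h);
        let key = h.finish();
        match self.seen.get(&key) {
            Some(&last) if now.saturating_sub(last) < window => false,
            _ => {
                self.seen.insert(key, now);
                true
            }
        }
    }
}
```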

The tricky part is that some tools are intentionally non-idempotent: posting the same Slack message twice might be deliberate. So it probably needs to be opt-in per rule rather than global. Would love to hear what patterns you've seen cause the worst double-fires.