Show HN: Bulwark – Open-source governance layer for AI agents (Rust, MCP-native)
2 points | bpolania | 14 days ago | github.com
I built Bulwark because I kept running into the same problem: to make AI agents useful, I have to hand them my GitHub token, my AWS credentials, my database access, and so on. At that point they can do anything I can do. And when something goes wrong, there's no audit trail.
Bulwark is a governance proxy that sits between agents and the tools they call. It works as an MCP gateway (for Claude Code, OpenClaw) or as an HTTP forward proxy (for Codex, curl). Every request goes through:
Session validation → Content inspection → Policy evaluation → Credential injection → Audit logging
The key ideas:
- Agents never see real secrets. They authenticate with a scoped session token. Bulwark injects the real credentials at the last mile, based on tool pattern + scope.
- Policies are YAML. Glob matching, scope-based precedence, hot-reload. Default deny. You can preview the impact of policy changes against real audit history before deploying (bulwark policy test).
- Tamper-evident audit. Every event is blake3 hash-chained in SQLite. You can reconstruct exactly what an agent did and verify nothing was modified.
- Content inspection. 13 built-in patterns scan both directions for AWS keys, PII, prompt injection. Redaction happens before content reaches the agent.
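To make the tamper-evident audit idea concrete, here is a minimal sketch of a hash-chained event log. It is not Bulwark's code: it uses SHA-256 from Python's standard library as a stand-in for blake3, keeps events in a list instead of SQLite, and the names `GENESIS`, `event_hash`, `append_event`, and `verify_chain` are made up for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the very first event


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous link together with a canonical form of the event."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    # SHA-256 is a stand-in here; the post says Bulwark uses blake3.
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> None:
    """Append an event whose hash commits to every event before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "prev": prev_hash,
        "payload": payload,
        "hash": event_hash(prev_hash, payload),
    })


def verify_chain(chain: list) -> bool:
    """Recompute each link; an edited, reordered, or mid-chain-deleted event fails."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != event_hash(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the previous one, changing any stored event invalidates every later link. A plain chain like this does not detect truncation of the newest events on its own; that needs the latest hash to be recorded somewhere the attacker cannot rewrite.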
Technical details: 11 Rust crates, 409 tests, zero clippy warnings. Policy evaluation is sub-millisecond (in-memory, lock-free hot-reload via ArcSwap). Credentials encrypted with age. Built on hyper 1.x, rustls, tokio.
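The policy model described above (glob matching, scope-based precedence, default deny) can be sketched as a small in-memory evaluator. This is an illustration under assumptions, not Bulwark's actual schema or precedence rules: the scope names, the "narrower scope wins" ordering, the rule fields, and the tool names are all hypothetical, and rules are plain Python dicts rather than YAML.

```python
from fnmatch import fnmatchcase

# Hypothetical precedence: a rule at a narrower scope overrides a broader one.
SCOPE_RANK = {"global": 0, "team": 1, "agent": 2}

# Hypothetical rules; the real policies are YAML files.
RULES = [
    {"scope": "global", "tool": "github.*", "verdict": "allow"},
    {"scope": "agent", "tool": "github.delete_*", "verdict": "deny"},
]


def evaluate(tool: str, rules: list) -> str:
    """Return the verdict of the highest-precedence matching rule, else deny."""
    matches = [rule for rule in rules if fnmatchcase(tool, rule["tool"])]
    if not matches:
        return "deny"  # default deny: no rule matched this tool
    # max() returns the first of equally ranked rules, so ties go to the earlier rule.
    winner = max(matches, key=lambda rule: SCOPE_RANK[rule["scope"]])
    return winner["verdict"]
```

With these rules, `evaluate("github.create_issue", RULES)` allows the call, `evaluate("github.delete_repo", RULES)` is denied by the narrower agent-scoped rule, and anything outside `github.*` falls through to the default deny.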
Install: brew install bpolania/tap/bulwark
The README has a 5-minute quickstart that connects Claude Code to GitHub through Bulwark. Happy to answer questions about the architecture, threat model, or MCP integration.
Aristarkh|14 days ago
verdverm|14 days ago
bpolania|14 days ago
umairnadeem123|14 days ago
[deleted]
bpolania|14 days ago
On idempotency: right now Bulwark observes but doesn't enforce dedupe. Every request gets a unique event ID in the audit log, and you can see retries in the session timeline, but there's no automatic "this looks like the same create_issue call from 2 seconds ago, block it."
It's on the roadmap, and I think it needs two pieces: (1) a configurable dedupe window per tool pattern (you want it for create_charge but not for list_issues), and (2) content-aware hashing so it's not just "same tool + same action" but "same tool + same action + same arguments within N seconds."
The tricky part is that some tools are intentionally non-idempotent: posting the same Slack message twice might be deliberate. So it probably needs to be opt-in per rule rather than global. Would love to hear what patterns you've seen cause the worst double-fires.
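A rough sketch of what that roadmap item could look like: an opt-in dedupe window per tool pattern, keyed on a hash of tool name plus arguments. Nothing here exists in Bulwark today. The `DEDUPE_WINDOWS` table, the `Deduper` class, and the window lengths are hypothetical, and the clock is passed in as `now` so the behavior is easy to reason about.

```python
import hashlib
import json
from fnmatch import fnmatchcase

# Hypothetical opt-in rules: tool glob -> dedupe window in seconds.
# Tools with no matching pattern (e.g. list_issues) are never deduped.
DEDUPE_WINDOWS = {"create_charge": 30.0, "create_issue": 5.0}


def call_fingerprint(tool: str, arguments: dict) -> str:
    """Content-aware key: same tool and same arguments give the same hash."""
    body = json.dumps({"tool": tool, "args": arguments}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


class Deduper:
    def __init__(self, windows: dict):
        self.windows = windows
        self.last_seen = {}  # fingerprint -> time of the last call let through

    def window_for(self, tool: str):
        """Return the window of the first matching pattern, or None if not opted in."""
        for pattern, seconds in self.windows.items():
            if fnmatchcase(tool, pattern):
                return seconds
        return None

    def is_duplicate(self, tool: str, arguments: dict, now: float) -> bool:
        """True if an identical call was let through less than `window` seconds ago."""
        window = self.window_for(tool)
        if window is None:
            return False  # dedupe is opt-in per rule
        key = call_fingerprint(tool, arguments)
        previous = self.last_seen.get(key)
        if previous is not None and now - previous < window:
            return True
        self.last_seen[key] = now
        return False
```

The window is measured from the last call that was let through, so a burst of retries cannot keep extending the block. A real implementation would also need to evict old fingerprints, which this sketch leaves out.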