top | item 47041642

(no title)

API-level spend caps solve the "how much" problem but not the "why." The agent still loops 50 times before hitting the limit. You just lose $50 instead of $200.

The missing layer is detection inside the agent loop itself. If the agent is calling search_kb for the 8th time with slightly different args, or about to issue a refund it already issued, you want to catch that at iteration 3, not at the dollar ceiling.

I built an open-source middleware called Aura Guard (https://github.com/auraguardhq/aura-guard) that does exactly this. It sits in the agent loop and detects repeated tool calls, argument jitter, duplicate side-effects, stall patterns, and budget overruns. When it catches a loop it can rewrite the prompt, return a cached result, or escalate instead of letting the agent spin until an external limit kills it.

Zero dependencies, framework-agnostic, works with any LLM provider. Has a shadow mode so you can see what it would catch without blocking anything.

Your approach and this are complementary. Spend caps at the proxy level, loop detection at the agent level. Both are needed if you're running agents in production.

discuss

No comments yet.