top | item 47055008

(no title)

amirdor | 12 days ago

I use Claude and Codex heavily for coding, and I kept burning through my quota halfway through the week. When I looked at my logs, most of my prompts were things like "summarize this" or "reformat this JSON." Stuff any small model handles fine.

NadirClaw is a Python proxy that classifies each prompt in ~10ms and routes simple ones to Gemini Flash, Ollama, or whatever cheap model you want. Only complex prompts hit your premium API. It's OpenAI-compatible, so you just point your existing tools at it.

In practice I went from burning through my Claude quota in 2 days to having it last the full week. Costs dropped around 60%.

pip install nadirclaw

Still early. The classifier is simple (token count + pattern matching + optional embeddings). Curious what breaks first.

discuss

order

No comments yet.