top | item 47073778

Show HN: LLM-use – cost-effective LLM orchestrator for agents

2 points| justvugg | 11 days ago

Hi HN, Built llm-use: a lightweight Python toolkit for efficient agent workflows with multiple LLMs. Core pattern: strong model (Claude/GPT-4o/big local) for planning + synthesis; cheap/local workers for parallel subtasks (research, scrape, summarize, extract…). Features: • Mix Anthropic, OpenAI, Ollama, llama.cpp • Smart router: cheap/local first, escalate only if needed (learned + heuristic) • Parallel workers (–max-workers) • Real scraping + cache (BS4 or Playwright) • Offline-first (full Ollama support) • Cost tracking ($ for cloud, 0 local) • TUI chat + MCP server mode • Local session logs Quick example (hybrid):

python3 cli.py exec \ --orchestrator anthropic:claude-3-7-sonnet-20250219 \ --worker ollama:llama3.1:8b \ --enable-scrape \ --task "Summarize 6 recent sources on post-quantum crypto"

Or routed version:

python3 cli.py exec \ --router ollama:llama3.1:8b \ --orchestrator openai:o1 \ --worker gpt-4o-mini \ --task "Explain recent macOS security updates"

MIT licensed, minimal deps, embeddable. Repo: https://github.com/llm-use/llm-use Feedback welcome on: • Routing heuristics you’d find useful • Pain points with agent costs / local vs cloud • Missing integrations? Thanks!

1 comment

amirdor|3 days ago

Cool project. If anyone wants something lighter that works as a drop-in OpenAI proxy (no code changes needed), I built NadirClaw for exactly this. It classifies prompts in ~10ms and routes to cheap/local models automatically. Works with Claude Code, Cursor, aider out of the box. https://github.com/doramirdor/NadirClaw (author)