KaseyZhang | 9 months ago
Right now, the only models that consistently work well for MCP are large and closed-source (e.g. Claude 3.7 Sonnet, Gemini 2.5 Pro). Other models struggle with tool-calling consistency, and in particular there's a lack of options that can run locally.
We used Dr. GRPO to train Qwen3-4B for this purpose (with VeRL + SGLang for multi-turn tool-calling training), and we reached performance parity with Gemini 2.5 Pro on relevant benchmarks like GSM8K (https://i.imgur.com/4RXq2Pm.png).
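For context on the training objective: the core of (Dr.) GRPO is a group-relative baseline over several sampled completions per prompt, and Dr. GRPO drops GRPO's per-group standard-deviation normalization. A rough sketch of just the advantage computation (not our actual training code):

```python
def grpo_advantages(group_rewards: list[float]) -> list[float]:
    """Group-relative advantages: each sampled completion's reward minus
    the mean reward of its group. Dr. GRPO removes GRPO's division by the
    group std, so the mean-centering is the whole baseline. Sketch only."""
    mean = sum(group_rewards) / len(group_rewards)
    return [r - mean for r in group_rewards]

# e.g. 4 rollouts for one prompt, reward 1.0 = correct tool call:
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # → [0.5, -0.5, 0.5, -0.5]
```

These per-token-sequence advantages then weight the standard policy-gradient loss; VeRL handles the multi-turn rollout and update loop around this.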
And since the model is open source and lightweight, you can also do further fine-tuning/training that's fully local and customized to your specific needs.
Let us know what you think, or if there’s anything we can answer!
* Any client that supports Qwen3 models (i.e. it works with OpenRouter, local deployments with Ollama, etc.)
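For anyone wanting to try it locally: a minimal sketch of the OpenAI-style request body you'd POST to an Ollama deployment's OpenAI-compatible endpoint (`/v1/chat/completions`) to get a tool call out of the model. The model tag `qwen3:4b` and the `get_weather` tool are illustrative assumptions, not something specific to our setup:

```python
import json

# Default OpenAI-compatible endpoint for a local Ollama server (assumption:
# Ollama is running on its default port with a Qwen3-4B model pulled).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_tool_call_request(prompt: str) -> dict:
    """Build an OpenAI-style chat request exposing one tool to the model."""
    return {
        "model": "qwen3:4b",  # hypothetical tag; use whatever tag you pulled
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool for the sketch
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

# Inspect the payload; POST it to OLLAMA_URL with any HTTP client.
print(json.dumps(build_tool_call_request("Weather in Paris?"), indent=2))
```

The same payload works against OpenRouter by swapping the URL and model name, since both speak the OpenAI chat-completions format.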