KaseyZhang | 9 months ago
Right now, the only models that consistently work well for MCP are large and closed-source (e.g. Claude 3.7 Sonnet, Gemini 2.5 Pro). Other models struggle with tool-calling consistency, and in particular there's a lack of options that can run locally.
We used Dr. GRPO to train Qwen3-4B for this purpose (with VeRL + SGLang for multi-turn tool-calling training), and we reached performance parity with Gemini 2.5 Pro on relevant benchmarks like GSM8K (https://i.imgur.com/4RXq2Pm.png).
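For context on the training objective: the core of (Dr.) GRPO is a group-relative baseline over several sampled completions per prompt, and Dr. GRPO drops GRPO's per-group standard-deviation normalization. A rough sketch of just the advantage computation (not our actual training code):

```python
def grpo_advantages(group_rewards: list[float]) -> list[float]:
    """Group-relative advantages: each sampled completion's reward minus
    the mean reward of its group. Dr. GRPO removes GRPO's division by the
    group std, so the mean-centering is the whole baseline. Sketch only."""
    mean = sum(group_rewards) / len(group_rewards)
    return [r - mean for r in group_rewards]

# e.g. 4 rollouts for one prompt, reward 1.0 = correct tool call:
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # → [0.5, -0.5, 0.5, -0.5]
```

These per-token-sequence advantages then weight the standard policy-gradient loss; VeRL handles the multi-turn rollout and update loop around this.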
And since the model is open source and lightweight, you can also do further fine-tuning/training that's fully local and customized to your specific needs.
Let us know what you think, or if there’s anything we can answer!
* Any client that supports Qwen3 models (i.e. it works with OpenRouter, local deployments with Ollama, etc.)
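For anyone wanting to try it locally: a minimal sketch of the OpenAI-style request body you'd POST to an Ollama deployment's OpenAI-compatible endpoint (`/v1/chat/completions`) to get a tool call out of the model. The model tag `qwen3:4b` and the `get_weather` tool are illustrative assumptions, not something specific to our setup:

```python
import json

# Default OpenAI-compatible endpoint for a local Ollama server (assumption:
# Ollama is running on its default port with a Qwen3-4B model pulled).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_tool_call_request(prompt: str) -> dict:
    """Build an OpenAI-style chat request exposing one tool to the model."""
    return {
        "model": "qwen3:4b",  # hypothetical tag; use whatever tag you pulled
        "messages": [{"role": "user", "content": prompt}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # illustrative tool for the sketch
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    }

# Inspect the payload; POST it to OLLAMA_URL with any HTTP client.
print(json.dumps(build_tool_call_request("Weather in Paris?"), indent=2))
```

The same payload works against OpenRouter by swapping the URL and model name, since both speak the OpenAI chat-completions format.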