Interesting, do you have any hunch as to why this is? In more verticalized apps where the underlying model is hidden from the user (sales call agent, autopilot tool, support agent, etc.), we've seen that the need for high quality on hard prompts and high speed on the remaining prompts makes routing an appealing option.
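To make the idea concrete, here's a minimal sketch of that kind of routing: hard prompts go to a strong model, everything else to a fast/cheap one. The model names and the difficulty heuristic are illustrative placeholders (a real router would use a trained classifier, not keywords):

```python
# Minimal prompt-routing sketch. Model names and the difficulty
# heuristic are hypothetical, for illustration only.

STRONG_MODEL = "gpt-4o"       # assumed strong/slow choice
FAST_MODEL = "gpt-3.5-turbo"  # assumed fast/cheap choice

HARD_KEYWORDS = ("prove", "debug", "multi-step", "contract")

def is_hard(prompt: str) -> bool:
    """Toy difficulty heuristic: long prompts or hard-looking keywords.
    A production router would learn this from labeled outcomes."""
    lowered = prompt.lower()
    return len(prompt) > 500 or any(k in lowered for k in HARD_KEYWORDS)

def route(prompt: str) -> str:
    """Return the model name that should handle this prompt."""
    return STRONG_MODEL if is_hard(prompt) else FAST_MODEL

print(route("What are your opening hours?"))           # -> fast model
print(route("Please debug this multi-step failure."))  # -> strong model
```

The interesting part in practice is the classifier, not the dispatch: the router only pays off when its quality predictions beat always picking one model.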
weird-eye-issue|1 year ago
For something like a support agent why couldn't the company just choose a model like GPT-4o and stick with one? Would they really trust some responses going to 3.5 (or similar)?
danlenton|1 year ago
Still, I agree that routing in isolation is not thaaat useful in many LLM domains. I think the usefulness will increase when it's applied to multi-step agentic systems, and when combined with other optimizations such as end-to-end learning of the intermediate prompts (DSPy, etc.)
Thanks again for diving deep, super helpful!