danlenton|1 year ago
On our roadmap, we plan to support:
- an API which returns the neural scores directly, enabling model selection and model-specific prompts to all be handled on the client side
- automatic learning of intermediate prompts for agentic multi-step systems, taking a similar view as DSPy, where all intermediate LLM calls and prompts are treated as latent variables in an optimizable end-to-end agentic system.
With these additions, the subtleties of the model + prompt relationship would be better respected.
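To make the first roadmap item concrete, here is a minimal client-side sketch of what selection over returned scores could look like. This is purely illustrative: the score values, cost table, model names, and utility function are all assumptions, not the actual API.

```python
# Hypothetical client-side model selection over per-model "neural scores"
# returned by a routing API. Everything here (model names, score/cost
# values, the utility trade-off) is an illustrative assumption.

# Model-specific prompts stay on the client, per the roadmap item.
PROMPTS = {
    "cheap-model": "You are a concise assistant. Answer briefly.",
    "premium-model": "You are a thorough assistant. Reason step by step.",
}

def pick_model(scores: dict[str, float], cost: dict[str, float],
               quality_weight: float = 0.7) -> str:
    """Trade predicted quality against cost entirely on the client."""
    def utility(model: str) -> float:
        return quality_weight * scores[model] - (1 - quality_weight) * cost[model]
    return max(scores, key=utility)

# Scores as a router API might return them (assumed shape and values).
scores = {"cheap-model": 0.62, "premium-model": 0.91}
cost = {"cheap-model": 0.1, "premium-model": 1.0}

model = pick_model(scores, cost)
prompt = PROMPTS[model]  # the client, not the router, owns the prompt
```

The point of the design is that the router only supplies scores; the client decides how to weigh quality against cost (here via `quality_weight`) and keeps full control of which prompt goes with which model.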
I also believe that LLMs will become more robust to prompt subtleties over time, and some tasks are simply less sensitive to the minor subtleties you refer to.
For example, with a sales call agent you might want to optimize UX by answering easy dialogue prompts quickly (so the person on the other end isn't left waiting), while spending longer on harder prompts that require the full context of the call.
This is just an example, but my point is that not all LLM applications are the same. Some might be super sensitive to prompt subtleties, others might not be.
Thoughts?
weird-eye-issue|1 year ago
It's already hard enough to get consistent behavior with a fixed model
If we need to save money we will switch to a cheaper model and adapt our prompts for that
If we are going more for quality, we'll use a more expensive model and adapt our prompts for that
I fail to see any use case where I would want a third party choosing which model we are using at run time...
We are adding a new model this week and I've spent dozens of hours personally evaluating output and making tweaks to make it feasible
Making it sound like models are interchangeable is harmful
danlenton|1 year ago