iknownothow | 8 months ago
I wouldn't be surprised if someone's building a dataset of tool-use examples.
The newer-generation reasoning models are especially good at knowing when to do a web search. I imagine they'll slowly get better at using other tools.
At current levels of performance, giving LLMs the ability to fetch well-curated information on their own would increase their scores by a lot.