top | item 47075730

(no title)

chermi | 10 days ago

You can really notice the tool use problems. They gotta get on that. The agent trend seems real, and powerful. They can't afford to fall behind on it.

discuss

order

verdverm|10 days ago

I don't really have tool usage issues that I don't put under that doesn't follow system prompt instructions consistently

there are these times where it puts a prefix on all function calls, which is weird and I think hallucination, so maybe that one

3.1 hopefully fixes that

HardCodedBias|10 days ago

"They can't afford to fall behind on it."

They are very, very seriously far behind as of 3.0.

We'll see if 3.1 addresses the issue at all.