What all these tools need to adopt is sending out 10-20 requests and picking the "best" response. I think it's a mistake to try to get the tool to work right the first time. Auto-GPT has JSON parse errors 20-50% of the time. Instead, with enough parallel responses we can increase the likelihood that one of them is "really good". The next challenge is figuring out which answer is the really good one and continuing with that.
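To make the idea concrete, here is a rough best-of-n sketch in Python. It is not how Auto-GPT works today. `generate` (one model call returning raw text) and `score` (a ranking function, higher is better) are hypothetical placeholders you would supply yourself, and picking a good `score` is exactly the open problem mentioned above.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Optional


def best_of_n(
    generate: Callable[[str], str],  # hypothetical: one model call, returns raw text
    score: Callable[[dict], float],  # hypothetical: ranks a parsed reply, higher is better
    prompt: str,
    n: int = 10,
) -> Optional[dict]:
    """Send n requests in parallel, drop replies that are not JSON objects,
    and return the highest-scoring survivor (or None if none parsed)."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        raw_replies = list(pool.map(generate, [prompt] * n))

    candidates = []
    for raw in raw_replies:
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # a malformed reply costs one sample, not the whole step
        if isinstance(parsed, dict):
            candidates.append(parsed)

    return max(candidates, key=score) if candidates else None
```

If parse failures were independent (an assumption, they probably aren't fully), a 50% failure rate per request would leave about a 1 in 1024 chance that all 10 replies fail to parse, so the parse-error problem mostly disappears and the ranking step becomes the hard part.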
typon|2 years ago
mpaepper|2 years ago
Some kind of ensemble agent that is more robust. I might play with that idea.
sharemywin|2 years ago