Shameless plug, for anyone who's interested in "self-improvement" agent check out StreamBench[1] where we benchmark and try out what's essential for improvements in online settings. Basically we find feedback signal is vital and the stronger the signal the more improvement you can get if you were able to feed it back to the agent in terms of weights (LoRA) or in-context examples.[1] https://arxiv.org/abs/2406.08747
No comments yet.