(no title)
slimshetty | 1 year ago
We present R2E, a scalable framework that turns any GitHub repository into an environment for programming agents. These environments can be used to benchmark programming agents that interact with interpreters on repository-level problems. R2E can be used to evaluate code generation, optimization, and refactoring on both public and _private_ repos. It also enables the collection of large-scale execution traces that can be used to improve LLMs themselves.