What limitations are there in terms of the tasks this can handle? How does it compare with the other products out there? There are plenty of options...
Depends on your set of tasks, but we use Engine for the bottom ~50% of issues by complexity. We have a pretty good SWE-bench score from a while back, but it's got much better since!
We have also focused on workflow integrations, so you can assign issues from Linear, Jira, Trello, etc., which makes it more useful for teams.
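The integration pattern described above usually boils down to a webhook per tracker that gets normalized into one internal task format. A minimal sketch of that idea, assuming a Linear-style payload shape (the field names and `Task` type here are hypothetical, not Engine's actual API):

```python
# Hypothetical sketch: normalize an issue-tracker webhook payload into a
# single internal task format. Field names are assumptions, not Engine's API.
from dataclasses import dataclass


@dataclass
class Task:
    source: str       # e.g. "linear", "jira", "trello"
    issue_id: str
    title: str
    description: str


def task_from_linear(payload: dict) -> Task:
    """Convert an assumed Linear-style webhook payload into a Task."""
    data = payload["data"]
    return Task(
        source="linear",
        issue_id=data["identifier"],
        title=data["title"],
        description=data.get("description", ""),
    )
```

One normalizer per tracker keeps the agent itself tracker-agnostic: it only ever sees `Task` objects.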
This is cool. I can see the anti-monopoly-of-OpenAI argument, but apart from that, is there a strong argument for being multi-LLM in a Codex-like agent?
We often find that some models perform better on certain types of repo. For example, Claude 3.5/3.7 is typically much better at frontends. That's why we let you switch up the model for each repo.
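Per-repo model selection like this can be expressed as a simple lookup with a fallback. A sketch of the idea, with hypothetical repo and model names (this is not Engine's actual configuration format):

```python
# Hypothetical sketch of per-repo model routing; names are illustrative only.
REPO_MODELS = {
    "acme/frontend": "claude-3-7-sonnet",  # frontend-heavy repo
    "acme/data-pipeline": "gpt-4o",        # backend/data work
}

DEFAULT_MODEL = "gpt-4o"


def model_for_repo(repo: str) -> str:
    """Return the model configured for a repo, falling back to a default."""
    return REPO_MODELS.get(repo, DEFAULT_MODEL)
```

The fallback means adding a new repo requires no configuration until you decide a different model performs better on it.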
We last submitted a SWE-bench Verified result in November 2024 - at the time I believe we were in the top 5 entrants.
We expect Engine to be as good as the other code-writing agents out there at the moment - we understand almost everyone in the space to be using very similar base models and agent scaffolding.