biobootloader's comments

biobootloader | 2 years ago | on: Benchmarking GPT-4 Turbo – A Cautionary Tale

Hey Paul, I'm a Mentat author.

> I also notice that the instructions prompt that mentat uses seems to be inspired by the aider benchmark? Glad to see others adopting similar benchmarking approaches.

We were inspired by you to use Exercism as a benchmark, thank you! We will add attribution for that. We switched our original instruction prompts for that benchmark to be similar to Aider's to allow for a fair comparison.

> After looking around a bit, there seems to be a bunch of aider code in your repo. Some attribution would be appreciated.

We have an unused implementation of your output response format (https://github.com/AbanteAI/mentat/blob/main/mentat/parsers/...), but I don't know what else you are seeing? We implemented that to compare with our response formats and didn't find much difference in performance.
