redfloatplane | 24 days ago
> Section 3.3:
> Besides, since we use the moderately capable DeepSeek-Coder-V2-Lite to filter simple problems, the Pass@1 scores of top models on popular languages are relatively low. However, these models perform significantly better on low-resource languages. This indicates that the performance gap between models of different sizes is more pronounced on low-resource languages, likely because DeepSeek-Coder-V2-Lite struggles to filter out simple problems in these scenarios due to its limited capability in handling low-resource languages.
At the same time, I have used Claude Code on an Elixir codebase and it has done a great job. But I have no way of knowing whether it would have done a worse job if I had picked any other stack.