top | item 44791508


brynary | 7 months ago

This rings similar to a recent post that was on the front page about red team vs. blue team.

Before running LLM-generated code through yet more LLMs, you can run it through traditional static analysis (linters, SAST, auto-formatters). They aren’t flashy but they produce the same results 100% of the time.

Consistency is critical if you want to pass/fail a build on the results. Nobody wants a flaky code reviewer robot, just like flaky tests are the worst.

I imagine code review will evolve into a three-tier pyramid:

1. Static analysis (instant, consistent) — e.g. using Qlty CLI (https://github.com/qltysh/qlty) as a Claude Code or Git hook

2. LLMs (slower, nondeterministic) — have the advantage of catching semantic issues that static analysis misses

3. Human

We make sure commits pass each level in succession before moving on to the next.
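The "pass each level in succession" gate can be sketched as a fail-fast pipeline. The tier names and the `qlty check` mention below are illustrative assumptions, not the commenter's actual setup:

```python
# Sketch of a fail-fast review pipeline: each tier runs only if the
# previous one passed, so cheap deterministic checks gate the
# expensive, flakier ones. Tier contents are placeholders; a real
# tier 1 might shell out to `qlty check` from a Git hook.
from typing import Callable, List, Optional, Tuple

# A check returns (passed, detail).
Check = Callable[[], Tuple[bool, str]]

def run_tiers(tiers: List[Tuple[str, Check]]) -> Tuple[Optional[str], str]:
    """Run tiers in order; return (failing_tier, detail), or (None, msg) if all pass."""
    for name, check in tiers:
        passed, detail = check()
        if not passed:
            return name, detail  # stop here; later tiers never run
    return None, "all tiers passed"
```

For example, stubbing the tiers with lambdas: `run_tiers([("static analysis", lambda: (True, "clean")), ("llm review", lambda: (False, "possible race"))])` returns `("llm review", "possible race")`, and the human tier is never invoked.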



dakshgupta | 7 months ago

Reading that post sent me down the path to this one. This stack order makes total sense, although in practice it's possible tiers 1 and 2 merge into a single product with two distinct steps.

Tier 3 is interesting too. My suspicion is that as the models get better, ~70% of PRs will be too minor to need human review, but the top 30% will still need it, because there will be differing opinions on what is and isn't the right way to make a complex change.