We weren’t able to agree on a good way to measure this. Curious: what’s your opinion on code churn as a metric? If code simply persists for some number of months, is that an indication it’s good-quality code?
I've seen code persist a long time because it is unmaintainable gloop that takes forever to understand and nobody is brave enough to rebuild it.
So no, I don't think persistence-through-time is a good metric. It's probably better to look at cyclomatic complexity, and maybe, for a given code path, module, or class hierarchy, how many calls it makes within itself vs. to things outside the hierarchy. That gives some measure of how many files you need to jump between to understand it.
I second the point about persistence. Some of the most persistent code we own persists because it’s untested and poorly written, but managed to become critical infrastructure early on. Most new tests are best-effort black-box tests and guesswork, since the creators left a long time ago.
Of course, feeding the code to an LLM makes it really go to town. And break every test in the process. Then you start babying it to do smaller and smaller changes, but at that point it’s faster to just do it manually.
arcwhite|2 months ago
refactor_master|2 months ago
nerevarthelame|2 months ago
dakshgupta|2 months ago
- Change in number of revisions made between open and merge before vs. after greptile
- Percentage of greptile's PR comments that cause the developer to change the flagged lines
Assuming the author will only change their PR for the better, this tells us if we're impacting quality.
We haven't yet found a way to measure absolute quality beyond that.
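To make the two metrics above concrete, here is a rough sketch of how they could be computed. Everything in it is hypothetical: the `ReviewComment` record, its fields, and the function names are assumptions for illustration, not how greptile actually stores or computes this. It also assumes line numbers have already been mapped consistently across commits, which is the hard part in practice.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ReviewComment:
    # Hypothetical record: lines the comment flagged, and lines the
    # author modified in commits pushed after the comment was posted.
    flagged_lines: set[int]
    lines_changed_after: set[int]


def comment_action_rate(comments: list[ReviewComment]) -> float:
    """Percentage of comments where at least one flagged line later changed."""
    if not comments:
        return 0.0
    acted_on = sum(
        1 for c in comments if c.flagged_lines & c.lines_changed_after
    )
    return 100.0 * acted_on / len(comments)


def revision_delta(before: list[int], after: list[int]) -> float:
    """Change in mean revisions per PR between open and merge.

    `before` and `after` hold per-PR revision counts from before and
    after the tool was adopted; a positive result means more revisions.
    """
    return mean(after) - mean(before)
```

Note that this counts a comment as "acted on" if any flagged line changed, which cannot tell a fix prompted by the comment from an unrelated edit to the same line.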
wordpad|2 months ago