I’d be curious to hear more about Attempt 2. It sounds like the approach was basically to ask an LLM for a score for each comment. Adding specifics to that prompt might go a long way: for example, what is the rationale for this change, is it likely to be a functional bug, is it a security issue, how does it impact maintainability over the long run, etc. Basically, I wonder whether asking about more specific criteria, and actually defining what you mean by "nits," could help the LLM give you more reliable scores.
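To make that concrete, here's a minimal sketch of what a rubric-style prompt might look like, versus asking for a single opaque score. Everything here is hypothetical: the criterion names, the 0-10 scale, and the JSON response shape are all illustrative assumptions, not anyone's actual implementation.

```python
import json

# Illustrative criteria -- spell out what each score should measure,
# including an explicit definition of "nit", so the model has
# something concrete to anchor on.
CRITERIA = {
    "functional_bug": "Does the comment point to a likely functional bug?",
    "security": "Does it flag a plausible security issue?",
    "maintainability": "Does it affect maintainability over the long run?",
    "is_nit": ("Is this a nit? A nit is a purely stylistic or cosmetic "
               "suggestion with no behavioral impact."),
}

def build_scoring_prompt(comment: str) -> str:
    """Build a rubric-style prompt asking for per-criterion JSON scores."""
    rubric = "\n".join(f"- {name}: {question} (score 0-10)"
                       for name, question in CRITERIA.items())
    return (
        "Rate the following code-review comment against each criterion.\n"
        f"{rubric}\n"
        "Also state the rationale for the suggested change in one sentence.\n"
        'Respond with JSON: {"scores": {<criterion>: <0-10>}, '
        '"rationale": "..."}\n\n'
        f"Comment:\n{comment}"
    )

def parse_scores(llm_response: str) -> dict:
    """Parse the model's JSON reply into {criterion: int score}."""
    data = json.loads(llm_response)
    return {name: int(score) for name, score in data["scores"].items()}
```

The idea is that per-criterion scores are easier to sanity-check and threshold than one combined number, e.g. suppress comments where `is_nit` is high and everything else is low.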
dakshgupta|1 year ago