The task sounds similar to descriptions in the API space. People figured LLMs would be awesome at annotating API specs with descriptions that are so often missing.
Truth is, everyone is realising it’s a bit the opposite: the LLMs are “holding it wrong”, making a best guess at what the interfaces do without doing even slightly deeper analysis. So instead, you want humans writing good descriptions specifically so the LLM can make good choices as to how to piece things together.
It’s possible you could set it off on the labelling task, but anecdotally, in my experience, it will fail when you need to look a couple of levels deep into the code to see how functions play with each other. And again, imo, the big risk is getting a label that _looks_ right but is actually misleadingly wrong.
Cthulhu_|4 months ago
If it's easy to generate documentation, and/or if documentation is autogenerated, people are also less likely to actually read it. Worse, if that comment is then fed to another LLM to generate code, the result could be even more wrong.
I think that at this stage, all of the programming best practices will find a new justification in LLMs: a well-documented API will get better results when an LLM takes a swing at it than a poorly documented one. Same with code and programming languages: use straightforward, non-magic code for better results. This was always true, of course, but for some reason people have pushed it into the background or think of it as a box-ticking exercise.
theptip|4 months ago
It’s a lot easier to have someone who knows the code well review a paragraph of text than to ask them to write that paragraph.
Good comments make the code much easier for LLMs to use as well, especially in cases where LLM-generated docs would subtly misunderstand the purpose.
HPsquared|4 months ago