averynicepen | 2 months ago
The more examples an LLM's dataset contains of different types of problems being solved in similar ways, the better it gets at solving problems. Generally speaking, a solution that works well gets used a lot, so "good solutions" become well represented in the dataset.
Human expression, however, is diverse by definition. The expression of the human experience is the expression of a data point on a statistical field with standard deviations the size of chasms. An expression of the mean (which is what an LLM does) goes against why we care about human expression in the first place. "Interesting" is a value closely paired with "different".
We value diversity of thought in expression, but we value efficiency of problem solving for code.
There is definitely an argument to be made that LLM usage fundamentally restrains an individual from solving unsolved problems. It also doesn't consider the question of "where do we get more data from?"
>the code you actually want to ship is so far from what LLMs write
I think this is a fairly common view, and my understanding is that the reason for this issue is the limited context window.
twodave|2 months ago
mac-attack|2 months ago