redhale | 1 month ago
Starting back in 2022/2023:
- (~2022) It can auto-complete one line, but it can't write a full function.
- (~2023) Ok, it can write a full function, but it can't write a full feature.
- (~2024) Ok, it can write a full feature, but it can't write a simple application.
- (~2025) Ok, it can write a simple application, but it can't create a full application that is actually a valuable product.
- (~2025+) Ok, it can write a full application that is actually a valuable product, but it can't create a long-lived complex codebase for a product that is extensible and scalable over the long term.
It's pretty clear to me where this is going. The only question is how long it takes to get there.
arkensaw|1 month ago
I don't think it's a guarantee. All of the things it can do from that list are greenfield; they just have increasing complexity. The problem comes because, even in agentic mode, these models do not (and I would argue, cannot) understand code or how it works; they just see patterns and generate a plausible-sounding explanation or solution. Agentic mode means they can try/fail/try/fail/try/fail until something works, but without understanding the code, especially in a large, complex, long-lived codebase, they can unwittingly break something without realising it, just like an intern or newbie on the project, which is the most common analogy for LLMs, with good reason.
namrog84|1 month ago
What if we get to the point where all software is basically created 'on the fly' as greenfield projects, as needed? And you never need a large, complex, long-lived codebase?
It is probably incredibly wasteful, but ignoring that, could it work?
bayindirh|1 month ago
Case in point: self-driving cars.
Also, consider that we need to pirate the whole internet to be able to do this, so these models are not creative. They are just directed blenders.
literalAardvark|1 month ago
This is clear from the fact that you can distill the logic ability from a 700b parameter model into a 14b model and maintain almost all of it.
You just lose knowledge, which can be provided externally, and which is the actual "pirated" part.
The logic is _learned_
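The kind of distillation the comment alludes to can be sketched with the standard soft-target loss: the small student model is trained to match the large teacher's temperature-softened output distribution. A minimal numerical sketch (the temperature, toy logits, and "teacher"/"student" labels are illustrative assumptions, not from this thread):

```python
import numpy as np

def softmax(logits, T=1.0):
    """Temperature-softened softmax over a logit vector."""
    z = np.exp((logits - logits.max()) / T)
    return z / z.sum()

def distill_loss(teacher_logits, student_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    the core term minimized during knowledge distillation."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * np.log(p / q)))

teacher = np.array([4.0, 1.0, 0.5])   # toy logits from a large "teacher"
student = np.array([3.5, 1.2, 0.4])   # toy logits from a small "student"
print(f"distillation loss: {distill_loss(teacher, student):.4f}")
```

The loss is zero only when the student reproduces the teacher's distribution exactly, which is why the reasoning behavior transfers while long-tail factual knowledge (capacity-bound) tends to be what gets lost.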
rat9988|1 month ago
You'd need to prove that this assertion applies here. I understand that you can't deduce the future gains rate from the past, but you also can't state this as universal truth.
PunchyHamster|1 month ago
We've been seeing the same progression with self-driving cars, and they've been stuck on the last 10% for the last five years.
redhale|1 month ago
As long as it can do the thing on a faster overall timeline and with less human attention than a human doing it fully manually, it's going to win. And it will only continue to get better.
And I don't know why people always reach for self-driving cars as a negative analogy. We already have self-driving cars. Try a Waymo if you're in a city that has them. Yes, there are still long-tail problems being solved there, and limitations, but they basically work and they're amazing. I feel similarly about agentic development, plus in most cases the failure modes of SWE agents don't involve sudden life and death, so they can be more readily worked around.
theshrike79|1 month ago
Does it matter that 49 of them "failed"? It cost me fractions of a cent, so not really.
If every one of the 50 variants was drawn by a human and iterated over days, there would've been a major cost attached to every image and I most likely wouldn't have asked for 50 variations anyway.
It's the same with code. The agent can iterate over dozens of possible solutions in minutes or a few hours. Codex Web even has a 4x mode that gives you 4 alternate solutions to the same issue. Complete waste of time and money with humans, but with LLMs you can just do it.
sanderjd|1 month ago
And this isn't a pessimistic take! I love this period of time where the models themselves are unbelievably useful, and people are also focusing on the user experience of using those amazing models to do useful things. It's an exciting time!
But I'm still pretty skeptical of "these things are about to not require human operators in the loop at all!".
Scea91|1 month ago
The trend is definitely here, but even today it heavily depends on the feature.
While extremely useful, it still requires intense iteration and human insight for > 90% of our backlog. We develop a cybersecurity product.
EthanHeilman|1 month ago
> The only question is how long it takes to get there.
This is the question, and I would temper expectations with the fact that we are likely to hit diminishing returns from real gains in intelligence as task difficulty increases. Real-world tasks probably fit into a complexity hierarchy similar to computational complexity. One reason the AI predictions made in the 1950s for the 1960s did not come to pass is that we assumed problem difficulty scaled linearly: double the computing speed, get twice as good at chess, or twice as good at planning an economy. Separations like P vs. NP felled those predictions, and it is likely that current predictions will run into similar separations.
It is probably the case that if you made a human 10x as smart, they would only be 1.25x more productive at software engineering. The reason we have 10x engineers is less about raw intelligence (they are not 10x more intelligent) and more about having more knowledge and wisdom.
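The chess example above can be made concrete: in a game tree with branching factor b, searching d plies costs roughly b**d nodes, so doubling compute buys only log_b(2) extra plies, not double the playing depth. A small sketch (the branching factor of 35 is a commonly cited rough figure for chess, assumed here for illustration):

```python
import math

def extra_depth(branching_factor: float, speedup: float) -> float:
    """Additional plies of search depth gained from a compute speedup,
    assuming node count grows as branching_factor ** depth."""
    return math.log(speedup, branching_factor)

b = 35  # rough average branching factor for chess (assumption)
print(f"2x compute  -> +{extra_depth(b, 2):.2f} plies")
print(f"10x compute -> +{extra_depth(b, 10):.2f} plies")
```

Even a 10x speedup adds well under one ply of lookahead, which is the sense in which "twice the computing speed" never meant "twice as good at chess."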
kubb|1 month ago
By your logic, does it mean that engineers will never get replaced?
fernandezpablo|1 month ago
- (~2022) "It's so over for developers". 2022 ends with more professional developers than 2021.
- (~2023) "Ok, now it's really over for developers". 2023 ends with more professional developers than 2022.
- (~2024) "Ok, now it's really, really over for developers". 2024 ends with more professional developers than 2023.
- (~2025) "Ok, now it's really, really, absolutely over for developers". 2025 ends with more professional developers than 2024.
- (~2025+) etc.
Sources: https://www.jetbrains.com/lp/devecosystem-data-playground/#g...
HarHarVeryFunny|1 month ago
I suspect that the timeline from autocomplete-one-line to autocomplete-one-app, which was basically a matter of scaling and RL, may in retrospect turn out to have been a lot faster than the next step, from LLM to AGI, where it becomes capable of using human-level judgement and reasoning to be a developer, not just a coding tool.
mjr00|1 month ago
They're definitely better now, but it's not like ChatGPT 3.5 couldn't write a full simple todo list app in 2023. There were a billion blog posts talking about that and how it meant the death of the software industry.
Plus I'd actually argue more of the improvements have come from tooling around the models rather than what's in the models themselves.
[0] eg https://www.youtube.com/watch?v=GizsSo-EevA