I think we're going to see a negative impact on the software industry thanks to the LLM hype. There is a metric of LLM output that is hard to measure, and that is something like the quality of the solution: how well the problem is abstracted, and how well the solution is decomposed in such a way that it becomes easily scalable, resilient, etc.
The article shows how this is happening. The examples given are translating code from one programming language to another, explaining a codebase, and generating small solutions to common problems (interview questions). At the end the author jumps to the conclusion that literally anything will be possible via prompting an LLM. This does not necessarily follow, and we could be hitting a wall, if we haven't already.
What LLMs lack is creativity and novelty-seeking functions. Without these you cannot have an intelligent system. LLMs are effectively smart (like the 'smart' in smartphone) knowledge bases. They have a lot of encoded knowledge, and you can retrieve that knowledge with natural language. Very useful, with many great use cases like learning or even some prototyping (emphasis on some) capabilities.
If LLMs could actually write code as well as a human, even prompting would not be necessary. You could just give one an app and tell it to improve it, fix bugs, and add new features based on usage metrics. I'm sure the industry has tried this, and if it had been successful, we would have already replaced ALL programmers, not just senior programmers at large companies that already have hundreds or thousands of other engineers.
Yes. These are all the same points I used to believe until recently... in fact, the article I wrote two months earlier was all about LLMs not being able to think like us. I still haven't squared how I can believe both things at the same time. The point of my article was to try to explain why I think otherwise now. Responding to your thoughts in sequence:
- These systems can re-abstract and decompose things just fine. If you want to make it resilient or scalable it will follow whatever patterns you want to give it. These patterns are well known and are definitely in the training data for these models.
- I didn't jump to the conclusion that doing small things will make anything possible. I listed a series of discoveries/innovations/patterns/whatever that we've worked on over the past two years to increase the scale of the programs that can be generated/worked on with these systems. The point is I'm now seeing them work on systems at the level of what I would generally write at a startup, open source project, or enterprise software shop. I'm sure we'll get some metrics soon on how functional these are for something like Windows, which, I believe, is literally the world's largest single code base.
- "creativity" and novel-seeking functions can be added to the system. I gave a recent example in my post about how I asked it to write three different approaches to integrate two code bases. In the old world this would look like handing a project off to three different developers and seeing what they came up with. You can just brush this all of with "their just knowledge bases" but then you have to explain how a knowledge base can write software that would take a human engineer a month on command. We have developed the principle "hard to do, easy to review" that helps with this, too. Give the LLM-system a task that would be tedious for a human and then make the results easy for a human to review. This allows forward progress to be made on a task at a much-accelerated pace. Finally, my post was about programming... how much creativity do you generally see in most programming teams where they take a set of requirements from the PM and the engineering manager and turn that into a code on a framework that's been handed to them. Or take the analogy back in time... how much creativity is still exhibited in assembly compilers? Once creativity has been injected into the system, it's there. Most of the work is just in implementing the decisions.
- You hit the point that I was trying to make... and what sets something like Amplifier apart from something like Claude Code. You have to do MUCH less prompting. You can just give it an app and tell it to improve it, fix bugs, and add new features based on usage metrics. We've been doing these things for months. Your assertion that "we would have already replaced ALL programmers" is the logical next conclusion... which is why I wrote the post. Take it from someone who has been developing these systems for close to three years now... it's coming. Amplifier will not be the thing that does this... but it shows techniques and patterns that have solved the "risky" parts enough to show the products will be coming.
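The "three approaches" fan-out and "hard to do, easy to review" ideas above can be sketched roughly like this. This is a hypothetical illustration, not Amplifier's actual code: `generate` is a stub standing in for a real model call, and all names and approach labels are made up for the example.

```python
def generate(prompt: str, approach: str) -> str:
    # Stand-in for an LLM call; a real system would prompt a model here.
    return f"[{approach}] plan for: {prompt}"

def fan_out(task: str, approaches: list[str]) -> dict[str, str]:
    """Ask for one independent solution per approach, like handing the
    same project to several different developers."""
    return {a: generate(task, a) for a in approaches}

def review_queue(results: dict[str, str]) -> list[str]:
    """'Hard to do, easy to review': line the candidates up side by side
    so a human can quickly pick or merge rather than write from scratch."""
    return [f"{name}:\n{plan}" for name, plan in sorted(results.items())]

candidates = fan_out(
    "integrate code base A with code base B",
    ["adapter-layer", "shared-library", "event-bridge"],
)
for entry in review_queue(candidates):
    print(entry)
```

The tedious part (producing three full plans) goes to the machine; the cheap part (comparing them) stays with the human.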
This place is becoming as much of a dumping ground for vanity blogging as LinkedIn already is. There's no discouragement of accounts like this that have no activity here but self promotion.
Not just you. A lot of people think that, I'm sure.
Not sure what you mean about the organizational abstractions. FWIW, I've worked in five startups (sold one), two innovation labs, and a few corporations for a few years. I feel like I've seen our industry from a lot of different perspectives and am not sure how you imagine being at Microsoft for the past 5 years would warp my brain exactly.
It's not, actually. It's a glimpse into a research project being built openly and shared freely, by the engineers building it, with anyone who wants to take a look.
The products will come months from now and will be introduced by the marketing team.
If only AI weren't completely and utterly useless for any unique problem for which there aren't extreme amounts of available training data. You know, something any competent programmer knows and has known for years. And these problems end up being involved in basically every single non-trivial application, and not very long into development on those applications. If only AI didn't very readily and aggressively lead you down very bad rabbit holes when it makes large changes or implementations, even on code bases for which there is ample training data, because that's just the nature of how it works. It doesn't fact-check itself, it doesn't compare different approaches, it doesn't actually summarize and effectively utilize the "wisdom of the crowd"; it just makes stuff up. It makes up whatever looks the most correct based on its training data, with some randomness added. Turns out that's seriously unhelpful in important ways for large projects with lots of different technical and architectural decisions that have to make tradeoffs and pick a specific road among multiple, over and over again.
Really sick and tired of these AI grifters. The bubble needs to pop already so these scammers can go bankrupt and we can get back to a rational market again.
I get it. I've been through cycles of this over the past three years, too. Used a lot of various tools, had a lot of disappointment, wasted a lot of time and money.
But this is kinda the whole point of my post...
In our system, we added fact-checking, comparing different approaches, and summarizing and effectively utilizing the "wisdom of the crowd" (and its success over time).
And that made it work massively better, even for non-trivial applications.
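The fact-check-and-retry idea described here can be sketched as a simple loop: generate a candidate, run it through checks, and regenerate with feedback until the checks pass. This is a minimal illustrative stub, not the actual system; `generate` and `run_checks` are placeholders for a real model call and real verifiers (tests, linters, a second-model critique).

```python
def generate(task: str, feedback: str = "") -> str:
    # Stand-in for an LLM call; revises its output when given feedback.
    return f"solution for {task}" + (" (revised)" if feedback else "")

def run_checks(candidate: str) -> list[str]:
    # Stand-in for automated verification; returns a list of problems.
    return [] if "(revised)" in candidate else ["first draft not yet verified"]

def generate_with_verification(task: str, max_rounds: int = 3) -> str:
    """Loop: generate, check, feed the problems back, try again."""
    feedback = ""
    for _ in range(max_rounds):
        candidate = generate(task, feedback)
        problems = run_checks(candidate)
        if not problems:
            return candidate
        feedback = "; ".join(problems)
    raise RuntimeError("no candidate passed the checks")

print(generate_with_verification("integrate two code bases"))
```

The key design point is that the model never grades its own first draft; acceptance is gated on external checks.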
Also... "scammer and AI grifter"?? Damn dude. It's any early-stage open-source experiment result and, mostly, just talking about how it makes me question whether or not I'll be programming in the future. Nobody's asking for your money.
Yes. Please read it. I'm looking for collaborators. The links in this article point to recent work on Wild Cloud so you can see where it's currently at.
Wild Cloud is a network appliance that will let you set up a k8s cluster of Talos machines and deploy curated apps to it. It's meant to make self-hosting more accessible, which, yes, I think can help solve a lot of data sovereignty issues.
vmnb|4 months ago
You seem to share his conviction that you, at least, are not just regurgitating slop.
duxup|4 months ago
Anyway does a Principal Engineer at Microsoft typically code a lot?
lunias|4 months ago
https://payne.io/posts/wild-cloud/
It's not abundantly clear what makes a "wild cloud" different than say, "some computers networked together," but I'm eagerly awaiting an update!
payneio|4 months ago
I'm not sure what you mean by "barely programs"