You need to be twice as smart to debug code as you need to be to write it. So if you write the smartest code you can, then by definition you are not smart enough to debug it.
I hold KISS above most "Enterprise Patterns" with YAGNI as a close second. I think abstractions should be used to reduce complexity with code as opposed to making it harder to reason about. If a pattern increases complexity in understanding, then it should make sense in more cases than not.
I'm also a fan of feature-oriented project structures. I want the unit test file in or next to the code it's testing. For UI projects, as with React, the organizing unit is the component or feature, not the type of thing. For APIs, I put request handlers with the feature, along with the models and other abstractions that go together, grouped by what they fulfill rather than by the type of class they are.
I consider this practice more intuitively discoverable. You go into a directory for "Users" and you will see functionality related to users, whether that is profile CRUD or the endpoint handlers. Security may or may not be a separate feature depending on how you grow your app (Users, Roles, Permissions, etc.). For that matter, I'd usually rather curate a single app that does what it needs than dozens of apps in one larger project. I've seen .NET web projects strewn across 60+ applications in two different solutions before. It took literally weeks to do what should take half a day at most.
All for one website/app to get published. WHY?!? I'm not opposed to smaller/micro services where they make sense either. But keep it all as simple as you possibly can. Try to make what you create/use/consume/produce as simple as you can too. Can you easily use/consume/interact with what you make from a system in $NewLanguage without too much headache? I don't like to have to rely on special libraries being available everywhere.
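For anyone who wants to picture the feature-oriented layout described above, here is a minimal Python sketch. The paths, the `test_` prefix convention, and the helper names `colocated_test` and `missing_tests` are all made up for illustration; they are not taken from any real project.

```python
from pathlib import PurePosixPath

# Hypothetical feature-oriented layout: handlers, models, and tests for a
# feature live in the same directory instead of being split by file type.
FEATURE_LAYOUT = [
    "app/users/handlers.py",
    "app/users/models.py",
    "app/users/test_handlers.py",
    "app/users/test_models.py",
    "app/billing/handlers.py",
    "app/billing/models.py",
    "app/billing/test_handlers.py",
]


def colocated_test(path: str) -> str:
    """Return the test file expected to sit next to `path` (assumed naming rule)."""
    p = PurePosixPath(path)
    return str(p.with_name(f"test_{p.name}"))


def missing_tests(layout: list[str]) -> list[str]:
    """List source files in `layout` that have no test file beside them."""
    files = set(layout)
    return [
        f
        for f in layout
        if not PurePosixPath(f).name.startswith("test_")
        and colocated_test(f) not in files
    ]
```

With this layout, `missing_tests(FEATURE_LAYOUT)` points straight at `app/billing/models.py`, which is roughly the discoverability argument: the gap is visible inside the feature directory itself.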
When I was a kid learning programming, I would skim through a whole book teaching Python and write code using as many of the keywords I had learned each day as I could, just to boast to my parents and my non-programmer peers about the obfuscated mess that came out.
As I grew up I started contributing to open-source projects and came across every kind of unmaintainable spaghetti code, to the point that I gave up contributing to some of them. That is when I became zealous about keeping code as simple as possible, so that the next person who comes along to change it doesn't have as much trouble understanding it, and neither do I when I revisit it later.
That altruistic mindset of caring how others read your code is not one you acquire easily unless you have experienced what your peers felt reading code like that.
Writing simple code is also much harder than writing complicated code. If you write some complicated code at the limit of your mental capabilities you cannot debug it, but you might also not be smart enough to write the simple code.
I guess this means that one should solve the appropriate problems for a given skill level.
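As a small illustration of the clever-versus-simple trade-off, here is a sketch in Python. Both functions are hypothetical examples written for this point, and they are meant to do the same thing: square the even numbers in a list.

```python
def evens_squared_clever(ns):
    # Dense version: unpacking, map/filter, lambdas, and a bit trick for parity.
    return [*map(lambda n: n * n, filter(lambda n: not n & 1, ns))]


def evens_squared_simple(ns):
    # Plain version: one loop, one condition, one obvious place for a breakpoint.
    result = []
    for n in ns:
        if n % 2 == 0:
            result.append(n * n)
    return result
```

The second one is longer, but every intermediate state has a name and a line you can stop on, which is most of what matters when you come back to debug it.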
This is not just about writing code, but designing systems. If you design the smartest (most complex) system you can, you won't be able to debug/fix/extend/maintain it.
This is a challenge which I don't think AI tools like Cursor have cracked yet. They're great for laying "fresh pavement" but it's akin to being a project manager contracting the work out.
Even if I use Cursor (or some other equivalent) and review the code I find my mental model of the system is much more lacking. It actually had a net negative on my productivity as it gave me anxiety at going back to the codebase.
If an AI tool could help a user interactively learn the mental model I think that would be a great step in the right direction.
> but it's akin to being a project manager contracting the work out.
And that's probably the difference between those who are okay with vibe coding and those who aren't. A leader of a company that doesn't care about code quality (elegant code, good tradeoffs, etc) would never have cared if 10 monkeys outputted the code pre-AI or if 10 robot monkeys outputted the code with AI. It's only a developer, of a certain type, that would care to say "pause" in either of those situations.
Out of principle I would not share or build coding tools for these people. They literally did not care all these years about code quality, and the last thing I want to do is enable them on any level.
This is not unique to the age of LLMs. PR reviews are often shallow because the reviewer is not giving the contribution the amount of attention and understanding it deserves.
With LLMs, the volume of code has only gotten larger but those same LLMs can help review the code being written. The current code review agents are surprisingly good at catching errors. Better than most reviewers.
We'll soon get to a point where it's no longer necessary to review code, either by the LLM prompter, or by a second reviewer (the volume of generated code will be too great). Instead, we'll need to create new tools and guardrails to ensure that whatever is written is done in a sustainable way.
> We'll soon get to a point where it's no longer necessary to review code, either by the LLM prompter, or by a second reviewer (the volume of generated code will be too great). Instead, we'll need to create new tools and guardrails to ensure that whatever is written is done in a sustainable way.
The real breakthrough would be finding a way to not even do things that don’t need to be done in the first place.
90% of what management thinks it wants gets discarded/completely upended a few days/weeks/months later anyway, so we should have AI agents that just say “nah, actually you won’t need that” to 90% of our requests.
> We'll soon get to a point where it's no longer necessary to review code, either by the LLM prompter, or by a second reviewer (the volume of generated code will be too great). Instead, we'll need to create new tools and guardrails to ensure that whatever is written is done in a sustainable way.
This seems silly to me. In most cases, the least amount of work you can possibly do is logically describe the process you want and the boundaries, and run that logic over the input data. In other words, coding.
The idea that we should, to avoid coding or reading code, come up with a whole new process to keep generated code on track - would almost certainly take more effort than just getting the logical incantations correct the first time.
One thing to take into account is that PR reviews aren't there just for catching errors in the code. They also ensure that the business logic is correct. For example, you can have code that passes all tests and looks good but doesn't align with the business logic.
I think the same could be said of anything resembling technical writing. As an example aside from code writing, I think more than half of the machine learning papers out there are horribly written in the sense they rush a point or give no rhyme or reason for certain parts
And the best part, most people shallow read all of them and decide the details are needless till they are forced to deal with the details and then their understanding falls apart in front of them
I have the opposite experience. After years in appsec and pentesting, I can read any codebase and quickly understand its parts, but I wouldn’t be able to write anything of production quality. LLMs speed the comprehension process up for me even further. I guess it comes down to practice, if you practice reading code, you get good at reading code.
Maybe you are used to reading high-quality code. I suspect that the simple fact that you are auditing some code means that someone actually cares, making it higher quality than average.
High quality code is generally hard to write and easy to read.
Reading production code that is known to work can be done with faith and skimming. You don't have to understand every function call because they've each been tested and battle-hardened, so it's easy to get an overview of what is happening.
LLM code is NOT like this at all, but it's like a skilled liar writing something that LOOKS plausible, that's what they're trained to do.
People like you do not have the ability to evaluate the LLM output; it's not the same as reading code that was carefully written at ALL. If you think it's the same, that is only evidence that you can't tell the difference between working code and misleading buggy code.
What you've learned to do is read the intent of code. That's fine when it's been written and tested by a person. It's useless when it comes to evaluating LLM slop.
Has anyone read The Programmer's Brain and formed an opinion about it? I'd like to improve my ability to read and understand code and was thinking about reading it.
I am really bad at reading code to be honest (especially other people's code). Any tips on how I can go about becoming good at this like starting from baby steps?
#1 is easy, #2 requires some investigation, #3 requires studying.
If you're looking at say, banking code - but you know nothing about finance - you may struggle to understand what it's doing. You may want to gain some domain expertise. Being an SME makes reading the related code a heck of a lot easier.
Context comes down to learning the code base. What code calls the part you're looking at? What user actions trigger it? Look at the comments and commit messages - what was the intention? This just takes time and a lot of trawling around, looking for patterns and common elements. User manuals and documentation can also help. This part can't be rushed - it just comes to passing over it again and again and again. If you have access to people very familiar with the code - ask them! They may be able to kick start your intro.
Read code, read code, read code. You will get better.
When looking at a piece of code, keep asking questions like: what does this return, what are the side effects, what can go wrong, what happens if this goes wrong, where do we exit, can this get stuck, where do we close/save/commit this, what's the input, what if the input is wrong/missing, where are we checking if the input is OK, can this number underflow/overflow, etc
All these questions are there to complete the picture, so that instead of function calls and loops, you are looking at the graph of interconnected "things". It will become natural after some time.
It helps if you read the code with some interest, e.g. if you want to find a bug in an open source project that you have never seen the code for.
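Here is a rough sketch of what asking those questions can look like when written down next to a small function. `parse_port` is a made-up example, and the comments are the reader's questions and answers rather than anything a real codebase would necessarily contain.

```python
def parse_port(text: str) -> int:
    # What's the input? A string, possibly with surrounding whitespace.
    # What if the input is wrong or missing? int() raises ValueError for
    # non-numeric or empty text, and that error is allowed to propagate.
    port = int(text.strip())

    # Can this number underflow/overflow? Python ints don't overflow, but a
    # port only makes sense in 1..65535, so this is where the input is checked.
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")

    # What does this return? A validated int. Side effects? None.
    # Where do we exit? Either here or via one of the two ValueErrors above.
    return port
```

Once every question has an answer pinned to a line, the function stops being a sequence of calls and becomes a small piece of the graph: string in, validated port or `ValueError` out.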
When you're debugging issues, read the code for the libraries you're using before going to their documentation. It's a great way to get exposed to other people's code.
I find it useful to open the code in an editor and make running notes in the comments about what I think the state should be. As long as the code has good tests you can use debugging statements to confirm your understanding.
As a bonus you can just send that whole block of code - notes and all - to a colleague if you get stuck. They can read through the code and your thoughts and give feedback.
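A minimal sketch of the running-notes approach, assuming a small function you are trying to understand. `dedupe_keep_order` and the NOTE/CHECK comment style are hypothetical; the `assert` stands in for the debugging statement that confirms (or refutes) the note above it.

```python
def dedupe_keep_order(items):
    seen = set()
    out = []
    for item in items:
        # NOTE (my reading): `seen` holds every value already copied to `out`.
        if item in seen:
            continue
        seen.add(item)
        out.append(item)
        # CHECK: if the note above is right, `out` never contains duplicates,
        # so it must have exactly as many elements as `seen`.
        assert len(out) == len(seen)
    return out
```

If the assertion ever fires while the tests run, the note was wrong, and that is useful to know before sending the annotated block to a colleague.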
Like everything else, practice. I like to clone repositories of open source tools I use and try to understand how a particular feature is built end to end. I find that reading code aimlessly is not that helpful. Try to read it with a goal in mind. When starting out, pick a tool/application that is very simple and lean on LLMs to explain only the bits you don't understand.
This is positively false. In my anecdotal experience, I have produced code that was brutal to write, but then it was easy to read afterward.
Just like a piece of music being easy to listen to after sweating out the composition.
Some coding is like solving a puzzle. Once it is written and debugged, you're looking at the solution. The code will readily spoon-feed you the solution again when you revisit it months or years after forgetting everything, possibly even its existence.
When code is easy to write but hard to read, you must be writing fluff. Maybe try to steer your career a bit away from that. But do improve your ability to make fluff readable.
Makes sense. Once you become really good at writing code, it becomes increasingly obvious that the real challenge of software development is the social problem of pointing out subtle contradictions in requirements and suggesting resolutions/trade-offs in a way which earns you respect instead of hatred.
With some stakeholders, this is an almost impossible problem; sometimes this is because they lack vision and so their requirements are littered with impossible contradictions; other times, their ego is too big to accommodate any kind of push-back; even if you try to drip-feed the suggestions as gently as possible, they begin to resent you because they start to associate you with negative feelings such as self-doubt.
Schopenhauer explained this phenomenon succinctly:
"A man must be still a greenhorn in the ways of the world, if he imagines that he can make himself popular in society by exhibiting intelligence and discernment. With the immense majority of people, such qualities excite hatred and resentment, which are rendered all the harder to bear by the fact that people are obliged to suppress — even from themselves — the real reason of their anger. What actually takes place is this. A man feels and perceives that the person with whom he is conversing is intellectually very much his superior. He thereupon secretly and half unconsciously concludes that his interlocutor must form a proportionately low and limited estimate of his abilities. That is a method of reasoning — an enthymeme — which rouses the bitterest feelings of sullen and rancorous hatred."
This is a really big problem because people who attain management positions are often very good at understanding and then manipulating what other people think about them; this is how they were able to rise to their current ranks. They are exactly the kinds of people who build these reflective mental maps/models of who thinks what about them; and they are good at plotting against those people who they believe may harbor negative thoughts about them.
I find reading code mentally much more draining than writing it. I admire open source maintainers who mostly handle pull requests from others. This must be very hard. Linux comes to mind here. I assume Torvalds and the other maintainers don’t get to write much code themselves.
This is the same dissonance as the post about "never write bugs". The premise is wrong: you can't think of code as something you can fit in your brain the way you remember the shape of a new building you frequent or a person's face.
You need to remember both the processing done in the code and the structure. The structure will need your working memory in the language center, as it's n-dimensional and not representable with a closed-surface 3D model. At most you can draw graphs, and those don't look like anything you will be able to reproduce from memory.
Remembering state and data needs your whole brain to debug what will happen when you run the code.
Everyone here telling you they can train this is training either an abstract concept like functional programming or working with software in their own field, which will have a similar scope. You will NOT be able to easily comprehend code written in another style or for another purpose. Don't make the mistake of thinking that 10 years of mobile apps, ERP or whatever would allow you to follow any systems code written in C or something similar.
Fitting code as an abstraction in your mind is literally neurons growing new pathways. It's expensive, no one likes doing it, and if you are not in the same field you will try to bend it into your domain or not engage much.
Trying to find general assumptions here will not work - if you could condense n-dim rulesets to some general principles - you would simply do so and refactor the code.
Thinking there's no way to debug abstracted, provable code that you cannot fit fully in your working memory is wrong, because you only need to test that each function / morphism is correct, not the whole thing. It's certainly doable to write systems that interconnect 30-40 entities you need NOT remember all at once and still end up with bug-free code, because each part is constrained by strict interfaces and types.
That's how rulesets with bigger scopes work - no one designs the aircraft carrier all alone - there's restrictions and interfaces everywhere, not a single engineer knows the whole thing down to the molecules and electrons. Still it works.
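To make the "test each function, not the whole thing" idea a bit more tangible, here is a small Python sketch. The `Order` type and the three functions are invented for illustration; the point is only that each piece has a narrow typed interface and can be checked on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    quantity: int
    unit_price_cents: int


def subtotal(order: Order) -> int:
    """Price before discount, in cents."""
    return order.quantity * order.unit_price_cents


def apply_discount(amount_cents: int, percent: int) -> int:
    """Reduce an amount by a whole-number percentage, rounding down."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return amount_cents * (100 - percent) // 100


def total(order: Order, percent: int) -> int:
    """Composition of the two parts above; it adds no logic of its own."""
    return apply_discount(subtotal(order), percent)
```

Because `total` only wires the two smaller functions together, checking `subtotal` and `apply_discount` separately covers most of what can go wrong, and nobody has to hold all three in their head at once.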
Basically when you read code you're mentally doing the same thing as when you write code. But your mental model is more likely to be wrong when you're reading rather than producing the code.
This article makes points that are valid in general but not apposite to code generation with agents.
It is indeed difficult to verify a piece of code that is either going to ship as-is, or with the specific modifications your verification identifies. In cryptography engineering the rule of thumb (never followed in practice) is 10x verification cost to x implementation cost. Verification is hard and expensive.
But qualifying agent-generated code isn't the verification problem, in the same way that validating an eBPF program in the kernel isn't solving the halting problem.
That's because the agentic scenario gives us an additional outcome: we can allow the code as-is, we can make modifications to the code, or we can throw out the code and re-prompt --- discarding (many) probably-valid programs in a search for the subset of programs that are easy to validate.
In practice, most of what people generate is boring and easy to validate: you know within a couple minutes whether it's the right shape, whether anything sticks out the wrong way, the way a veteran chess player can quickly pattern-match a whole chessboard. When it isn't boring, you read carefully (and expensively), or you just say "no, try again, give me something more boring", or you break your LLM generation into smaller steps that are easier to pattern match and recurse.
What professionals generally don't (I think) do with LLMs is generate large gnarly PRs, all at once, and then do a close-reading of those gnarly PRs. They read PRs that are easy (not gnarly). They reject the gnarly ones, and compensate for gnarliness by approaching the problem at a smaller level of granularity. Or, you know, just write those bits by hand!
> In practice, most of what people generate is boring and easy to validate: you know within a couple minutes whether it's the right shape, whether anything sticks out the wrong way, the way a veteran chess player can quickly pattern-match a whole chessboard. When it isn't boring, you read carefully (and expensively), or you just say "no, try again, give me something more boring", or you break your LLM generation into smaller steps that are easier to pattern match and recurse.
Struggling with making an algorithm in a big architecture once I felt the same way - just couldn't keep the current state of the code in my head and also imagine the new solution at the same time. Ever since, have been working on http://etchpad.dev to solve this pain (shameless plug). We're trying to make an interface that works with an engineering brain, so we can think in terms of blueprints and wiring, idk just something more tangible than hallucinating wildly in a multidimensional space like we have to right now.
Really agree with the article. Ultimately the typing and thinking speed issues can be solved with AI, but trusting it and auditing what it does seems like a job for humans for the foreseeable future, you know, so maybe we avert a classic sci-fi AI apocalypse and whatnot.
[+] [-] s_Hogg|6 months ago|reply
Just write simple code
[+] [-] tracker1|6 months ago|reply
[+] [-] flykespice|6 months ago|reply
[+] [-] vjvjvjvjghv|6 months ago|reply
I guess it’s best to take a look at the code once something works and then see if it can be simplified. A lot of people seem to skip that step.
[+] [-] thomasikzelf|6 months ago|reply
[+] [-] rottc0dd|6 months ago|reply
[0] - https://www.laws-of-software.com/laws/kernighan/
[+] [-] Ferret7446|6 months ago|reply
[+] [-] lukan|6 months ago|reply
Not sure if there is actually data behind the assumption that you need to be twice as smart to debug code as to write it, but it sounds about right.
And also, if you write the smartest code you can, while you are at your peak, you also won't be able to read it, when you are just a bit tired.
So yes, yes, yes. Just write simple code.
(I was also initially messed up a bit by teachers filling me with the idea of aiming for clever code.)
[+] [-] EGreg|6 months ago|reply
[+] [-] HuwFulcher|6 months ago|reply
[+] [-] ivape|6 months ago|reply
[+] [-] catigula|6 months ago|reply
I've contracted some of this understanding of pieces/intellectual work out to Claude code many, many times successfully.
[+] [-] ppeetteerr|6 months ago|reply
[+] [-] gyomu|6 months ago|reply
[+] [-] Bukhmanizer|6 months ago|reply
[+] [-] foxfired|6 months ago|reply
[+] [-] vjvjvjvjghv|6 months ago|reply
It’s even worse with offshore devs. They produce a ton of code you have to review every morning.
[+] [-] ottaborra|6 months ago|reply
[+] [-] _feus|6 months ago|reply
[+] [-] GuB-42|6 months ago|reply
[+] [-] dingnuts|6 months ago|reply
You're being gaslit.
[+] [-] larntz|6 months ago|reply
https://www.manning.com/books/the-programmers-brain
[+] [-] vivzkestrel|6 months ago|reply
[+] [-] Night_Thastus|6 months ago|reply
#1 will come naturally with time.
[+] [-] lukaslalinsky|6 months ago|reply
[+] [-] ivanjermakov|6 months ago|reply
Code navigation should be instant and effortless. Get good tooling and train muscle memory for it.
[+] [-] gerad|6 months ago|reply
[+] [-] aeturnum|6 months ago|reply
[+] [-] danielmarkbruce|6 months ago|reply
It's not a joke answer. This entire article is silly. LLMs are great for helping you understand code.
[+] [-] __alias|6 months ago|reply
To get a good mental model, I'll often get an LLM to generate a few mermaid diagrams to help create a mental model of how everything pieces together
[+] [-] hashbig|6 months ago|reply
[+] [-] maxverse|6 months ago|reply
[+] [-] kazinator|6 months ago|reply
[+] [-] jongjong|6 months ago|reply
[+] [-] vjvjvjvjghv|6 months ago|reply
[+] [-] Krei-se|6 months ago|reply
[+] [-] bcrosby95|6 months ago|reply
[+] [-] jameskilton|6 months ago|reply
[+] [-] ivanjermakov|6 months ago|reply
[+] [-] tptacek|6 months ago|reply
[+] [-] kiitos|6 months ago|reply
nope!
https://sketch.dev/blog/our-first-outage-from-llm-written-co...
[+] [-] m3kw9|6 months ago|reply
[+] [-] flykespice|6 months ago|reply
[+] [-] jmsfltchruk|6 months ago|reply
[+] [-] duckydude20|6 months ago|reply