> OpenClaw has nearly half a million lines of code, 53 config files, and over 70 dependencies. This breaks the basic premise of open source security. Chromium has 35+ million lines, but you trust Google’s review processes. Most open source projects work the other way: they stay small enough that many eyes can actually review them. Nobody has reviewed OpenClaw’s 400,000 lines.

This reminds me of a very common thing posted here (and elsewhere, e.g. Twitter) to promote how good LLMs are and how they're going to take over programming: the number of lines of code they produce.

As if every competent programmer suddenly forgot the whole idea of LoC being a terrible metric to measure productivity or -even worse- software quality. Or the idea that software is meant to be written to be readable (to water down "Programs are meant to be read by humans and only incidentally for computers to execute" a bit). Or even Bill Gates' infamous "Measuring programming progress by lines of code is like measuring aircraft building progress by weight".

Even if you believe that AI will -somehow- take over the whole task completely so that no human will need to read code anymore, there is still the issue that the AIs will need to be able to read that code, and AIs are much worse at doing that (especially with their limited context sizes) than at generating code. So it still remains a problem to use LoC as such a measure even if all you care about is the driest "does X do the thing i want?" aspect, ignoring other quality concerns.

gyomu|2 days ago

Yeah, it’s pretty wild. Even pg is tweeting stuff like

“An experienced programmer told me he's now using AI to generate a thousand lines of code an hour.”

https://x.com/paulg/status/2026739899936944495

Like if you had told pg to his face in (pre AI) office hours “I’m producing a thousand lines of code an hour”, I’m pretty sure he’d have laughed and pointed out how pointless that metric was?

ruszki|2 days ago

I don't understand how some people here decide who the good programmers are. A lot of people remind me of a guy from West Palm Beach who votes in elections solely on the principle of who has more "fame". Paul Graham is famous for sure (at least in HN circles), but I never considered him an exceptional or even a good programmer. So I have always taken his words with a hefty grain of salt. And sometimes comments include a list of "good" coders, and half of them are these famous, but not good, ones.

medi8r|2 days ago

He is a Lisper too, making it more ironic. Lisp has the power to heavily reduce cruft through heavy customization with macros.

amelius|2 days ago

Technical debt is increasing by 1,000 lines an hour.

manoDev|2 days ago

They need to keep the musical chairs going.

lukan|2 days ago

Hm, I do not read the statement as a hyped "this is how everyone should write code now" but rather as a statement of fact: "An experienced programmer he knows uses LLMs to generate a thousand LOC/h". That does not say whether those lines will actually be shipped anywhere or just exist for testing purposes/prototyping.

steve1977|2 days ago

We all know that a thousand parentheses would be a better metric.

ElProlactin|2 days ago

Enshittification comes for us all

wiseowise|2 days ago

It’s all virtual virtue signaling. If you were to say this shit in the office, you’d be walked out pretty fast.

supriyo-biswas|2 days ago

Somehow, this narrative has taken hold at multiple levels of management, especially amongst non-technical management, that "typing" was somehow the bottleneck of software engineering. Reality, however, is more complex.

The act of "typing" code was technically mixed in with researching solutions, which means that code often took a different shape or design based on the outcome of that activity. However, this nuance has been typically ignored for faff, with the outcome that management thinks that producing X lines of code can be done "quickly", and people disagreeing with said statements are heretics who should be burned at the stake.

This is why, in my personal opinion, AI makes me only 20% more productive. I often find myself disagreeing with the solution that it came up with, and instead of having to steer it to obtain the outcome I want, I just end up rewriting the code myself. On the other hand, for prototypes where I don't care about understanding the code at all, it is a much bigger time saver.

I could simply not care about the code at all, and while that is acceptable to management, not being responsible for the code but being responsible for the outcomes seems to be the same shit as being given responsibilities without autonomy, which is not something I can agree with.

jorvi|2 days ago

AI is good at the first 80% but terrible at the last 20% of producing good code. And you need to go through that first 80% to really understand what the code is scaffolded to do, which writing it yourself will vastly improve. And typing speed has never been the bottleneck for coding.

Even worse, a whole generation of devs is being trained not to care or learn about that last 20% because the AI does it """all""" for them. That last bit is an unknown unknown for the neo developer née prompter.

hirako2000|2 days ago

Most people believe a software developer's job and value lie in the lines of code produced.

Perhaps over half of engineering managers, unconsciously or admittedly, take the number of PRs and code additions as a rough but valid measure of productivity.

I recall, in an architecture role, a senior director asking me how come a principal engineer hadn't committed any code in 2 weeks, given that we pay principals a fortune.

I asked that brilliant mind whether we paid principal engineers to code or to make sure we deliver value.

Needless to say, the question went unanswered, and the so-called principal was fired a few months later. The entire company, in fact, was sold for a bargain too, given it had thousands of clients globally.

The "LLMs can replace engineers" phenomenon converges from two simple facts: we haven't solved the misconception about engineering roles, and it's the perfect scapegoat to justify layoffs.

Leaders haven't all gone insane, they answer difficult questions with the narrative of least resistance.

andrei_says_|2 days ago

> Leaders haven't all gone insane, they answer difficult questions with the narrative of least resistance.

Brilliantly said. I’d like to add - a distorted narrative actively, intentionally established and maintained by the entities profiting from the technology. Quite similar to the crypto scam hype cycle.

MadxX79|2 days ago

Brooks's law, anno 2026:

"Adding manpower to a late software project makes it later -- unless that manpower is AI, then you're golden!"

steveklabnik|2 days ago

I know you're being sarcastic, but this is what OpenAI has said:

https://openai.com/index/harness-engineering/

> This translates to an average throughput of 3.5 PRs per engineer per day, and surprisingly the throughput has increased as the team has grown to now seven engineers.

We will see if this continues to scale up!

smikhanov|2 days ago

That law (formulated in the 70s, I’ll remind the reader) hasn’t been true for at least a couple of decades now.

tdeck|2 days ago

I asked Grok to rewrite your comment and it did it in 2400 words. I hope you know you'll be obsolete soon.

KronisLV|2 days ago

As lines of code become executable line noise, I swear that we need better approaches to developing software - either enforce better test coverage across the board, develop and use languages where it’s exceedingly hard to end up with improper states, or sandbox the frick out of runtimes and permissions.

Just as an example, I should easily be able to give each program an allowlist of network endpoints they’re allowed to use for inbound and outgoing traffic and sandbox them to specific directories and control resource access EASILY. Docker at least gets some of those right, but most desktop OSes feel like the Wild West even when compared to the permissions model of iOS.
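A rough sketch of what such a per-program policy could look like, in Python. Everything here (the `SandboxPolicy` class, its fields, the example host and paths) is hypothetical and only illustrates the deny-by-default allowlist idea; it is not how Docker, iOS, or any real OS implements permissions, and real enforcement would have to sit in the kernel or runtime rather than in the program being sandboxed.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical per-program policy: anything not listed is denied."""
    outbound: frozenset = frozenset()       # allowed (lowercase host, port) pairs
    inbound_ports: frozenset = frozenset()  # ports the program may listen on
    directories: tuple = ()                 # absolute directory roots it may touch

    def may_connect(self, host: str, port: int) -> bool:
        return (host.lower(), port) in self.outbound

    def may_listen(self, port: int) -> bool:
        return port in self.inbound_ports

    def may_access(self, path: str) -> bool:
        # Purely lexical check: refuse relative paths and any ".." component
        # instead of trying to resolve them.
        target = PurePosixPath(path)
        if not target.is_absolute() or ".." in target.parts:
            return False
        candidates = (target, *target.parents)
        return any(PurePosixPath(root) in candidates for root in self.directories)


# Example policy for an imaginary note-taking app (all values are assumptions).
policy = SandboxPolicy(
    outbound=frozenset({("api.example.com", 443)}),
    inbound_ports=frozenset(),
    directories=("/home/user/notes",),
)

print(policy.may_connect("api.example.com", 443))
print(policy.may_access("/home/user/notes/../.ssh/id_rsa"))
```

The point of the sketch is how small the declaration is: one host, no inbound ports, one directory. The hard part is getting the OS to enforce it by default.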

bee_rider|2 days ago

“LoC is a bad metric” has been the catchphrase of engineers for years, because it runs counter to the expectations of management and the general public, right? So it makes sense that LoC is the metric used to advertise to them.

sd9|2 days ago

LLMs are incredibly eager to write new code, rather than modifying or integrating with existing systems. I agree that context windows are too small currently for this to seem sustainable. Without reasonable architecture pure vibe coded software feels like it’s going to cap out at a certain size.

andai|2 days ago

>Nobody has reviewed OpenClaw’s 400,000 lines.

Including the author, who brags he doesn't read his own code. Indeed, it would be physically impossible for him to do so!

https://steipete.me/posts/2025/shipping-at-inference-speed

As mentioned elsewhere in the thread, there is very clearly an obsession with quantity over quality. Not a new phenomenon by any means: people were already complaining about this in the 19th century! But it has reached a new absurd height with this latest trend.

K0balt|2 days ago

It’s definitely an issue when using coding assistants.

If you are careful and specific you can keep things reasonable, but even when I am careful and do consolidation / factoring passes, have rigid separation of concerns, etc., I find that the LLM code is bigger than mine, mainly for two reasons:

1) more extensive inline documentation
2) more complete expression of the APIs across concerns, as well as stricter separation

2.5) often, also a bit of demonstrative structure that could be more concise but exists in a less compact form to demonstrate its purpose and function (high degree of cleverness avoidance)

All in all, if you don’t just let it run amok, you can end up with better code and increased productivity in the same stroke, but I find it comes at about a 15% plumpness penalty, offset by readability and obvious functionality.

Oh, forgot to mention, I always make it clean-room most of the code it might want to pull in from libraries, except extremely core standard libraries, or the really heavy stuff like Bluetooth / WiFi protocol stacks etc.

I find a lot of library-type code ends up withering away with successive cleanup passes, because it wasn’t really necessary, just cognitively easier for implementing a prototype. With refinement, the functionality ends up burrowing in, often becoming part of the data structure where it really belonged in the first place.

CuriouslyC|2 days ago

The lines of code thing isn't because we think it's a good metric, but because we have literally no good metric and we're trying to communicate a velocity difference. If you invent a new metric that doesn't have LoC's problems while being as easy to use, you'll be a household name in software engineering in short order.

Also, AI is better at reading code than writing it, but the overhead to FIND code is real.

samiv|2 days ago

That's because they're an additive tool. Everything boils down to "adding" more code. But in the long term it's not about how much code you can add but how little you can get away with. But this is an impossible task for the LLMs. How would you train one not to write code? What would the training data look like? Would that be all the lines of code that haven't been written?

skeledrew|2 days ago

TDD would help here, particularly if a human writes - or at least thoroughly reviews - the tests.

https://martinfowler.com/bliki/TestDrivenDevelopment.html

tartoran|2 days ago

That’s not an impossible task with LLMs, you just have to mindfully architect the project with that in mind, hence take it slowly to design a good system, don’t outsource all thinking to LLMs.

simgt|2 days ago

Well, they will train on my Claude Code sessions for a start. I spend a lot of time asking it to remove unnecessary code that it produced, and I'm not the only one.

danjc|2 days ago

I've been waiting for someone to say this. An agent will generally produce far more code than technically necessary for the task. It's a kind of over engineering which makes it increasingly harder to wrap your head around the codebase.

truthbe|2 days ago

Over engineered implies the codebase was inflated with some kind of rationale by the AI, but there is none. It's just code vomit with duct tape

wredcoll|2 days ago

Really it just continues to demonstrate that "code quality" is not and was not a requirement.

Even with supposedly expert human hand written software powering our products for the last decades, they frequently crash, have outages, and show all sorts of smaller bugs.

There are literally too many examples to count of video games being released with nigh-unplayable amounts of bugs and still selling millions and producing sequels.

Windows 95 and friends were famously buggy and crash prone yet produced one of the most valuable companies in the world.

ninkendo|2 days ago

Respectfully, it feels like your position requires a very low, if not brain-dead level of incompetence on the part of LLM users, in order for your conclusion to be correct.

My personal anecdote: I used an LLM recently to basically vibe code a password manager.

Now, I’ve been a software engineer for 20 years. I’m very familiar with the process of code review and how to dive in to someone else’s code and get a feel for what’s happening, and how to spot issues. So when I say the LLM produced thousands of lines of working code in a very short time (probably at least 10 times faster than I would have done it), you could easily point at me and say “ha, look at ninkendo, he thinks more lines of code equals better!” And walk away feeling smug. Like, in your mind perhaps you think the result is an unmaintainable mess, and that the only thing I’m gushing about is the LOC count.

But here’s the thing: it actually did a good job. I was personally reviewing the code the whole time. And believe me when I say, the resulting product is actually good. The code is readable and obvious, it put clean separation of responsibilities into different crates (I’m using rust) and it wrote tons of tests, which actually validate behavior. It’s very near the quality level of what I would have been able to do. And I’m not half bad. (I’ve been coding in rust in particular, professionally for about 2 years now, on top of the ~20 years of other professional programming experience before that.)

My takeaway is that as a professional engineer, my job is going to be shifting from doing the actual code writing, to managing an LLM as if it’s my pair programming partner and it has the keyboard. I feel sad for the loss of the actual practice of coding, but it’s all over but the mourning at this point. This tech is here to stay.

FEELmyAGI|2 days ago

This whole reply, and every other "anecdote" reply, is more worthless than the pixels it's printed on without a link to your "actually did a good job" password manager.

(wow, funny how these vibe code apps are always copies of something there are many open source versions of already)

bee_rider|2 days ago

If you measure the productivity of the system that is “you, using an LLM” in terms of the rate at which you can get actually-reviewed code completed (which, based on your comment, seems to be what you were doing) that seems like a totally reasonable way of doing things. But in that case the bottleneck is probably you reviewing code, right? Which, I bet, is faster than writing code. But you probably won’t get the truly absurd superhuman speed ups.

What would you say is your multiplier, in terms of thoroughly reviewing code vs writing it from scratch?

badsectoracula|2 days ago

I don't know if it is incompetence - if anything i doubt it, someone else pointed out that pg also used that metric and i don't think pg is incompetent. However at the same time i think it is misleading at best.

My impression is that, as someone else wrote, we do not have an actual metric for such things as productivity or quality or what have you, but some people do want to communicate that they feel (regardless of if that matches reality) using an LLM is better/faster/easier and they latch onto the (wrong) assumption about more LoC == better/faster that non-programmers already believed for years (intentionally or not, they may also be deluding themselves) as that is an easy path to convince them that the new toys have value that applies to the non-programmers too (note that i explicitly ignore the perspective of the "toymakers" as those have further incentives to promote their products).

Personally i also have about 2 decades of professional experience (more if counting non-professional) and i've been toying with LLMs now and then. I do find them interesting and when i use them for coding tasks, i absolutely find useful cases for them, i like to have them (where possible) write all sorts of code that i could write myself but i just don't feel like doing so and i do find them useful for stuff i'm not particularly interested in exploring but want to have anyway (usually Python stuff) and i'm sure i'll find more uses for them in the future. Depending on the case and specifics i may even say that in very particular situations i can do things faster using LLMs (though it is not a given and personally that is not much of a requirement nor something i have anywhere high in my interest when it comes to using LLMs - i'd rather have them produce better code slower, than dummy/pointless/repetitive code faster).

However one thing i never thought about was how "great" it is that they generate a lot of lines of code per whatever time interval. If anything i'd prefer it if they generated fewer lines of code and i'd consider an LLM (or any other AI-ish system) "smarter" if it could figure out how to do that without needing hand holding from me. Because of this, i just can't see LoCs as anything but a very bad metric - which is the same as when the code is written by humans.

halnine9000|2 days ago

>this tech is here to stay

How can you say that when all these models are externally sourced by companies that actively make a loss per token? When they finally need to make a profit, how can we be sure these models as well as their owners will remain as reliable and not enshittified? Anthropic has been blacklisted in the last 24 hours, so it's a turbulent industry to say the least.

inciampati|2 days ago

Lines of code are nothing. It's verification that creates value.

theptip|2 days ago

Yeah, I would view this as a “levels of maturity” thing. It’s not completely misguided to judge a JD on whether they shipped 0loc or 1kloc. Assuming you have some quality counter-metric like “the app works”.

For staff engineers it’s obviously complete nonsense: many don’t code and just ship architecture docs. Or you can ship a net negative refactor. Etc.

So this should tell you that LLMs are still in “savant JD” territory.

That said, being given permission to ship more lines of code under existing enterprise quality bars _is_ a meaningful signal.

spacecadet|2 days ago

I mean, many of us have... I operate in a net negative mindset. My PRs had better remove more than they add.

I also use AI this way, periodically achieving a net negative refactor.
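For what it's worth, "net negative" is easy to check mechanically. A minimal sketch in Python, assuming the change is available as unified diff text; `net_lines` and the sample diff are made up for illustration, and in practice `git diff --shortstat` already reports the same insertion and deletion counts.

```python
def net_lines(unified_diff: str) -> int:
    """Return added minus removed lines in a unified diff (negative = net removal).

    Simplification: any line starting with "+++" or "---" is treated as a file
    header, so a removed content line that itself begins with "--" is miscounted.
    """
    added = removed = 0
    for line in unified_diff.splitlines():
        if line.startswith(("+++", "---")):
            continue  # file headers, not content
        if line.startswith("+"):
            added += 1
        elif line.startswith("-"):
            removed += 1
    return added - removed


# Hypothetical refactor: same behavior, fewer lines.
EXAMPLE_DIFF = """\
--- a/util.py
+++ b/util.py
@@ -1,5 +1,2 @@
-def is_even(n):
-    if n % 2 == 0:
-        return True
-    else:
-        return False
+def is_even(n):
+    return n % 2 == 0
"""

print(net_lines(EXAMPLE_DIFF))  # 2 added, 5 removed -> -3
```

It is still a line count, so it inherits every problem discussed in this thread; it only flips the sign of what gets bragged about.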