top | item 45577441

(no title)

joshvince | 4 months ago

> if you’re working on novel code, LLMs are absolutely horrible

This is spot on. Current state-of-the-art models are, in my experience, very good at writing boilerplate code or very simple architecture especially in projects or frameworks where there are extremely well-known opinionated patterns (MVC especially).

What they are genuinely impressive at is parsing through large amounts of information to find something (eg: in a codebase, or in stack traces, or in logs). But this hype machine of 'agents creating entire codebases' is surely just smoke and mirrors - at least for now.

discuss

ehnto|4 months ago

> at least for now.

I know I could be eating my words, but there is basically no evidence to suggest it ever becomes as exceptional as the kingmakers are hoping.

Yes it advanced extremely quickly, but that is not a confirmation of anything. It could just be the technology quickly meeting us at either our limit of compute, or it's limit of capability.

My thinking here is that we already had the technologies of the LLMs and the compute, but we hadn't yet had the reason and capital to deploy it at this scale.

So the surprising innovation of transformers did not give us the boost in capability itself, it still needed scale. The marketing that enabled the capital, that enables that scale was what caused the insane growth, and capital can't grow forever, it needs returns.

Scale has been exponential, and we are hitting an insane amount of capital deployment for this one technology that, has yet to prove commercially viable at the scale of a paradigm shift.

Are businesses that are not AI based, actually seeing ROI on AI spend? That is really the only question that matters, because if that is false, the money and drive for the technology vanishes and the scale that enables it disappears too.

delusional|4 months ago

> Yes it advanced extremely quickly, but that is not a confirmation of anything. It could just be the technology quickly meeting us at either our limit of compute, or it's limit of capability.

To comment om this, because its the most common counter argument. Most technology has worked in steps. We take a step forward, then iterate on essentially the same thing. It's very rare we see order of magnitude improvement on the same fundamental "step".

Cars were quite a step forward from donkeys, but modern cars are not that far off from the first ones. Planes were an amazing invention, but the next model of plane is basically the same thing as the first one.

bccdee|4 months ago

> Yes it advanced extremely quickly

The things that impress me about gpt-5 are basically the same ones that impressed me about gpt-3. For all the talk about exponential growth, I feel like we experienced one big technical leap forward and have spent the past 5 years fine-tuning the result—as if fiddling with it long enough will turn it into something it is not.

wkat4242|4 months ago

> Yes it advanced extremely quickly,

It did but it's kinda stagnated now especially on the LLM front. The time when ever week a groundbreaking model came out is over for now. Later revisions of existing models, like GPT5 and llama4 have been underwhelming.

ludicrousdispla|4 months ago

>> The marketing that enabled the capital, that enables that scale was what caused the insane growth, and capital can't grow forever,

Striking parallels between AI and food delivery (uber eats, deliveroo, lieferando, etc.) ... burn capital for market share/penetration but only deliver someone else's product with no investment to understand the core market for the purpose of developing a better product.

NitpickLawyer|4 months ago

> I know I could be eating my words, but there is basically no evidence to suggest it ever becomes as exceptional as the kingmakers are hoping.

??? It has already become exceptional. In 2.5 years (since chatgpt launched) we went from "oh, look how cute this is, it writes poems and the code almost looks like python" to "hey, this thing basically wrote a full programming language[1] with genz keywords, and it mostly works, still has some bugs".

I think the goalpost moving is at play here, and we quickly forget how 1 year makes a huge difference (last year you needed tons of glue and handwritten harnesses to do anything - see aider) and today you can give them a spec and get a mostly working project (albeit with some bugs), 50$ later.

[1] - https://github.com/ghuntley/cursed

walleeee|4 months ago

The question in your last paragraph is not the only one that matters. Funding the technology at a material loss will not be off the table. Think about why.

wongarsu|4 months ago

I have had LLMs write entire codebases for me, so it's not like the hype is completely wrong. It's just that this only works if what you want is "boring", limited in scope and on a well-trodden path. You can have an LLM create a CRUD application in one go, or if you want to sort training data for image recognition you can have it generte a one-off image viewer with shortcuts tailored to your needs for this task. Those are powerful things and worthy of some hype. For anything more complex you very quickly run into limits and the time and effort to do it with an LLM quickly approaches the time and effort required to do it by hand.

physicsguy|4 months ago

They're powerful, but my feeling is that largely you could do this pre-LLM by searching on Stack Overflow or copying and pasting from the browser and adapting those examples, if you knew what you were looking for. Where it adds power is adapting it to your particular use case + putting it in the IDE. It's a big leap but not as enormous a leap as some people are making out.

Of course, if you don't know what you are looking for, it can make that process much easier. I think this is why people at the junior end find it is making them (a claimed) 10x more productive. But people who have been around for a long time are more skeptical.

alwahi|4 months ago

i have seen so many people say that, but the app stores/package managers aren't being flooded with thousands of vibe coded apps, meanwhile facebook is basically ai slop. can you share your github? or a gist of some of these "codebases"

0xAFFFF|4 months ago

> Current state-of-the-art models are, in my experience, very good at writing boilerplate code or very simple architecture especially in projects or frameworks where there are extremely well-known opinionated patterns (MVC especially).

Which makes sense, considering the absolutely massive amount of tutorials and basic HOWTOs that were present in the training data, as they are the easiest kind of programming content to produce.

amiga386|4 months ago

The purpose of an LLM is not to do your job, it's to do enough to convince your boss to sack you and pay the LLM company some portion of your salary.

To that end, it doesn't matter if it works or not, it just has to demo well.

motorest|4 months ago

Yes, kind of. What you downplay as "extremely well-known opinionated patterns" actually means standard design patterns that are well established and tried-and-true. You know, what competent engineers do.

There's even a basic technique which consists of prompting agents to refactor code to clean it up to comply with best practices, as this helps agents evaluate your project as it lines them up with known patterns.

> What they are genuinely impressive at is parsing through large amounts of information to find something (eg: in a codebase, or in stack traces, or in logs).

Yes, they are. It helps if a project is well structured, clean, and follow best practices. Messy projects that are inconsistent and evolve as big balls of mud can and do judge LLMs to output garbage based on the garbage that was inputted. Once, while working on a particularly bad project, I noticed GPT4.1 wasn't even managing to put together consistent variable names for domain models.

> But this hype machine of 'agents creating entire codebases' is surely just smoke and mirrors - at least for now.

This really depends on what are your expectations. A glass half full perspective clearly points you to the fact that yes agents can and do create entire codebases. I know this to be a fact because I did it already just for shits and giggles. A glass half empty perspective however will lead people to nitpick their way into asserting agents are useless at creating code because they once prompted something to create a Twitter code and it failed to set the right shade of blue. YMMV and what you get out is proportional to the effort you put in.

vallavaraiyan|4 months ago

What is novel code?

  1. LLM's would suck at coming up with new algorithms. 
  2. I wouldn't let an LLM decide how to structure my code. Interfaces, module boundaries etc

Other than that, given the right context (the sdk doc for a unique hardware for eg) and a well organised codebase explained using CLAUDE.Md they work pretty well in filling out implementations. Just need to resist the temptation to prompt while the actual typing would take seconds.

IX-103|4 months ago

Yep, LLMs are basically at the "really smart intern" level. Give them anything complex or that requires experience and they crash and burn. Give them a small, well-specified task with limited scope and they do reasonably well. And like an intern they require constant check-ins to make sure they're on track.

Of course with real interns you end up at the end with trained developers ready for more complicated tasks. This is useful because interns aren't really that productive if you consider the amount of time they take from experienced developers, so the main benefit is producing skilled employees. But LLMs will always be interns, since they don't grow with the experience.