anxoo | 9 months ago
otherwise, yes, you'll continue to be irritated by AI hype, maybe up until the point where our civilization starts going off the rails
TheRoque|9 months ago
- they can't be aware of the latest changes in the frameworks I use, and so force me to use older features, sometimes less efficient
- they fail at doing clean DRY practices even though they are supposed to skim through the codebase much faster than me
- they bait me into inexisting apis, or hallucinate solutions or issues
- they cannot properly pick the context and the files to read in a mid-size app
- they suggest to download some random packages, sometimes low quality ones, or unmaintained ones
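One way to keep the last point in check is to review every suggested package against an explicit allowlist before installing anything. A minimal sketch, assuming a requirements.txt-style list; the function name `unapproved` and the package names in the usage note are hypothetical, and the parsing is deliberately simplistic:

```python
import re


def unapproved(requirements_text, approved):
    """Return package names from a requirements-style text that are not in `approved`.

    Assumption: one requirement per line, optional version specifier,
    '#' starts a comment. This is not a full requirements parser.
    """
    flagged = []
    for line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # Keep only the package name, cutting at the first specifier character.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0].lower()
        if name not in approved:
            flagged.append(name)
    return flagged
```

Calling `unapproved("requests==2.0\nsome-random-pkg>=1.0\n", {"requests"})` would flag only `some-random-pkg`, leaving the decision to add it with a human.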
simonw|9 months ago
That's mostly solved by the most recent ones that can run searches. I've had great results from o4-mini for this, since it can search for the latest updates - example here: https://simonwillison.net/2025/Apr/21/ai-assisted-search/#la...
Or for a lot of libraries you can dump the ENTIRE latest version into the prompt - I do this a lot with the Google Gemini 2.5 models since those can handle up to 1m tokens of input.
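A rough sketch of what "dump the entire latest version into the prompt" could look like in practice. Everything here is an assumption for illustration: the helper name `build_prompt`, the file suffixes, and the 4-characters-per-token estimate (real tokenizers vary), so treat the budget as approximate:

```python
from pathlib import Path

# Assumption: roughly 4 characters per token; real tokenizers differ.
CHARS_PER_TOKEN = 4


def build_prompt(root, question, max_tokens=1_000_000, suffixes=(".py", ".md")):
    """Concatenate source files under `root` into one prompt, within a rough token budget."""
    budget = max_tokens * CHARS_PER_TOKEN - len(question)
    parts = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        # Label each file so the model can tell where one ends and the next begins.
        chunk = f"--- {path.relative_to(root)} ---\n{text}\n"
        if len(chunk) > budget:
            break  # stop before exceeding the estimated budget
        parts.append(chunk)
        budget -= len(chunk)
    parts.append(question)
    return "".join(parts)
```

The returned string is what you would paste or send as the prompt; the question goes last so it sits after all of the library source.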
"they fail at doing clean DRY practices" - tell them to DRY in your prompt.
"they bait me into inexisting apis, or hallucinate solutions or issues" - really not an issue if you're actually testing your code! I wrote about that one here: https://simonwillison.net/2025/Mar/2/hallucinations-in-code/ - and if you're using one of the systems that runs your code for you (as promoted in tptacek's post) it will spot and fix these without you even needing to intervene.
"they cannot properly pick the context and the files to read in a mid-size app" - try Claude Code. It has a whole mechanism dedicated to doing just that, I reverse-engineered it this morning: https://simonwillison.net/2025/Jun/2/claude-trace/
"they suggest to download some random packages, sometimes low quality ones, or unmaintained ones" - yes, they absolutely do that. You need to maintain editorial control over what dependencies you add.
travisjungroth|9 months ago
agotterer|9 months ago
This is where collaboration comes into play. If you solely rely on the LLM to “vibe code” everything, then you’re right, you get whatever it thinks is best at the time of generation. That could be wrong or outdated.
My workflow is to first provide clear requirements, generally one objective at a time. Sometimes I use an LLM to format the requirements for the LLM to generate code from. It then writes some code, and I review it. If I notice something is outdated I give it a link to the docs and tell it to update it using X. A few seconds later it’s made the change. I did this just yesterday when building out an integration with an API. Claude wrote the code using a batch endpoint because the streaming endpoint was just released and I don’t think it was aware of it. My role in this collaboration is to be aware of what’s possible and how I want it to work (e.g. being aware of the latest features and updates of the frameworks and libraries). Then it’s just about prompting and directing the LLM until it works the way I want. When it’s really not working, then I jump in.
bdangubic|9 months ago
of course they can, teach them / feed them the latest changes or whatever you need (much like another developer unaware of the same thing)
"they fail at doing clean DRY practices even though they are supposed to skim through the codebase much faster than me"
tell them it is not DRY until they make it DRY. for some (several projects I’ve been involved with) DRY is generally an anti-pattern when taken to extremes (abstraction gone awry etc…). tell it what you expect and watch it deliver (much like you would another developer…)
"they bait me into inexisting apis, or hallucinate solutions or issues"
tell it when it hallucinates, it’ll correct itself
"they cannot properly pick the context and the files to read in a mid-size app"
provide it with context (you should always do this anyways)
"they suggest to download some random packages, sometimes low quality ones, or unmaintained ones"
tell it about it, it will correct itself
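To tell it when it hallucinates, you first have to notice. One cheap check for the nonexistent-API case is to confirm that a suggested name actually resolves before building on it. A minimal sketch, with a hypothetical helper name `api_exists`; it only proves the attribute exists in the installed version, not that it behaves the way the model claimed:

```python
import importlib


def api_exists(module_name, attr_path):
    """Return True if `module_name` imports and the dotted `attr_path` resolves on it."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False  # the module itself is missing or misnamed
    for name in attr_path.split("."):
        if not hasattr(obj, name):
            return False
        obj = getattr(obj, name)
    return True
```

For example, `api_exists("json", "loads")` is true, while `api_exists("json", "parse")` is false because Python's `json` module has no `parse` function. Running the generated code under real tests remains the stronger check.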
apwell23|9 months ago
yes. this happens to me almost every time i use it. I feel like a crazy person reading all the AI hype.
motza|9 months ago
bradfa|9 months ago
alisonatwork|9 months ago
All of the state-of-the-art models are online models - you have no choice, you have to pay for a black box subscription service controlled by one of a handful of third-party gatekeepers. What used to be a cost center that was inside your company is now a cost center outside your company, and thus it is a risk to become dependent on it. Perhaps the risk is worthwhile, perhaps not, but the hype is saying that real soon now it will be impossible to not become dependent on these closed systems and still exist as a viable company.
apwell23|9 months ago
For coding it seems to back itself into a corner and never recover from it until I "reset" it.
AI can't write software without an expert guiding it. I cannot open a non-trivial PR to Postgres tonight using AI.
simonw|9 months ago
100% true, but is that really what it would take for this to be useful today?
poincaredisk|9 months ago
Granted, I was trying to do this 6 months ago, so maybe a miracle has happened since. But in the past I had very bad experiences using LLMs for niche things (i.e. things that were never mentioned on Stack Overflow).
simonw|9 months ago
I have no way of evaluating these myself so they might just be garbage slop.
AtlasBarfed|9 months ago
But for each nine of reliability you want out of LLMs, everyone's assuming it's just linear growth. I don't think it is. I think it's polynomial at least.
As for your tasks, and maybe it's just because I'm using ChatGPT, but I asked it to port sed, something with full open source code availability, tons of examples/test cases, and a fully documented user interface, and I wanted it moved to Java as a library.
And it failed pretty spectacularly. Yeah, it got the very, very, very basic functionality of sed.
kaydub|9 months ago
chinchilla2020|9 months ago
I've tried everything. I have four AI agents. They still have an accuracy rate of about 50%.
unknown|9 months ago
[deleted]
ipaddr|9 months ago
Tell me about this specific person who isn't famous
Create a facebook clone
Recreate Windows including drivers
Create a way to transport matter like in Star Trek.
I'll see you in 6 months.