top | item 45357230

(no title)

Looking at my own use of AI, and at how I see other engineers use it, it often feels like two steps forward and two steps back, and overall not a lot of real progress yet.

I see people using agents to develop features, but the amount of time they spend to actually make the agent do the work usually outweighs the time they’d have spent just building the feature themselves. I see people vibe coding their way to working features, but when the LLM gets stuck it takes long enough for even a good developer to realize it and re-engage their critical thinking that it can wipe out the time savings. Having an LLM do code and documentation review seems to usually be a net positive to quality, but that’s hard to sell as a benefit and most people seem to feel like just using the LLM to review things means they aren’t using it enough.

Even for engineers there are a lot of non-engineering benefits in companies that use LLMs heavily for things like searching email, ticketing systems, documentation sources, corporate policies, etc. A lot of that could have been done with traditional search methods if different systems had provided better standardized methods of indexing and searching data, but they never did and now LLMs are the best way to plug an interoperability gap that had been a huge problem for a long time.

My guess is that, like a lot of other technology driven transformations in how work gets done, AI is going to be a big win in the long term, but the win is going to come on gradually, take ongoing investment, and ultimately be the cumulative result of a lot of small improvements in efficiency across a huge number of processes rather than a single big win.

discuss

ernst_klim|5 months ago

> the amount of time they spend to actually make the agent do the work usually outweighs the time they’d have spent just building the feature themselves

Exactly my experience. I feel like LLMs have potential as Expert Systems/Smart websearch, but not as a generative tool, neither for code nor for text.

You spend more time understanding stuff than writing code, and you need to understand what you commit with or without LLM. But writing code is easier that reviewing, and understanding by doing is easier than understanding by reviewing (bc you get one particular thing at the time and don't have to understand the whole picture at once). So I have a feeling that agents do even have negative impact.

spwa4|5 months ago

The reason companies, or at least sales and marketing, are so incredibly after AI is that it can raise response rates on spam, and on ads, by "Hyper-personalizing" them by actually reading the social media accounts of the people looking at the ads and making ads directly based on that.

breakpointalpha|5 months ago

Your millage may vary, but I just got Cursor (using Claude 4 Sonnet) to one shot a sequence of bash scripts that cleanup stale AWS resources. I pasted the Jira ticket description that I wrote, with a few examples and the script works perfectly. Saved me a few hours of bash writing and debugging because I can read bash, but not write it well.

It seems that the smaller the task and the more tightly defined the input and output, the better the LLMs are at one-shotting.

rebeccaskinner|5 months ago

I’ve had similar experiences where AI saved me a ton of time when I knew what I wanted and understood the language or library well enough to review but poorly enough that I’d gave been slow writing it because I’d have spent a lot of time looking things up.

I’ve also had experiences where I started out well but the AI got confused, hallucinated, or otherwise got stuck. At least for me those cases have turned pathological because it always _feels_ like just one or two more tweaks to the prompt, a little cleanup, and you’ll be done, but you can end up far down that path before you realize that you need to step back and either write the thing yourself or, at the very least, be methodical enough with the AI that you can get it to help you debug the issue.

The latter case happens maybe 20% of the time for me, but the cost is high enough that it erases most of the time savings I’ve seen in the happy path scenario.

It’s theoretically easy to avoid by just being more thoughtful and active as a reviewer, but that reduces the efficiency gain in the happy path. More importantly, I think it’s hard to do for the same reason partially self driving cars are dangerous: humans are bad at paying attention well in “mostly safe and boring, occasionally disastrous” type settings.

My guess is that in the end we’ll see less of the problematic cases. In part because AI improves, and in part because we’ll develop better intuition for when we’ve stepped onto the unproductive path. I think a lot of it too will also be that we adopt ways of working that minimize the pathological “lost all day to weird LLM issues” problems by trying to keep humans in the loop more deeply engaged. That will necessarily also reduce the maximum size of the wins we get, but we’ll come away with a net positive gain in productivity.

washadjeffmad|5 months ago

Same. I interface with a team who refuses to conduct business in anything other than Excel, and because of dated corporate mindshare, their management sees them more as wizards instead of the odd ones out.

"They're on top of it! They always email me the new file when they make changes and approve my access requests quickly."

There are limits to my stubbornness, and my first use of LLMs for coding assistance was to ask for help figuring out how to Excel, after a mere three decades of avoidance.

After engaging and learning more about their challenges, it turned out one of their "data feeds" was actually them manually copy/pasting into a web form with a broken batch import that they'd give up on submitting project requests for, which I quietly fixed so they got to retain their turnaround while they planned some other changes.

Ultimately nothing grand, but I would never have bothered if I'd had to wade through the usual sort of learning resources available or ask another person. Being able to transfer and translate higher level literacy, though, is right up my alley.

jdiff|5 months ago

That's a dangerous game to play with Bash, I'm not sure if there's another language more loaded with footguns than that.

insane_dreamer|5 months ago

I've found it to be a significant productivity boost but only for a small subset of problems. (Things like bash scripts, which are tedious to write and I'm not that great at bash. Or fixing small bugs in a React app, a framework I'm not well versed in. But even then I have to keep my thinking cap on so it doesn't go off the rails.)

It works best when the target is small and easily testable (without the LLM being able to fudge the tests, which it will do.)

For many other tasks it's like training an intern, which is worth it if the intern is going to grow and take on more responsibility and learn to do things correctly. But since the LLM doesn't learn from its mistakes, it's not a clear and worthwhile investment.

DanielHB|5 months ago

I have found out that the limit of LLMs good use of coding abilities is basically what can be reasonably done as a single copy-paste. Usually only individual functions.

I basically use it for google on steroids for obscure topics, for simple stuff I still use normal search engines.