And this is the major problem. People will blindly trust the output of AI because it appears to be amazing, and that is how mistakes slip in. It might not be a big deal with the app you're working on, but in a banking app or medical equipment it can have a huge impact.

Gigachad|1 year ago

I feel like I’m being gaslit about these AI code tools. I’ve got the paid Copilot through work and I’ve just about never had it do anything useful.

I’m working on a reasonably large Rails app and it can’t seem to answer any questions about anything, or even autofill the names of methods defined in the app. Instead it just makes up names that seem plausible. It’s literally worse than the built-in auto suggestions of VS Code, because at least those are confirmed to be real names from the code.

Maybe these tools work well on a blank project where you are building basic login forms or something. But certainly not on an established code base.

nucleardog|1 year ago

I'm in the same boat. I've tried a few of these tools and the output has generally ranged from terrible to useless, on tasks big and small. It's made up plausible-sounding but non-existent methods for the popular framework we use, something it should have plenty of context and examples for.

Dealing with the output is about the same as dealing with a code review for an extremely junior employee... who didn't even run and verify their code was functional before sending it for a code review.

Except here's the problem. Even for intermediate developers, I'm essentially always in a situation where the process of explaining the problem, providing feedback on a potential solution, answering questions, reviewing code and providing feedback, etc takes more time out of my day than it would for me to just _write the damn code myself_.

And it's much more difficult for me to explain the solution in English than in code--I basically already have the code in my head, now I'm going through a translation step to turn it into English.

All adding AI has done is take the part of my job that is "think about problem, come up with solution, type code in" and make it into something with way more steps, all of which are lossy as far as translating my original intent to working code.

I get we all have different experiences and all that, but as I said... same boat. From _my_ experiences this is so far from useful that hearing people rant and rave about the productivity gains makes me feel like an insane person. I can't even _fathom_ how this would be helpful. How can I not be seeing it?

kgeist|1 year ago

For me, AI is super helpful with one-off scripts, which I happen to write quite often when doing research. Just yesterday, I had to check that my assumptions about a certain aspect of our live system were true, and all I had was a large file which had to be parsed. I asked ChatGPT to write a script which parses the data and presents it in a certain way. I don't trust ChatGPT 100%, so I reviewed the script and checked that it returned correct outputs on a subset of the data. It's something I'd do to the script anyway if I wrote it myself, but it saved me like 20 minutes of typing and debugging the code. I was in a hurry because we had an incident that had to be resolved as soon as possible. I haven't tried it on proper codebases (and I think it's just not possible at this moment) but for quick scripts which automate research in an ad hoc manner, it's been super useful for me.
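A minimal sketch of that spot-check step, not the actual script from the incident: the log format, the `count_status_codes` function, and the sample lines are all hypothetical stand-ins. The idea is only to run the generated parser on a small subset whose correct answer was worked out by hand before trusting it on the full file.

```python
from collections import Counter


def count_status_codes(lines):
    """Hypothetical generated parser: tally the status field of
    'timestamp status path' lines (an assumed format)."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip malformed lines instead of crashing
        counts[parts[1]] += 1
    return counts


# A small subset of the data, with the expected tally worked out by hand.
SAMPLE = [
    "2024-09-26T10:00:00Z 200 /login",
    "2024-09-26T10:00:01Z 500 /checkout",
    "2024-09-26T10:00:02Z 200 /login",
    "malformed",
]
EXPECTED = {"200": 2, "500": 1}


def spot_check():
    """Return True when the parser agrees with the hand-checked subset."""
    return dict(count_status_codes(SAMPLE)) == EXPECTED


if __name__ == "__main__":
    print("spot check passed" if spot_check() else "spot check FAILED")
```

If the check fails, the mismatch shows up on four lines rather than on the full file, which is about the cheapest place to find it.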

Another case is prototyping. A few weeks ago I made a prototype to show to the stakeholders, and it was generally way faster than if I wrote it myself.

thewarrior|1 year ago

It’s writing most of my code now. Even if it’s existing code you can feed in the 1-2 files in question and iterate on them. Works quite well as long as you break it down a bit.

It’s not gaslighting. The latest versions of GPT, Claude, and Llama have gotten quite good.

koliber|1 year ago

My experience is anecdotal, based on a sample size of one. I'm not writing to convince, but to share. Please take a look at my resume to see my background, so you can weigh what I write.

I tried Cursor because a technically-minded product manager colleague of mine managed to build a damned solid MVP of an AI chat agent with it. He is not a programmer, but knows enough to kick the can until things work. I figured if it worked for him, I might invest an hour of my time to check it out.

I went in with a time-boxed hour to install Cursor and implement a single trivial feature. My app is not very sophisticated - mostly a bunch of setup flows and CRUD. However, there are some non-trivial things which I would expect to have documented in a wiki if I was building this with a team.

Cursor did really well. It generated code that was close to working. It figured out those not-obvious bits as well, and the changes it made kept them in mind. This is something I would not expect from a junior dev, had I not explained those cross-dependencies to them (mostly keeping state synchronized according to business rules across different entities).

It did a poor job of applying those changes to my files. It would not add the code it generated in the right places, and it would mess things up along the way. I felt I was wrestling with it a bit too much for my liking. But once I figured this out, I started hand-applying its changes and reviewing them as I incorporated them into my code. This workflow was beautiful.

It was as if I sent a one paragraph description of the change I want, and received a text file with code snippets and instructions where to apply them.

I ended up spending four hours with Cursor, giving it more and more sophisticated changes and larger features to implement. This is the first AI tool I tried where I gave it access to my codebase. I picked Cursor because I've heard mixed reviews about others, and my time is valuable. It did not disappoint.

I can imagine it will trip up on a larger codebase. These tools are really young still. I don't know about other AI tools, and am planning on giving them a whirl in the near future.

brandall10|1 year ago

Copilot is terrible. You need to use Cursor or at the very least Continue.dev w/ Claude Sonnet 3.5.

It's a massive gulf of difference.

Kiro|1 year ago

That sounds almost like the complete opposite of my experience and I'm also working in a big Rails app. I wonder how our experiences can be so diametrically different.

koliber|1 year ago

OP here. I am explicitly NOT blindly trusting the output of the AI. I am treating it as a suspicious set of code written by an inexperienced developer. Doing full code review on it.

svara|1 year ago

I don't think this criticism is valid at all.

What you are saying will occasionally happen, but mistakes already happen today.

Standards for quality, client expectations, competition for market share, all those are not going to go down just because there's a new tool that helps in creating software.

New tools bring with them new ways to make errors. It's always been that way and the world hasn't ended yet...