not_that_d | 6 months ago
1. It helps get me going with new languages, frameworks, utilities, or full greenfield stuff. After that, I spend a lot of time parsing the code to understand what it wrote, to the point that I kind of just "trust" it because reviewing is too tedious, but "it works".
2. When working with languages or frameworks that I know, I find it makes me unproductive. The amount of time I spend writing a good enough prompt with the correct context is about the same as, or more than, if I wrote the thing myself. And to be honest, the solution it gives me works for the specific case but looks like junior code, with pitfalls that aren't obvious unless you have the experience to know them.
I've used it with TypeScript, Kotlin, Java, and C++, for different scenarios: websites, ESPHome components (ESP32), backend APIs, Node scripts, etc.
Bottom line: useful for hobby projects, scripts, and prototypes, but for enterprise-level code it's not there.
brulard|6 months ago
Most recently, I first ask CC to create a design document for what we're going to do. It has instructions to look into the relevant parts of the code and docs and reference them. I review it, and after a few back-and-forths we've defined what we want to do. The next step is to chunk it into stages, and those into smaller steps. All this may take a few hours, but once it's well defined, I clear the context. I then let it read the docs and implement one stage. This mostly goes well, and when it doesn't, I either try to steer it to a correction or, if it's too far off, improve the docs and start the stage over. After a stage is complete, we commit, clear the context, and proceed to the next stage.
This way I spend maybe a day creating a feature that would otherwise take me two or three. And at the end we have a document, unit tests, Storybook pages, and things that usually get overlooked, like accessibility and ARIA attributes.
At the very end I like another model to make a code review.
Even if this didn't make me faster now, I would consider it future-proofing myself as a software engineer, as these tools are improving quickly.
imiric|6 months ago
Yet even following it to a T, and being really careful with how you manage context, the LLM will still hallucinate, generate non-working code, steer you into wrong directions and dead ends, and just waste your time in most scenarios. There's no magical workflow or workaround for avoiding this. These issues are inherent to the technology, and have been since its inception. The tools have certainly gotten more capable, and the ecosystem has matured greatly in the last couple of years, but these issues remain unsolved. The idea that people who experience them are not using the tools correctly is insulting.
I'm not saying that the current generation of this tech isn't useful. I've found it very useful for the same scenarios GP mentioned. But the above issues prevent me from relying on it for anything more sophisticated than that.
aatd86|6 months ago
Also the longer the conversation goes, the less effective it gets. (saturated context window?)
john-tells-all|6 months ago
https://martinfowler.com/articles/2023-chatgpt-xu-hao.html
ramshanker|6 months ago
viccis|6 months ago
For legacy systems, especially ones in which a lot of the things they do are because of requirements from external services (whether that's tech debt or just normal growing complexity in a large connected system), it's less useful.
And for tooling that moves fast and breaks things (looking at you, Databricks), it's basically worthless. People have already pointed out that it will only be as current as its training data, so if a bunch of terminology, features, and syntax have changed since then (ahem, Databricks), you have to do some kind of prompt engineering with up-to-date docs for it to have any hope of succeeding.
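One hedged sketch of that kind of prompt engineering, assuming you keep current docs as local files you can prepend to the task (the function and file names here are hypothetical, not any particular tool's API):

```python
# Hypothetical sketch: prepend up-to-date documentation to a prompt so
# the model isn't limited to its training cutoff. Nothing here is a
# real vendor API; it only builds the prompt string.

def build_prompt(task: str, doc_paths: list[str]) -> str:
    """Concatenate current documentation excerpts ahead of the task."""
    sections = []
    for path in doc_paths:
        with open(path, encoding="utf-8") as f:
            sections.append(f"## {path}\n{f.read()}")
    docs = "\n\n".join(sections)
    return (
        "Use ONLY the documentation below; it supersedes your training data.\n\n"
        f"{docs}\n\n### Task\n{task}"
    )
```

The point is only that the model sees the current syntax before the task, instead of reaching for whatever it memorized at training time.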
pvorb|6 months ago
jeremywho|6 months ago
I give Claude the full path to a couple of relevant files related to the task at hand, i.e., where the new code should hook in or where the current problem is.
Then I ask it to solve the task.
Claude will read the files, determine what should be done and it will edit/add relevant files. There's typically a couple of build errors I will paste back in and have it correct.
Current code patterns & style will be maintained in the new code. It's been quite impressive.
This has been with Typescript and C#.
I don't agree that what it has produced for me is hobby-grade only...
taberiand|6 months ago
This way helps ensure it works on manageable amounts of code at a time and doesn't overload its context, but also keeps the bigger picture and goal in sight.
hamandcheese|6 months ago
JyB|6 months ago
Next, you use Claude Code instead and have several instances work on their own clones, in their own workspaces and branches, in the background, so you can still iterate on some other topic in your personal clone.
Then you check its tab from time to time, and optionally check out its branch if you'd rather make some updates yourself. It's so ingrained in my day-to-day flow now; it's been super impressive.
nwatson|6 months ago
alfalfasprout|6 months ago
As a result, their productivity might go up on simple "ticket like tasks" where it's basically just simple implementation (find the file(s) to edit, modify it, test it) but when they start using it for all their tasks suddenly they don't know how anything works. Or worse, they let the LLM dictate and bad decisions are made.
These same people are also very dogmatic on the use of these tools. They refuse to just code when needed.
Don't get me wrong, this stuff has value. But I just hate seeing how it's made many engineers complacent and accelerated their ability to add to tech debt like never before.
pqs|6 months ago
chamomeal|6 months ago
zingar|6 months ago
dekhn|6 months ago
In the few weeks since I've started using Gemini/ChatGPT/Claude, I've
1. had it read my undergrad thesis and the paper it's based on, implementing correct pytorch code for featurization and training, along with some aspects of the original paper that I didn't include in my thesis. I had been putting this task off until retirement.
2. had it write a bunch of different scripts for automating tasks (typically scripting a few cloud APIs) which I then ran, cleaning up a long backlog of activities I had been putting off.
3. had it write a yahtzee game and implement a decent "pick a good move" feature. It took a few tries, but then it output a fully functional PyQt5 desktop app that played the game. It beat my all-time top score within the first few plays.
4. tried to convert the yahtzee game to an android app so my son and I could play. This has continually failed on every chat agent I've tried- typically getting stuck with gradle or the android SDK. This matches my own personal experience with android.
5. had it write python and web-based g-code senders that allowed me to replace some tools I didn't like (UGS). Adding real-time vis of the toolpath and objects wasn't that hard either. Took about 10 minutes and it cleaned up a number of issues I saw with my own previous implementations (multithreading). It was stunning how quickly it can create fully capable web applications using javascript and external libraries.
6. had it implement a gcode toolpath generator for basic operations. At first I asked it to write Rust code, which turned out to be an issue (mainly because the opencascade bindings are incomplete), it generated mostly functional code but left it to me to implement the core algorithm. I asked it to switch to C++ and it spit out the correct code the first time. I spent more time getting cmake working on my system than I did writing the prompt and waiting for the code.
7. had it write a script to extract subtitles from a movie, translate them into my language, and re-mux them back into the video. I was able to watch the movie less than an hour after having the idea, and most of that time was just customizing my prompt to get several refinements.
8. had it write a fully functional chemistry structure variational autoencoder that trains faster and more accurate than any I previously implemented.
9. various other scientific/imaging/photography-related code, like implementing multi-camera rectification so I can view obscured objects head-on from two angled cameras.
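Item 7 is a nice concrete case. A minimal sketch of the extract/re-mux steps with ffmpeg might look like this (the stream indices, codec choices, and stubbed translation step are all assumptions, not the poster's actual script):

```python
# Rough sketch of "extract subtitles, translate, re-mux": build ffmpeg
# command lines and leave translation as a stub. Assumes the first
# subtitle track (0:s:0) and an mp4 output that wants mov_text subs.
import subprocess

def extract_cmd(video: str, srt_out: str) -> list[str]:
    # Pull the first subtitle stream out as an .srt file
    return ["ffmpeg", "-y", "-i", video, "-map", "0:s:0", srt_out]

def translate_srt(text: str) -> str:
    # Stub: in practice this would call a translation model or API
    return text

def remux_cmd(video: str, srt_in: str, out: str) -> list[str]:
    # Copy audio/video untouched and add the translated subs as a new track
    return ["ffmpeg", "-y", "-i", video, "-i", srt_in,
            "-map", "0", "-map", "1",
            "-c", "copy", "-c:s", "mov_text", out]

def run(cmd: list[str]) -> None:
    subprocess.run(cmd, check=True)
```

A real version would need to handle container differences (mkv keeps SRT as-is, mp4 wants mov_text) and movies with multiple subtitle tracks.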
With a few caveats (Android projects, Rust-based toolpath generation), I have been absolutely blown away with how effective the tools are (especially used in a agent which has terminal and file read/write capabilities). It's like having a mini-renaissance in my garage, unblocking things that would have taken me a while, or been so frustrating I'd give up.
I've also found that AI summaries in Google search are often good enough that I don't click through to pages (Wikipedia, papers, tutorials, etc.). The more experience I get, the more limitations I see, but many of those limitations are simply due to the extraordinary level of unnecessary complexity required to do nearly anything on a modern computer (see my comments above about Android apps & Gradle).
MangoCoffee|6 months ago
I use GitHub Copilot. I recently did a vibe-coded hobby project: a command-line tool that displays my computer's IP, hard drives, disk space, CPU, etc. GPT-4.1 did the coding and Claude did the bug fixing.
The code it wrote worked, and I even had it create a PowerShell script to build the project for release.
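For reference, the core of that kind of system-info tool is only a few standard-library calls. A rough Python sketch (an illustration of the idea, not the commenter's Copilot-generated code) could be:

```python
# Minimal stdlib sketch of a system-info CLI: hostname, CPU, core count,
# and disk usage. No third-party dependencies.
import os
import platform
import shutil
import socket

def system_report(path: str = "/") -> dict:
    """Collect basic host and disk stats for the filesystem holding `path`."""
    total, used, free = shutil.disk_usage(path)
    return {
        "hostname": socket.gethostname(),
        "cpu": platform.processor() or platform.machine(),
        "cores": os.cpu_count(),
        "disk_total_gb": round(total / 1e9, 1),
        "disk_free_gb": round(free / 1e9, 1),
    }

if __name__ == "__main__":
    for key, value in system_report().items():
        print(f"{key}: {value}")
```

Getting the machine's IP portably is fiddlier (multiple interfaces, VPNs), which is exactly the kind of edge case a quick vibe-coded tool tends to gloss over.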
apimade|6 months ago
For developers deeply familiar with a codebase they've worked on for years, LLMs can be a game-changer. But in most other cases, they're best for brainstorming, creating small tests, or prototyping. When mid-level or junior developers lean heavily on them, the output may look useful, until a third-party review reveals security flaws, performance issues, and built-in legacy debt.
That might be fine for quick fixes or internal tooling, but it’s a poor fit for enterprise.
bityard|6 months ago
typpilol|6 months ago
My ESLint config is a mess, but the code it writes comes out pretty good. It takes a few iterations of rewrites after lint errors pop up, but the code it ends up with is way better.
Aeolun|6 months ago
hoppp|6 months ago
Using it with Rust is just horrible, IMHO: lots and lots of errors. I can't wait to be done with this Rust project, but the project itself is quite complex.
Go, on the other hand, is super productive, mainly because the language is already very simple. I can move 2x as fast.
TypeScript is fine; I use it for React components and it will do animations I'm too lazy to do...
SQL and PostgreSQL are fine. I can do it without it too; I just don't like writing stored functions because of the boilerplatey syntax. A little speedup saves me from carpal tunnel.
jiggawatts|6 months ago
epolanski|6 months ago
- step A: I ask the AI to write a featureA-requirements.md file at the root of the project. I give it a general description of the task, then have it ask me as many questions as possible to refine user stories and requirements. It generally comes up with a dozen or more questions, several of which I wouldn't have thought of and would only have discovered much later. Time: between 5 and 40 minutes. It's very detailed.
- step B: after we refine the requirements (functional and non-functional), we write a todo plan together as featureA-todo.md. I refine the plan again; this is generally shorter than the requirements, and I'm usually done in less than 10 minutes.
- step C: implementation phase. Again the AI does most of the job; I correct it at each edit and point out flaws. Are there cases where I would've done it faster myself? Maybe. I can still jump into the editor and make the changes I want. This step generally includes comprehensive tests for all the requirements and edge cases we found in step A: functional, integration, and E2E. The time really varies and is highly tied to the quality of phases A and B. It can be as little as a few minutes (especially when we've come up with an effective plan) or as much as a few hours.
- step D: documentation and PR description. With all of this context (in the requirements and todos), updating any relevant documentation and writing the PR description is a very quick task.
On top of all of that, I have text files with precise coding-style guidelines, comprehensive READMEs for context, etc., that get referenced in the context.
Bottom line: you might be doing something profoundly wrong, because in my case, all of this planning, requirements gathering, testing, documenting etc is pushing me to deliver a much higher quality engineering work.
mcintyre1994|6 months ago
drums8787|6 months ago
Once the code is written, review, test and done. And on to more fun things.
Maybe what has made it work is that these tasks have all fit comfortably within existing code patterns.
My next step is to break down bigger and more complex changes into Claude-friendly bites to save me more grunt work.
unlikelytomato|6 months ago
On the other hand, it does cost me about 8 hours a week debugging issues created by bad autocompletes from my team. The last 6 months have gotten really bad with that. But that is a different issue.
flowerthoughts|6 months ago
If LLMs maintain the code, the API boundary definitions/documentation and orchestration, it might be manageable.
urbandw311er|6 months ago
Obviously there’s still other reasons to create micro services if you wish, but this does not need to be another reason.
fsloth|6 months ago
arwhatever|6 months ago
You could then put all services in 1 repo, or point LLM at X number of folders containing source for all X services, but then it doesn’t seem like you’ll have gained anything, and at the cost of added network calls and more infra management.
stpedgwdgfhgdd|6 months ago
The prompt needs to be good, but in plan mode it will iteratively figure it out.
You need to have automated tests. For enterprise software development that actually goes without saying.
dclowd9901|6 months ago
mnky9800n|6 months ago
https://open.substack.com/pub/mnky9800n/p/coding-agents-prov...
johnisgood|6 months ago
It is good for me in Go but I had to tell it what to write and how.
sdesol|6 months ago
It is also incredibly important to note that the 5% that I needed to figure out was the difference between throw away code and something useful. You absolutely need domain knowledge but LLMs are more than enterprise ready in my opinion.
Here is some documentation on how my search solution is used in my app to show that it is not a hobby feature.
https://github.com/gitsense/chat/blob/main/packages/chat/wid...
tonyhart7|6 months ago
When you're stuck with Claude doing dumb stuff, you didn't give the model enough context to know the system better.
After adopting spec-driven development, working with an LLM in a large code base is so much easier than without it; it's the difference between heaven and hell.
But it also increases token costs exponentially, so there's that.
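The cost growth is easy to see with a back-of-envelope calculation: if every turn resends the full conversation history, cumulative input tokens grow quadratically in the number of turns (polynomial rather than strictly exponential, but painful either way). The per-turn token count below is an illustrative assumption:

```python
# Back-of-envelope: resending the full history each turn makes
# cumulative input tokens grow quadratically with conversation length.
# tokens_per_turn is an illustrative guess, not a measured figure.
def cumulative_input_tokens(turns: int, tokens_per_turn: int = 2_000) -> int:
    # Turn k resends all k-1 previous messages plus the new one,
    # i.e. roughly k * tokens_per_turn of input.
    return sum(k * tokens_per_turn for k in range(1, turns + 1))

# 10 turns -> 110_000 input tokens; 40 turns -> 1_640_000:
# roughly 15x the cost for 4x the turns.
```

This is one reason clearing context between stages, as described upthread, helps with cost as well as quality.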
fpauser|6 months ago
j45|6 months ago
amelius|6 months ago
MarcelOlsz|6 months ago
risyachka|6 months ago
I usually go with option 2 and just write it myself, as it's the same time-wise but keeps my skills sharp.
fpauser|6 months ago
therealpygon|6 months ago
I get why, it’s a test of just how intuitive the model can be at planning and execution which drives innovation more than 1% differences in benchmarks ever will. I encourage that innovation in the hobby arena or when dogfooding your AI engineer. But as a replacement developer in an enterprise where an uncaught mistake could cost millions? No way. I wouldn’t even want to be the manager of the AI engineering team, when they come looking for the only real person to blame for the mistake not being caught.
For additional checks and tasks as a completely extra set of eyes, for building internal tools, and for scripts? Sure. It's incredibly useful for all sorts of non-application-development tasks. I've not written a batch or bash script in forever; you just don't really need to anymore. The linear flow of most batch/bash scripts (like you mentioned) couldn't be a more suitable domain.
Also, with a basic prompt, it can be an incredibly useful rubber duck. For example, I’ll say something like “how do you think I should solve x problem”(with tools for the codebase and such, of course), and then over time having rejected and been adversarial to every suggestion, I end up working through the problem and have a more concrete mental design. Think “over-eager junior know-it-all that tries to be right constantly” without the person attached and you get a better idea of what kind of LLM output you can expect including following false leads to test your ideas. For me it’s less about wanting a plan from the LLM, and more about talking through the problems I think my plan could solve better, when more things are considered outside the LLMs direct knowledge or access.
“We can’t do that, changing X would break Y external process because Z. Summarize that concern into a paragraph to be added to the knowledge base. Then, what other options would you suggest?”