In my own anecdotal experience Claude Code found a bug in production faster than I could. I was the author of the said code, that was written 4 years ago by hand. GPs claim perhaps is not all that unsubstantiated. My role is moving more towards QA/PM nowadays.
verdverm|1 month ago
Do you have fail hards to share along with your wins? Are we going to only share our wins like stonk hussies?
throwaway7783|1 month ago
These incidents have been less and less over the last year - switching it Opus made failure frequencies less. Same thing for code reviews. Most of it is fluff, but it does give useful feedback, if the instructions are good. For example, I asked for a blind code review of a PR ("Review this PR"), and it gave some generic commentary. I made the prompt more specific ("Follow the API changes across modules and see impact") - it found a serious bug.
The number of times I had to give up in frustration has been going down over the last one year. So I tend believe a swarm of agents could do a decent job of autonomous development/maintenance over the next few years.