Yep. Moments of sheer utility mixed with moments of "WTF were you 'thinking', if you can even call it that?'"
I've seen a lot of bad patterns, only some of which might be "trained out" with better training data in the future, and a lot of them revolve around testing:
1) Failure to stick to an existing valid test suite as a source of truth
2) Related: Failing to realize when the test is INvalid or has internally-inconsistent expectations (i.e., when to adjust the test)
3) Failure to run the full test suite right before declaring victory (you'd think it would be embarrassed... but it can't be embarrassed...)
4) Manually testing things instead of adding them to the test suite as test cases (which is almost always warranted)
5) When unable to solve something in a reasonable number of iterations, forcing the code to output what the test expects ("hardcoding the answer") instead of asking for help, then declaring partial victory (this one offended me the most, somehow, to the point that I was fascinated by how offended I was, like something I didn't even realize was sacred got violated)
6) Not sticking with TDD for more than 1 or 2 cycles before resorting to the above (this one is tragic because it would actually cause it to code better IMHO! Just like it would with the programmers who don't use it, creating the data it's training on! sigh)
7) not adhering to emphasized instructions (there's no way to "exclamation-point" certain directives without simply repeating them and/or using all-caps or threats etc... which is silly)
8) Writing a bunch of one-off test scripts/test data, and then not cleaning up after itself or merging those into the main test suite (if warranted)... It has ZERO sense of a "clean project directory" and that means I have to spend additional cycles of MY time either instructing it what to clean up (and hoping for the best) or manually going through everything it's produced and deciding what to keep or what to toss. Often these were artifacts that were valuable during intermediate steps, but are no longer, at least in a "this PR is wrapped up and ready for prod" sense.
In short, it knows the price of everything, but the value of nothing. As Sundar Pichai recently termed it, this is "artificial jagged intelligence (AJI)"
[shameless self-promo: I'm currently looking for interesting work, ping me, contact info in profile.]
I gave it a red hot try, ended up just turning off all the fancy predictive features, tried agentic mode... wasn't a fan, I still use Copilot occasionally to "rubber duck" ideas, and get some pointers on bugs...
I think we need to start being more nuanced in the way we describe "AI Coding tools".
In the same way Claude Code is a different beast to Cursor, my own process is a different beast to Claude Code and the months I've spent building out a robust pipeline is now paying dividends.
I also think someone at The Register needs to go on a statistics course. Those figures seem to paint the picture that an overwhelming majority of those surveyed have had positive outcomes, which I don't think is represented by the slightly snarky headline.
I like these tools and use them on a daily basis. That being said, the claimed benefits to productivity are way overblown. I find some folks wanting to cram them into every step of the dev process like they are some panacea.
They are a great boost but I think folks need to fit them in where they help naturally rather than cramming them into every nook and cranny.
pmarreck|8 months ago
I've seen a lot of bad patterns, only some of which might be "trained out" with better training data in the future, and a lot of them revolve around testing:
1) Failure to stick to an existing valid test suite as a source of truth
2) Related: Failing to realize when the test is INvalid or has internally-inconsistent expectations (i.e., when to adjust the test)
3) Failure to run the full test suite right before declaring victory (you'd think it would be embarrassed... but it can't be embarrassed...)
4) Manually testing things instead of adding them to the test suite as test cases (which is almost always warranted)
5) When unable to solve something in a reasonable number of iterations, forcing the code to output what the test expects ("hardcoding the answer") instead of asking for help, then declaring partial victory (this one offended me the most, somehow, to the point that I was fascinated by how offended I was, like something I didn't even realize was sacred got violated)
6) Not sticking with TDD for more than 1 or 2 cycles before resorting to the above (this one is tragic because it would actually cause it to code better IMHO! Just like it would with the programmers who don't use it, creating the data it's training on! sigh)
7) not adhering to emphasized instructions (there's no way to "exclamation-point" certain directives without simply repeating them and/or using all-caps or threats etc... which is silly)
8) Writing a bunch of one-off test scripts/test data, and then not cleaning up after itself or merging those into the main test suite (if warranted)... It has ZERO sense of a "clean project directory" and that means I have to spend additional cycles of MY time either instructing it what to clean up (and hoping for the best) or manually going through everything it's produced and deciding what to keep or what to toss. Often these were artifacts that were valuable during intermediate steps, but are no longer, at least in a "this PR is wrapped up and ready for prod" sense.
In short, it knows the price of everything, but the value of nothing. As Sundar Pichai recently termed it, this is "artificial jagged intelligence (AJI)"
[shameless self-promo: I'm currently looking for interesting work, ping me, contact info in profile.]
bl4ck1e|8 months ago
I don't know, I think I'm missing something.
oytis|8 months ago
matt3D|8 months ago
In the same way Claude Code is a different beast to Cursor, my own process is a different beast to Claude Code and the months I've spent building out a robust pipeline is now paying dividends.
I also think someone at The Register needs to go on a statistics course. Those figures seem to paint the picture that an overwhelming majority of those surveyed have had positive outcomes, which I don't think is represented by the slightly snarky headline.
JohnFen|8 months ago
That said, their headline does say that devs find the tools helpful, so I don't think they're misrepresenting anything.
tyleo|8 months ago
They are a great boost but I think folks need to fit them in where they help naturally rather than cramming them into every nook and cranny.
unknown|8 months ago
[deleted]
unknown|8 months ago
[deleted]