
allanmacgregor | 17 days ago

I don't disagree with you entirely here. I probably wasn't clear enough on what I was trying to convey.

Right now AI / agentic coding seems to be a train we aren't going to be able to stop; and at the end of the day it's a tool like any other. Most of what seems to be happening is people letting AI fully take the wheel: not enough specs, not enough testing, not enough direction.

I keep experimenting and tweaking how much direction to give AI in order to produce less fuckery and more productive code.

_dwt | 17 days ago

Sorry for coming off combative - I'm mostly fatigued from the "criti-hype" pieces we've been deluged with over the last week. For what it's worth, I think you're right about the inevitability, but I also think it's worth pushing back a bit against the pre-emptive shaping of the Overton window. I appreciate the comment.

I don't know how to encourage the kind of review that AI code generation seems to require. Historically we've been able to rely on the fact that (bluntly) programming is "g-loaded": smart programmers probably wrote better code, with clearer comments, formatted better, and documented better. Now, results that look great are a prompt away in each category, which breaks some subconscious indicators reviewers pick up on.

I also think that there is probably a sweet spot for automation that does one or two simple things and fails noisily outside the confidence zone (aviation metaphor: an autopilot that holds heading and barometric altitude and beeps loudly and shakes the stick when it can't maintain those conditions), and a sweet spot for "perfect" automation (aviation metaphor: uh, a drone that autonomously flies from point A to point B using GPS, radar, LIDAR, etc...?). In between I'm afraid there be dragons.
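The "does one or two simple things and fails noisily" half of that spectrum can be sketched as a toy. This is purely an illustration of the design principle, not real avionics; the class name, threshold, and units are all hypothetical:

```python
class HeadingHoldAutopilot:
    """Toy sketch of narrow automation with a loud failure mode:
    it holds a single setpoint and refuses to act silently outside
    its confidence zone (all names and limits are hypothetical)."""

    def __init__(self, target_heading, max_correction=5.0):
        self.target = target_heading
        # Maximum heading error (degrees) the automation trusts itself to fix.
        self.max_correction = max_correction

    def correct(self, current_heading):
        error = self.target - current_heading
        if abs(error) > self.max_correction:
            # Outside the confidence zone: "beep loudly and shake the stick"
            # rather than quietly doing something plausible-looking.
            raise RuntimeError(
                f"cannot hold heading: error {error:.1f} deg exceeds authority; "
                "handing control back to the operator"
            )
        return error  # correction to apply this tick
```

The point of the sketch is the `raise`: the in-between danger zone is automation that, when it can't do the job, produces output that looks like success instead of an unmistakable handoff.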

allanmacgregor | 15 days ago

@_dwt don't worry, you didn't. I appreciate good discussion and criticism. The publication is new and I'm still trying to calibrate my voice and style for it.

>I don't know how to encourage the kind of review that AI code generation seems to require. Historically we've been able to rely on the fact that (bluntly) programming is "g-loaded": smart programmers probably wrote better code, with clearer comments, formatted better, and documented better. Now, results that look great are a prompt away in each category, which breaks some subconscious indicators reviewers pick up on.

I don't think anyone knows for sure; we're all in the same boat trying to figure out how best to work with AI, and the pace of change is making it incredibly difficult to keep up or try things. I'm trying a bunch of stuff at the same time:

- https://structpr.dev/ - to try to rethink how we approach PR reading and review organization (dog-fooding it right now, so it's mostly alpha)

- I have an article scheduled for next week talking about StrongDM's Software Factory; there are some interesting ideas there, like test holdouts

- Some experiments in the Elixir stack for code generation and verification that go beyond "it looks great". AI can definitely create code that _looks_ great, but there is plenty of research showing that a lot of AI-generated code and tests can carry a high degree of false confidence.
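The "test holdout" idea mentioned above can be sketched roughly like this (a hedged toy, not StrongDM's actual setup: the function names, the 30% split, and the `run_test` callback are all assumptions for illustration):

```python
import random

def split_holdout(test_names, holdout_frac=0.3, seed=42):
    """Reserve a fraction of the test suite that the code-generating
    agent never sees. Passing those tests is then evidence the code
    actually works, not just that the visible tests were gamed.
    (holdout_frac and the splitting scheme are hypothetical choices.)"""
    rng = random.Random(seed)
    shuffled = sorted(test_names)  # sort first so the split is reproducible
    rng.shuffle(shuffled)
    cut = max(1, int(len(shuffled) * holdout_frac))
    holdout, visible = shuffled[:cut], shuffled[cut:]
    return visible, holdout

def verify(run_test, visible, holdout):
    """Accept generated code only if it passes BOTH the tests the agent
    optimized against and the held-out tests it never saw."""
    return all(run_test(t) for t in visible) and all(run_test(t) for t in holdout)
```

The design choice this encodes: a reviewer can't trust "all tests green" when the agent could see (or write) every test, so the holdout plays the same role a held-out dataset plays in model evaluation.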