

d_watt | 9 months ago

The way Claude Code is going is exactly what I want out of an agentic coding tool with this "unix toolish" philosophy. I've been using Claude Code since the initial public preview release, and have seen the direction over time.

The "golden" end state of coding agents is that you give it a feature request (e.g., a Jira ticket), and it gives you a PR to review and give feedback on. Cursor, Windsurf, etc. are dead ends in that sense, as they are local editors and cannot run in CI.

If you are tooling your codebase for optimal AI usage (Rules, MCP, etc.), you should target a technology that can bridge the gap to headless usage. The fact that Claude Code can trivially be used as part of automation through the tools means it's now the default way I think about coding agents (Codex, the npm package, is the same).

Disclaimer: I focus on helping companies tool their codebases for optimal agent usage, so I might have a bias here toward easily configurable tools.


jdmoreira|9 months ago

Not sure about that golden end state. Mine would be being in a room surrounded by screens with AI agents coding, designing, testing, etc. I would be there in the center giving guidance, direction, applying taste, etc… All conversational, wouldn’t need to touch the keyboard 99% of the time.

That's what I want and look forward to one day.

Roritharr|9 months ago

Is this a me thing, or a millennial thing?

I hate using voice for anything. I hate getting voice messages, I hate creating them. I get cold sweats just thinking about having to direct 10 AI Agents via voice. Just give me a keyboard and a bunch of screens, thanks.

csto12|9 months ago

If that’s the future, that means a massive reduction in software engineers, no? What you are describing would require one technical product manager, not a team of software engineers.

geertj|9 months ago

I can easily see this happening in 2-3 years. Some chat apps already have outstanding voice mode, such as GPT-4o. It's just a matter of integrating that voice mode, and getting the understanding and generated code to be /slightly/ better than it is today.

rco8786|9 months ago

It seems unlikely that any one individual would be able to output a sufficient amount of context for that to not go off the rails really quickly (or just be extremely inefficient as most agents sit idle waiting for verification of their work)

cortesoft|9 months ago

Basically the Star Trek model of computing.

arguflow|9 months ago

In this "end state" what would the AI mind machine even have to code?

chamomeal|9 months ago

That sounds like torture for me lol

dakiol|9 months ago

No. The "golden" end state of coding agents is free and open source coding agents running on my machine (or in whatever machine I want). Can you imagine paying for every command you run in your terminal? For every `ls`, `ps`, `kill`? No sense, right? Well, same for LLMs.

I'm not saying "ban proprietary LLMs", I'm saying: hackers (the ones that used to read sites like this) should have as their main tools free and open source ones.

dontlikeyoueith|9 months ago

> Can you imagine paying for every command you run in your terminal?

Yes, because hardware and electricity aren't free.

I literally DO pay for every command. I just don't get an itemized bill so there's no transparency about it. Instead, I made some lump-sum hardware payment which is amortized over the total usage I get out of it, plus some marginal increase in my monthly electric bill when I use it.
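To put rough numbers on that amortization argument, here is a minimal sketch. Every figure in it (hardware price, lifetime command count, power draw, electricity price) is a made-up assumption for illustration, not a measurement.

```python
# Hypothetical back-of-the-envelope cost of one terminal command.
# All input figures below are illustrative assumptions.

def cost_per_command(
    hardware_cost: float,      # lump-sum hardware payment, in dollars
    lifetime_commands: int,    # total commands run over the hardware's life
    watts: float,              # marginal power draw while the command runs
    seconds: float,            # how long the command runs
    price_per_kwh: float,      # electricity price, in dollars per kWh
) -> float:
    # Hardware share: the lump sum spread evenly over every command.
    amortized_hardware = hardware_cost / lifetime_commands
    # Energy share: watt-seconds converted to kWh (1 kWh = 3,600,000 W*s).
    energy_kwh = watts * seconds / 3_600_000
    return amortized_hardware + energy_kwh * price_per_kwh


cost = cost_per_command(
    hardware_cost=1500.0,
    lifetime_commands=10_000_000,
    watts=50.0,
    seconds=0.1,
    price_per_kwh=0.30,
)
print(f"~${cost:.6f} per command")
```

With these assumed inputs the hardware share dominates the electricity share by a wide margin, which is why the cost never shows up as an itemized bill.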

notpushkin|9 months ago

I agree with the sentiment, but isn’t Claude Code (the CLI) FOSS already? (Not sure it’s coupled to Claude the model API either, but if it is I imagine it’s not too hard to fix.)

sync|9 months ago

Anthropic also announced something along those lines today as well, in beta: https://docs.anthropic.com/en/docs/claude-code/github-action...

MattSayar|9 months ago

How did you find this? It doesn't pop up on any news sections on their site. I want to be on top of these kinds of things too!

breckenedge|9 months ago

> Cursor, windsurf, etc, are dead ends in that sense as they are local editors, and can not be in CI.

I was doing this with Cursor and MCPs. Got about a full day of this before I was rate limited and dropped to the slowest, dumbest model. I’ve done it with Claude too and quickly exhausted my rate limits. And the PRs are only “good to go” about 25% of the time, and it’s often faster to just do it right than find out where the AI screwed up.

andrewstuart|9 months ago

> The "golden" end state of coding agents is that you give it a Feature Request (EG Jira ticket), and it gives you a PR to review and give feedback on.

I see your point, but on the other hand, how depressing to be left only with the most soul crushing part of software engineering - the Jira ticket.

d_watt|9 months ago

I personally find figuring out what the product should be is the fun part. There's still a need for architecting a plan, but the actual act of writing code isn't what gives me personal joy, it's the building of something new.

I understand the craft of code itself is what some people love though!

btbuildem|9 months ago

Say what you will, but this would have the wonderful side effect of forcing people who write JIRA tickets to actually think through and clearly express what it is they want built.

pjmlp|9 months ago

The moment I am able to outsource work for Jira tickets to a level where AI actually delivers a reasonable pull request, many corporate managers will seriously wonder why they keep the offshoring team around.

ryandrake|9 months ago

It seems like the Holy Grail here has become: "A business is one person, the CEO, sitting at his desk doing deals and directing virtual and physical agents to do accounting, run factories, manage R&D, run marketing campaigns, everything." That's it. A single CEO, (maybe) a lawyer, and a big AI/robotics bill = every business. No pesky employees to pay. That's the ultimate end game here, that's what these guys want. Is that what we want?

dgb23|9 months ago

So far, automation has only ever increased the need for software development. Jevons Paradox plus the recursive nature of software means that there's always more stuff to do.

The real threats to our profession are things like climate change, extreme wealth concentration, political instability, cultural regression and so on. It's the stuff that software stands on that one should worry about, not the stuff that it builds towards.

chrsw|9 months ago

Maybe I’m not thinking big picture enough… but have you ever tried using generative AI (i.e., a transformer) to create a circuit schematic? They fail miserably. Worse than GPT-2 at generating text.

The current SOTA models can do some impressive things, in certain domains. But running a business is way more than generating JavaScript.

The way I see it, only some jobs will be impacted by generative AI in the near term. Not replaced, augmented.

yahoozoo|9 months ago

Why would they pay you six figures to outsource to AI when they could pay offshore a fraction of that to do the same?

StefanBatory|9 months ago

Offshoring team?

No, any team.

k__|9 months ago

Can't you have that already?

Put the Aider CLI into a GitHub action that's triggered by an issue creation and you're good to go.

d_watt|9 months ago

Aider is definitely in the same camp. Last time I checked, they weren't optimizing for the full "agent infinitely looping until completion" use case, and didn't have MCP support.

But it's 100% the same class of tool, and the awesome part of the unixy model is that, hopefully, agents can be substituted for each other in your pipeline depending on which one is better for the use case, just like models are interoperable.
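A minimal sketch of what "substitutable agents in a pipeline" could look like, assuming each agent is wrapped behind the same tiny interface. The `Task` type, the agent names, and the stub functions are all hypothetical; a real wrapper would shell out to whichever headless CLI you use, which is deliberately left out here because flags differ per tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Task:
    ticket_id: str      # e.g. an issue or ticket key (hypothetical format)
    description: str    # the feature request text


# An "agent" is anything that turns a task into a result string (say, a patch).
Agent = Callable[[Task], str]


def stub_agent_a(task: Task) -> str:
    # Placeholder: a real version would invoke a headless coding agent.
    return f"[agent-a] patch for {task.ticket_id}"


def stub_agent_b(task: Task) -> str:
    # Placeholder for a different agent behind the same interface.
    return f"[agent-b] patch for {task.ticket_id}"


# Registry so the pipeline picks an agent by name, e.g. from CI config.
AGENTS: Dict[str, Agent] = {
    "agent-a": stub_agent_a,
    "agent-b": stub_agent_b,
}


def run_pipeline(task: Task, agent_name: str) -> str:
    # The pipeline only depends on the Agent signature, not on any one tool.
    agent = AGENTS[agent_name]
    return agent(task)


if __name__ == "__main__":
    task = Task(ticket_id="PROJ-1", description="Add a health check endpoint")
    for name in AGENTS:
        print(run_pipeline(task, name))
```

The point of the design is that swapping agents becomes a one-line config change rather than a pipeline rewrite.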

alvis|9 months ago

The vision of submitting a feature request and receiving a ready-to-review PR is equally compelling and horrifying from the standpoint of strategy management.

Anthropic, like most big tech companies, doesn't want to show off its best until it needs to. These companies used to stockpile some cool features, which gave them time to think about their strategy. But now I feel like they are in a rush to show off everything, and I'm worried whether the management has time to think about the big picture.

arkadiytehgraet|9 months ago

You should use some of those agents yourself to fix some glaring issues on your landing page.

morsecodist|9 months ago

Setting aside predictions about the future and what is best for humanity and all that for a moment, this is just such a bummer on a personal level. My whole job would become the worst parts of my job.


max_on_hn|9 months ago

(please pardon the self-promotion) This is exactly what my product https://cheepcode.com does (connects to your Linear/Jira/etc and submits PRs to GitHub) - I agree that’s the golden state, and that’s why I’m rushing to get out of private beta as fast as I can* :) It’s a bootstrapped operation right now which limits my speed a bit but this is the vision I’ve been working towards for the past few months.

*I have a few more safety/scalability changes to make but expecting public launch in a few weeks!

virgildotcodes|9 months ago

> The "golden" end state of coding agents is that you give it a Feature Request (EG Jira ticket), and it gives you a PR to review and give feedback on. Cursor, windsurf, etc, are dead ends in that sense as they are local editors, and can not be in CI.

Isn’t that effectively the promise of the most recently released OpenAI Codex?

From the reviews I’ve been able to find so far though, quality of output is ehh.

d_watt|9 months ago

It totally is!

I bias a bit to wanting the agent to be a pluggable component into a flow I own, rather than a platform in a box.

It'll be interesting to see where the different value props/use cases of a Devin/v0 vs a Codex Cloud vs Claude Code/Codex CLI vs Cursor land.

ramesh31|9 months ago

That's the promise. The reality is that it's just a subpar version of Claude Code which doesn't support MCP.

mistrial9|9 months ago

golden age consultant paycheck

naiv|9 months ago

Played around with connecting https://github.com/eyaltoledano/claude-task-master via MCP to create a PRD, which basically replaces the ticket grooming process, and then executing it with Claude Code: creating a branch named after the ticket and pushing after having created the unit tests and linted constantly.