mdeeks | 4 months ago

I feel like these background agents still aren't doing what I want from a developer experience perspective. Running in an inaccessible environment that pushes random things to branches that I then have to check out locally doesn't feel great.

AI coding should be tightly in the inner dev loop! PRs are a bad way to review and iterate on code. They are a last line of defense, not the primary way to develop.

Give me an isolated environment that is one click hooked up to Cursor/VSCode Remote SSH. It should be the default. I can't think of a single time that Claude or any other AI tool nailed the request on the first try (other than trivial things). I always need to touch it up or at least navigate around and validate it in my IDE.

ewoodrich|4 months ago

Right, that is closer to what I was hoping this announcement would be. I really just want a (mobile/web) companion to whatever CLI environment I have Claude Code running in. That would perfectly fill in the exact niche missing in my local dev server VM setup I remote into with any combination of SSH, VS Code Remote, or via Web (VS Code Tunnel from vscode.dev and a ttyd remote CLI session in the browser).

It would be great to be able to check in on Claude on a walk or something to make sure it hasn't gone off the rails, or send it a quick "LGTM" to keep moving down a large PLAN.md file without being tethered to a keyboard and monitor. I can SSH from my phone but the CLI ergonomics are ... not great with an on-screen keyboard, when all it really needs is a simple threaded chat UI.

I've seen a couple of GitHub projects, and "Happy Coder" on a Show HN, that seem in the ballpark of what I want, though I haven't gotten around to setting them up yet. A first-party integration would always be cool.

Yeroc|4 months ago

I tried Happy Coder for a bit. It seemed to be exactly what I was missing, but about half the time session notifications weren't coming through, and the developers of the tool seem busy pushing it off in other directions rather than making the core functionality bullet-proof, so I gave up on it. Unfortunate. Hopefully something else pops up or Anthropic bakes it into their own tooling.

luisml77|4 months ago

I agree, and I also think the problem is deeper than that. It's about not being able to do most code testing and debugging remotely. You can't really test anything remotely... It's in an ephemeral container without any of your data, just your repo. You can't have the model do npm run dev and browse to see the webpage, click around, etc. You can't compile or run anything heavy, you can't persist data across sessions/days, etc.

I like the idea of background agents running in the cloud, but it has to be a more persistent environment. It also has to run with a GUI so it can develop web applications or run the programs we are developing, and run them properly with the GUI, including clicking around, typing things, etc. Computer use is what we need. But that would probably be too expensive to serve to the masses with the current models.

daxfohl|4 months ago

Definitely sounds cool. But the problem hasn't even been solved locally yet. Distributed microservices, 3rd party dependencies, async callbacks, reasonable test data, unsatisfiable validations, etc. Every company has their own hacked together local testing thing that mostly doesn't work.

That said, maybe this is the turning point where these companies work toward solving it in earnest, since it's a key differentiator of their larger PLATFORM and not just a cost. Heck, if they get something like that working well, I'd pay for it even without the AI!

Edit: that could end up being really slick too if it was able to learn from your teammates and offer guidance. Like when you're checking some e2e UI flows but you need a test item that has some specific detail, it maybe saw how your teammate changed the value or which item they used or created, and can copy it for you. "Hey it looks like you're trying to test this flow. Here's how Chen did it. Want me to guide you through that?" They can't really do that with just CLI, so the web interface could really be a game changer if they take full advantage of it.

mdeeks|4 months ago

What you're describing feels like the next major evolution and is likely years away (and exciting!).

I'm mainly aiming for a good experience with what we have today. Welding an AI agent onto my IDE turned out to be great. The next incremental step feels like being able to parallelize that. I want four concurrent IDEs with AI welded onto them.

elpakal|4 months ago

> PRs are a bad way to review and iterate on code

idk, we’ve (humans) gotten this far with them. I don’t think they are the right tool for AI-generated code and coding agents though, and I think these circles are being forced to fit into those squares. imho it’s time for an AI-native git or something.

mdeeks|4 months ago

PRs work well for what they are. Ship off some changes you're strongly confident about and have another human who has a lot of context read through it and double check you. It's for when you think you've finished your inner loop.

AI is more akin to pair programming with another person sitting next to you. I don't want to ship a PR or even a branch off to someone sitting next to me. I want to discuss and type together in real time.

sails|4 months ago

Agree, each agent creating a PR and then coordinating merges is a pain.

I’d like

- agent to consolidate simple non-conflicting PRs

- faster previews and CI tests (Render currently)

- detect and suggest solutions for merge conflicts

Codex web doesn’t update the PR, which is also something to change, maybe via a setting. For web Code agents (?) I’d like the PR, once opened, to stay open.

Also, PRs need an overhaul in general. I create lots of speculative agents and merge if I like the solution, which leads to lots of PRs.

archon810|4 months ago

Thank you. Every time these agentic cloud tools come out, I wonder to myself whether I'm using them wrong or misunderstanding them compared with, say, the local Cursor development paradigm.

Plus they generate so much noise with all the extra commits and comments that go to everyone in Slack and email rather than just me.

icelancer|4 months ago

I just run the agent directly on separate testing/dev servers via remote-ssh in VS Code to have an IDE to sanity check stuff. Just far simpler than local dev and other nonsense.

cyrusradfar|4 months ago

This is a great point. The inner / outer loop distinction is big. I think AI pushing PRs is kind of like pushing drafts to the public on social media. I don't want folks seeing PRs and such until I feel good about them. It adds a lot of noise, and increases build costs unless your CI/CD treats them differently, which I don't know of anyone doing.

justinram11|4 months ago

Have you checked out Ona [1] (Gitpod's pivot)?

[1] https://ona.com/

mdeeks|4 months ago

This is possibly what I want? It's hard to tell from all of the marketing on the site.

I want to run a prompt that operates in an isolated environment that is open in my IDE where I can iterate with the AI. I think maybe it can do this?

pdntspa|4 months ago

Yes but pointy-haired bosses are much more amenable to the sales pitch of, "insert story, receive PR"

asdev|4 months ago

so the biggest issue is having to pull down and manually edit changes? can't you just @claude on the PR to make any changes?

mdeeks|4 months ago

Yes, but my point is that oftentimes I don't want to. Sometimes there are changes I can make in seconds. I don't want to wait 15+ seconds for an AI that might do it wrong or do too much.

Also it isn't always about editing. It is about seeing the surrounding code, navigating around, and ensuring the AI did the right thing in all of the right places.

TechDebtDevin|4 months ago

Huge waste of time. You are being sold a bill of goods whose only purpose is to make you a dumb dev. Like woah, an LLM can use CDP!! Who cares. Can't wait till people start waking up to this grift. These things are making people so dumb and a few richer, that's it.

cindyllm|4 months ago

[deleted]

kanjun|4 months ago

Hey, Kanjun from Imbue here! This is exactly why we built Sculptor (https://imbue.com/sculptor), a desktop UI for Claude Code.

Each agent has its own isolated container. With Pairing Mode, you can sync the agent's code and git state directly into your local Cursor/any IDE so you can instantly validate its work. The sync is bidirectional, so your local changes flow back to the agent in real time.

Happy to answer any questions - I think you'll really like the tight feedback loop :)