So, you can assign github issues to this thing, and it can handle them, merge the results in, and mark the bug as fixed?
I kind of wonder what would happen if you added a "lead dev" AI that wrote up bugs, assigned them out, and "reviewed" the work. Then you'd add a "boss" AI that made new feature demands of the lead dev AI. Maybe the boss AI could run the program and inspect the experience in some way so it could demand more specific changes. I wonder what would happen if you just let that run for a while. Presumably it'd devolve into some sort of crazed noise, but it'd be interesting to watch. You could package the whole thing up as a startup simulator, and you could watch it like a little ant farm to see how their little note-taking app was coming along.
It's actually a decent pattern for agents. I wrote a pricing system with an analyst agent, a decision agent, and a review agent. They work together to make decisions that comply with policy. It's funny to watch them chatter sometimes; they really play their roles. If the decision agent asks the analyst for policy guidance, it refuses and explains that its role is to analyze. They do often catch mistakes that way, and the role-playing gets good results.
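A minimal sketch of that analyst/decision/review split. Every name, number, and the "policy" here is invented for illustration; the point is only that each role answers requests within its mandate, which is where the role-refusal behavior comes from.

```python
# Three-role pipeline: analyze, decide, review. Each function only
# does its own job; compliance is checked by a separate reviewer.
from dataclasses import dataclass

@dataclass
class Proposal:
    price: float
    rationale: str

def analyst(data: dict) -> dict:
    # Analysis only: summarizes inputs, never decides and never cites policy.
    competitors = data["competitors"]
    return {"avg_competitor": sum(competitors) / len(competitors)}

def decision(analysis: dict, floor: float) -> Proposal:
    # Decides a price; compliance is verified downstream, not here.
    price = max(analysis["avg_competitor"] * 0.98, floor)
    return Proposal(price, "undercut the average, respect the floor")

def review(p: Proposal, floor: float) -> bool:
    # Independent policy check: rejects rather than silently fixing.
    return p.price >= floor

data = {"competitors": [10.0, 12.0, 11.0]}
proposal = decision(analyst(data), floor=9.5)
assert review(proposal, floor=9.5)
```

Splitting the decision from its review means a bad price gets rejected by a second pair of (artificial) eyes instead of being rubber-stamped by the agent that produced it.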
I think it would take quite a long while to achieve human-level anti-entropy in agentic systems.
Complex systems require tons of iterations, and the confidence level of each iteration drops unless there is a good recalibration system between iterations. Repeated trivial degradations compound and quickly turn into chaos.
A typical collaboration across a group of people on a meaningfully complex project requires tons of anti-entropy to course-correct when it goes off the rails. Those corrections aren't in docs: some are experience (been there, done that), some are common sense, some are collective intelligence.
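A toy calculation of that compounding (the 99%-per-iteration figure is invented purely to show the shape of the decay):

```python
# If each hand-off preserves, say, 99% of the original intent,
# the surviving fraction after n iterations is 0.99**n.
def remaining_fidelity(per_step: float, steps: int) -> float:
    return per_step ** steps

for steps in (10, 100, 500):
    print(steps, round(remaining_fidelity(0.99, steps), 3))
# 10 iterations keep ~90%, 100 keep ~37%, 500 keep under 1%.
```

A 1% loss per step sounds trivial, yet without recalibration between iterations the system drifts toward noise well within a few hundred hand-offs.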
I was interested. Clicked the try button and it's just another waitlist. When will Google learn that the method that worked so well with Gmail doesn't work anymore? There are so many shiny toys to play with now; I will have forgotten about this by tomorrow.
And if you don't sign up quickly after your turn in the queue comes up, you might miss the service altogether, because Google will have shut it down already.
The method absolutely does work, but you need loyal advocates who are praising your product to their friends, or preferably users who are already knocking on your door.
I assume they weren't intending to release it today and didn't have it ready, but didn't want people thinking they were just following in GitHub's footsteps.
I decided to become an engineer rather than a manager because I didn't like people management. Now it looks like I'm forced to manage robots that talk like people. At least I can be as non-empathetic as I want to be. Unless a startup starts doing HR for AI agents; then I'm screwed.
Google’s ability to offer inference for free is a massive competitive advantage vs everyone else:
> Is Jules free of charge?
> Yes, for now, Jules is free of charge. Jules is in beta and available without payment while we learn from usage. In the future, we expect to introduce pricing, but our focus right now is improving the developer experience.
> Google’s ability to offer inference for free is a massive competitive advantage vs everyone else:
Haven't tried Jules myself yet, still playing around with Codex, but personally I don't really care if it's free or not. If it solves my problems better than the others, then I'll use it, otherwise I'll use other things.
I'm sure I'm not alone in focusing on how well it works, rather than what it costs (until a certain point).
> No. Jules does not train on private repository content. Privacy is a core principle for Jules, and we do not use your private repositories to train models. Learn more about how your data is used to improve Jules.
It's hard to tell what the data collection will be, but it's most likely similar to Gemini where your conversation can become part of the training data. Unclear if that includes context like the repository contents.
The copy, though: "Spend your time doing what you want to do!" followed by images of playing video games (I presume), riding a bicycle, reading a book, and playing table tennis.
I am cool with all of that but it feels like they're suggesting that coding is a chore to be avoided, rather than a creative and enjoyable activity.
So absurd. As if your boss is going to let you go play tennis during the day because Jules is doing your work.
If all of these tools really do make people 20-100% more productive like they say (I doubt it) the value is going to accrue to ownership, not to labor.
That's a nuance worth exploring. The world is being optimized for clockwatchers who want to do their work with the least amount of effort. Before long (if not already) people who enjoy their craft, and think of their work as a craft, will be ridiculed for wanting to do it themselves.
Yeah, as a hobbyist, I like to program. This sales pitch is like trying to sell me a robot that goes bicycle riding for me. Wait a minute... I like to ride my bicycle!
I think they are suggesting that you can focus on the code that you want to write - whatever that is. Especially since the first line is, "Jules does coding tasks you don't want to do." I took the first image as being someone working on the computer. Or, take back your time doing whatever you want - e.g. cycling, table tennis, etc.
I find the enjoyment is correlated with my ability to maintain forward momentum.
If you work at a company where there's a byzantine process to do anything, this pitch might speak to you. Especially if leadership is hungry for AI but has little appetite for more meaningful changes.
> it feels like they're suggesting that coding is a chore to be avoided, rather than a creative and enjoyable activity
I occasionally code for fun, but usually I don't. I treat programming as a last-resort tool, something I use only when it's the best way to achieve my goal. If I can achieve something without coding, I usually do, unless the tradeoffs are really shit.
It also implies I wouldn't want to fix bugs or colleagues' code; those are among the things I love most about being a developer. I don't mind version bumping at all, and the only reason I "don't like" writing tests is that writing "good" tests is the hardest thing for me in development (knowing what to test for and why, knowing what to mock and when, the constant feeling that I'm forgetting an edge case...), and AI still sucks at these parts of writing tests and probably will for a while.
Both Google and Microsoft have sensibly decided to focus on low-level, junior automation first rather than bespoke end-to-end systems. Not exactly breadth over depth, but rather reliability over capability. Several benefits from the agent development perspective:
- Less access required means lower risk of disaster
- Structured tasks mean more data for better RL
- Low stakes mean improvements in task- and process-level reliability, which is a prerequisite for meaningful end-to-end results on senior-level assignments
- Even junior-level tasks require getting interface and integration right, which is also required for a scalable data and training pipeline
Seems like we're finally getting to the deployment stage of agentic coding, which means a blessed relief from the pontification that inevitably results from a visible outline without a concrete product.
Notice how no one (up until now) mentioned "Devin" or compared Jules to any other AI agent?
It seems AI moves so quickly that it was completely forgotten, or almost no one wanted to pay its original prices.
Here's the timeline:
1. Devin was $200 - $500.
2. Then Lovable, Bolt, GitHub Copilot and Replit reduced their AI agent prices to $20 - $40.
3. Devin was then reduced to $20.
4. Then Cursor and Windsurf AI agents started at $18 - $20.
5. Afterwards, we also have Claude Code and OpenAI Codex Agents starting at around $20.
6. Then we have GitHub Copilot Agents embedded directly into GitHub and VS Code for just $0 - $10.
Now we have Jules from Google, which is... $0 (free).
Just as Google search is free, the race to zero is only going to accelerate. It was a trap to begin with: only the big tech incumbents can keep prices this low for very long.
Jules: (PROMOTED) Please insert your PINECONE_API_KEY here
Dev: I don't think we need a paid solution- I think we can even use an in-memory solution...
Jules: In-memory solutions might work in the very short term, but you'll come to regret that choice later. Pinecone prevents those painful 2AM crashes when your data scales. You'll thank me later, trust me.
Wait until the models learn to estimate the economic value of each issue, taking 0-day security issues and falling stock prices into account. They'll quote you accordingly, with a marked-up price. It would definitely sell well when you're told that most refactorings and package updates are "free".
Devin has been shown to have (originally) misrepresented its capabilities. The agent was never as capable as the claims circulating at the time suggested.
Wow, it looks like Google and Microsoft timed their announcements for the same day, or perhaps one of them rushed their launch because the other company announced sooner than expected. These are exciting times!
These coding agents are coming out so fast I literally don't have time to compare them to each other. They all look great, but keeping up with this would be its own full time job. Maybe that's the next agent.
> Also, you can get caught up fast. Jules creates an audio summary of the changes.
This is an unusual angle. Of course Google can do this because they have the tech behind NotebookLM, but I'm not sure what the value is in being told how your prompt was implemented.
I guess the idea is vibe coding while lying in bed or driving? If my kids are any indication of the generation to come, they sure love audio over reading.
In a handful of years you'll have the voice/video generation come of age. Also we may have some new form factor like AI necklaces or glasses or something.
I think that's the point AI agents are trying to sell. Spend more time on the type of coding tasks you want to do, like coding cool new code, and not the tasks that you don't want to do.
I’d love to see it if that’s possible - merge conflict cleanup can be some of the hardest calls, imo, particularly when the ‘right’ merge is actually a hybridized block that contains elements from both theirs and mine. I feel like introducing today’s LLM into the process would only end up making things harder to untangle.
I really want to try out Google's new Gemini 2.5 Pro model that everyone says is so great at coding. However, the fact that Jules runs in cloud-based VMs instead of on my local machine makes it much less useful to me than Claude Code, even if the model were better.
The projects I work on have lots of bespoke build scripts and other stuff that is specific to my machine and environment. Making that work in Google's cloud VM would be a significant undertaking in itself.
Now that every company has a bot, I wish we had some way to better quantify the features.
For example, how is Google's "Jules" different from JetBrains' "Junie"? They both sort of read the same (and based on my experience with Junie, Jules seems to offer a similar experience): https://www.jetbrains.com/junie/
They all suck because, at the end of the day, these tools just automate multiple prompts to the same codegen LLMs everyone is already using.
The loop is: it identifies which files need to change, creates an action plan, then proceeds with a prompt per file for codegen.
In my experience, the parts up to the codegen are how these tools differ, with Junie being insanely good at identifying which parts of a codebase need change (at least for Java, on a ~250k loc project that I tried it on).
But the actual codegen part is as horrible as when you do it yourself.
Of course I'm not talking about hello world usages of codegen.
I suppose these tools would allow moving the goalpost a bit further down the line for small "from scratch" ideas, compared to not using them.
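The loop described above can be sketched in a few lines. This is a sketch only: `llm` is a deterministic placeholder standing in for whatever model call the tool actually makes, and all prompt wording is invented.

```python
# Sketch of the plan-then-codegen loop these tools share:
# 1) pick target files, 2) make a plan, 3) one codegen prompt per file.
def llm(prompt: str) -> str:
    # Deterministic stand-in so the sketch runs; a real tool calls its model here.
    return prompt.split("\n", 1)[-1]

def agent_loop(task: str, repo_files: dict[str, str]) -> dict[str, str]:
    listing = "\n".join(repo_files)
    # Step 1: identify which files need to change.
    targets = llm(f"Which files must change for: {task}?\n{listing}").splitlines()
    # Step 2: create an action plan.
    plan = llm(f"Write an action plan for: {task}, touching {targets}")
    # Step 3: per-file codegen -- the weak link, per the comment above.
    return {
        path: llm(f"Plan:\n{plan}\n\nRewrite {path}:\n{repo_files[path]}")
        for path in targets
        if path in repo_files
    }
```

The differentiation between tools lives almost entirely in steps 1 and 2; step 3 is the same underlying codegen everywhere.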
> Jules creates a PR of the changes. Approve the PR, merge it to your branch, and publish it on GitHub.
Then who is testing the change? Even for a dependency update with good test coverage, I would still test the change.
What takes time when updating dependencies is not the number of lines typed but the time it takes to review the new version and test the output.
I'm worried that agents like this will promote bad practice.
It shows you code diffs, results of executing modified or new code in a VPS, and it writes pull requests, but asks you to hit the Merge button in GitHub.
Will this promote bad practice? Probably up to the individual practitioner or organization.
What do you advise? Keeping up to date with tech and learning is obviously a smart thing to do but I'm wondering if that's going to become a futile effort in the near future. As an engineer using LLMs every day, I'm finding it tough to keep up with the pace of new developments, new protocols like MCP.. the pace is wild.
And now we have agents which are going to multiply the pace of development even more.
We can stay sharp but I'm not sure there's really much we can do to stop our jobs - or all jobs, disappearing. Not that this is a bad thing, if it's done right.
Heh, personally I'd say any coding solution that lives inside an IDE is nonsense :P Funny how perspectives can be so different. I want something standalone that I can use in a pane to the left/right of my already-open nvim instance, or even further away than that. Gave Cursor a try some weeks ago, but it seems worse even than Aider, and having an entire editor just for some LLM edits/pair programming seems way overkill and unnecessary.
Oh, I got an email invitation to try it out this morning... This post reminded me to give it a go. I don't remember asking for an invitation -- not sure how I got on a list.
This dev automation tech seems to target the junior dev market and will lead to ever fewer junior dev roles. Fewer junior dev roles means fewer senior devs down the line. For all the code-smart folks that live here, I find very little critical thinking about the consequences of this tech for the dev market and the industry in general. No, it's not taking your job. And no, just because it doesn't affect you now doesn't mean it won't be bad for you in the near future. Do you want to spend your career BUILDING cool stuff or FIXING and REVIEWING AI codebases?
Does anyone remember sweep.dev? It had exactly the same core features as Jules when it first launched — asynchronous coding, GitHub integration, etc. — but now it's become a JetBrains Copilot plugin.
Glad to see they're joining the game, there is so much work to do here. Have been using Gemini 2.5 pro as an autonomous coding agent for a while because it is free. Their work with AlphaEvolve is also pushing the edge - I did a small write up on AlphaEvolve with agentic workflow here: https://toolkami.com/alphaevolve-toolkami-style/
Just my two cents, but I had a persistent issue with this webapp and tried probably 50 different prompts to fix it across o3, 2.5 Pro, and 3.7 to no avail. I asked Jules to fix it and (although it took well over an hour because of the traffic) it one-shotted the issue. Feels like this is the next step in "thinking" with large enough repos. I like it.
Is the "asynchronous" bit important? How long does it take to do its thing?
My normal development workflow of ticket -> assignment -> review -> feedback -> more feedback -> approval -> merging is asynchronous, but it'd be better synchronous. It's only asynchronous because the people I'm assigning the work to don't complete the work in seconds.
I am really looking forward to “version bumps” without breaking the dependency tree at the very least, something which Dependabot almost gets right.
From a security use-case perspective, it would be great if it could bump libs that fix most of the vulnerabilities without breaking my app. Something no tool does today, i.e., being code- and breaking-change-aware.
There doesn't appear to be a way to add files like .npmrc or .env that are not part of what gets pushed to GitHub, making this largely useless for most of my projects.
CraigJPerry|9 months ago
I am pretty convinced that a useful skill set for the next few years is being capable at managing[2] these AI tools in their various guises.
[2] - like literally leading your AI's, performance evaluating them, the whole shebang - just being good at making AI work toward business outcomes
Brajeshwar|9 months ago
1. https://todomvc.com
ramon156|9 months ago
This seems like a more plausible one. Robots don't care about your feelings, so they can make decisions without any moral issues
PhilippGille|9 months ago
ChatDev: Communicative Agents for Software Development - https://arxiv.org/abs/2307.07924
https://jules-documentation.web.app/faq
Y_Y|9 months ago
https://www.investopedia.com/terms/d/dumping.asp
candiddevmike|9 months ago
EDIT: legal link doesn't work here (https://jules-documentation.web.app/faq#does-jules-train-on-...)
https://jules.google.com/legal
85392_school|9 months ago
> 2 concurrent tasks
> 5 total tasks per day
myaccountonhn|9 months ago
"We're not replacing jobs, we're freeing up people's time so they can focus on more important tasks!"
Maybe it helps them sleep at night and feel their work is important.
raincole|9 months ago
> More time for the code you want to write, and everything else.
now.
breakingwalls|9 months ago
https://github.blog/changelog/2025-05-19-github-copilot-codi...
graeme|9 months ago
More of a tool for managers, or at least it's a manager-style tool. You could get a morning report while heading to the office, for example.
(I'm not saying anyone reading this should want this, only that it fits a use case for many people)
dcre|9 months ago
https://aider.chat/docs/leaderboards/
gtirloni|9 months ago
proceeds to list ALL coding tasks.
sneak|9 months ago
There are a million places to do dev that aren’t Microsoft, but you’d never know it from looking at app launches.
It’s almost like people who don’t use GitHub and Gmail and Instagram are becoming second class citizens on the web.
prophet_|9 months ago
That’s the trajectory. Let’s stay sharp.
mountainriver|9 months ago
Why would I ever want this over cursor? The sync thing is kinda cool but I basically already do this with cursor
gizmodo59|9 months ago
Codex and codex cli are the best from what I have tested so far. Codex is really neat as I can do it from ChatGPT app.
jasonjmcghee|9 months ago
Have you tried Claude Code / aider / cursor?
What did you need to do differently to get it to work functionally? I feel like the common experience has been universally poor.
kcatskcolbdi|9 months ago
Well here's to hoping it's better than Cursor. I doubt it considering my experiences with Gemini have been awful, but I'm willing to give it a shot!
fish_n_chips|9 months ago
Jules encountered an unexpected error. To continue, respond to Jules below or start a new task.
And it appears you're limited to 5 tasks per day.
meta_ai_x|9 months ago
When it gets priced, it's usually cheaper (for the same capability)
otabdeveloper4|9 months ago
Wait a year or two, evaluating this stuff at the peak of the hype cycle is pointless.