After outages, Amazon to make senior engineers sign off on AI-assisted changes
659 points | ndr42 | 23 days ago | arstechnica.com | reply
https://twitter.com/lukolejnik/status/2031257644724342957 (https://xcancel.com/lukolejnik/status/2031257644724342957)
[+] [-] cobolcomesback|23 days ago|reply
This meeting happens literally every week, and has for years. Feels like the media is making a mountain out of a molehill here.
[+] [-] davidclark|23 days ago|reply
>He asked staff to attend the meeting, which is normally optional.
Is that false? It also discusses a new policy:
>Junior and mid-level engineers will now require more senior engineers to sign off any AI-assisted changes, Treadwell added.
Is that inaccurate? It is good context that this is a regularly scheduled meeting. But, regularly scheduled meetings can have newsworthy things happen at them.
[+] [-] CoolGuySteve|23 days ago|reply
Items weren't displaying prices and it was impossible to add anything to your cart. It lasted from about 2pm to 5pm.
It's especially strange because if a computer glitch brought down a large retail competitor like Walmart I probably would have seen something even though their sales volume is lower.
[+] [-] belval|23 days ago|reply
[+] [-] otterley|23 days ago|reply
That's been their job ever since cable news was invented.
[+] [-] groundzeros2015|22 days ago|reply
[+] [-] embedding-shape|23 days ago|reply
Are you completely missing the point of the submission? It's not about "Amazon has a mandatory weekly meeting" but about the contents of that specific meeting, about AI-assisted tooling leading to "trends of incidents", having a "large blast radius" and "best practices and safeguards are not yet fully established".
No one cares how often the meeting in general is held, or if it's mandatory or not.
[+] [-] furyofantares|23 days ago|reply
Must have as the comments are hours older than OP.
[+] [-] unknown|23 days ago|reply
[deleted]
[+] [-] cmiles8|23 days ago|reply
[+] [-] rahbert|22 days ago|reply
[+] [-] age1mlclg6|23 days ago|reply
https://www.theguardian.com/us-news/ng-interactive/2026/jan/...
[+] [-] Clent|23 days ago|reply
What is worth being pointed out is how quickly people blame "The Media" for how people use, consume and spread information on social networks.
[+] [-] niwtsol|23 days ago|reply
[+] [-] happytoexplain|23 days ago|reply
Review by a senior is one of the biggest "silver bullet" illusions managers suffer from. For a person (senior or otherwise) to examine code or configuration at the granularity required to verify that it even approximates the result of their own level of experience, even only in terms of security/stability/correctness, requires an amount of time approaching what they would have spent just doing it themselves.
I.e. senior review is valuable, but it does not make bad code good.
This is one major facet of probably the single biggest problem of the last couple decades in system management: The misunderstanding by management that making something idiot proof means you can now hire idiots (not intended as an insult, just using the terminology of the phrase "idiot proof").
[+] [-] ardeaver|23 days ago|reply
The more expensive and less sexy option is to actually make testing easier (both programmatically and manually), write more tests and more levels of tests, and spend time reducing code complexity. The problem, I think, is people don't get promoted for preventing issues.
[+] [-] marginalia_nu|23 days ago|reply
Unchecked, AI models output code that is as buggy as it is inefficient. In smaller greenfield contexts it's not so bad, but in a large codebase it performs much worse, as it will not have access to the bigger picture.
In my experience, you should be spending something like 5-15x the time the model takes to implement a feature on reviewing it and making it fix its errors and inefficiencies. If you do that (with an expert's eye), the changes will usually be correct and of high quality.
If you do not do that due diligence, the model will produce a staggering amount of low-quality code, at a rate that is probably something like 100x what a human could output in a similar timespan. Unchecked, it's like having a small army of the most eager junior devs you can find going completely fucking ape in the codebase.
[+] [-] js8|23 days ago|reply
It's actually often harder to fix something sloppy than to write it from scratch. To fix it, you need to hold both the original and the new solution in your head and work out the difference between them, which can be very confusing. The original solution can also anchor your thinking to one approach to the problem, which wouldn't happen if you solved it from scratch.
[+] [-] unshavedyak|23 days ago|reply
Hell, often it feels slower/worse. Foreign code is easily confusing at first, which slows you down - and bad code quickly gets bewildering and sends you down paths of clarifications that waste time.
[+] [-] steveBK123|23 days ago|reply
If AI is a productivity boost and juniors are going to generate 10x the PRs, do you need 10x the seniors (expensive) or 1/10th the juniors (cost save)?
A reminder that in many situations, pure code velocity was never the limiting factor.
Re: idiot proofing, I think this is a natural evolution: as companies get larger, they try to limit their downside and manage for the median, rather than having a growth mindset in hiring/firing/performance.
[+] [-] AgentOrange1234|23 days ago|reply
[+] [-] jetrink|23 days ago|reply
Maybe I don't have the correct mental model for how the typical junior engineer thinks though. I never wanted to bug senior people and make demands on their time if I could help it.
[+] [-] onion2k|23 days ago|reply
I suspect that isn't the goal.
Review by more senior people shifts accountability from the Junior to a Senior, and reframes the problem from "Oh dear, the junior broke everything because they didn't know any better" to "Ah, that Senior is underperforming because they approved code that broke everything."
[+] [-] bs7280|23 days ago|reply
Whether or not these productivity gains are realized is another question, but spreadsheet based decision makers are going to try.
[+] [-] hintymad|23 days ago|reply
Especially in a big co like Amazon, most senior engineers are box drawers, meeting goers, gatekeepers, vision setters, org lubricants, VPs' trustees, glorified product managers, etc. They don't necessarily know more context than the more junior engineers, and they will most likely review slowly while uncovering fewer issues.
[+] [-] raw_anon_1111|23 days ago|reply
I’m probably not going to review a random website built by someone except for usability, requirements and security.
[+] [-] belval|23 days ago|reply
[+] [-] qnleigh|23 days ago|reply
1. They can assess whether the use of AI is appropriate without looking in detail. E.g. if the AI changed 1000 lines of code to fix a minor bug, or changed code that is essential for security.
2. To discourage AI use, because of the added friction.
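Point 1 is essentially a coarse triage heuristic, and it could be automated before a human even looks. A minimal sketch, where the threshold, path prefixes, and function name are all invented for illustration:

```python
# Hypothetical triage heuristic: flag AI-assisted diffs that are
# disproportionately large for the stated fix, or that touch
# security-sensitive paths, so a senior can escalate without
# reading every line. All names and numbers here are assumptions.
SENSITIVE_PREFIXES = ("auth/", "iam/", "crypto/")  # assumed repo layout
MAX_LINES_FOR_MINOR_FIX = 200                      # assumed threshold

def needs_close_review(changed_files, lines_changed, is_minor_fix):
    # e.g. the AI changed 1000 lines to fix a minor bug
    if is_minor_fix and lines_changed > MAX_LINES_FOR_MINOR_FIX:
        return True
    # or it touched code that is essential for security
    return any(f.startswith(SENSITIVE_PREFIXES) for f in changed_files)

print(needs_close_review(["auth/session.py"], 12, is_minor_fix=True))  # True
print(needs_close_review(["ui/cart.py"], 40, is_minor_fix=True))       # False
```

A check like this doesn't judge the code itself; it only decides which changes deserve a slow, careful human pass.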
[+] [-] zamalek|23 days ago|reply
My manager has been urging us to truly vibe code, just yesterday saying that "language is irrelevant because we've reached the point where it works - so you don't need to see it." This article is a godsend; I'll take this flawed silver bullet any day of the week.
[+] [-] mrothroc|23 days ago|reply
The other problem is that the type of errors LLMs make is different from the type juniors make. There are huge sections of genuinely good code, so the senior gets "review fatigue": so much looks good that they just start rubber-stamping.
I use an automated pipeline to generate code (including terraform, risking infrastructure nukes), and I am the senior reviewer. But I have gates that do a whole range of checks, both deterministic and stochastic, before it ever gets to me. Easy things are pushed back to the LLM for it to autofix. I only see things where my eyes can actually make a difference.
Amazon's instinct is right (add a gate), but the implementation is wrong (make it human). Automated checks first, humans for what's left.
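The gate ordering described above can be sketched in a few lines. This is illustrative, not the commenter's actual pipeline: the gates, strings, and data shapes are all assumptions.

```python
# Sketch of "automated checks first, humans for what's left":
# run cheap deterministic gates over a diff, bounce auto-fixable
# findings back to the LLM, and only surface the rest to a human.
from dataclasses import dataclass

@dataclass
class Finding:
    message: str
    autofixable: bool

def lint_gate(diff):
    # placeholder deterministic check: trailing whitespace
    return [Finding("trailing whitespace", True)] if "  \n" in diff else []

def secret_gate(diff):
    # placeholder check: AWS-style access key prefix in the diff
    return [Finding("possible hardcoded secret", False)] if "AKIA" in diff else []

def run_gates(diff, gates):
    to_llm, to_human = [], []
    for gate in gates:
        for finding in gate(diff):
            (to_llm if finding.autofixable else to_human).append(finding)
    return to_llm, to_human

auto, manual = run_gates("key = 'AKIA...'\n", [lint_gate, secret_gate])
print([f.message for f in manual])  # ['possible hardcoded secret']
```

In a real pipeline the deterministic gates would be linters, type checkers, and policy checks, with stochastic (LLM-based) reviewers layered behind them; the structure stays the same.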
[+] [-] yifanl|23 days ago|reply
[+] [-] grvdrm|23 days ago|reply
I hear “x tool doesn’t really work well” and then I immediately ask: “does someone know how to use it well?” The answer “yes” is infrequent. Even a yes is often a maybe.
The problem is pervasive in my world (insurance). Number-producing features need to work in a UX and product sense but also produce the right numbers, and within range of expectations. Just checking the UX does what it’s supposed to do is one job, and checking the numbers an entirely separate task.
I don’t know many folks that do both well.
[+] [-] 33MHz-i486|22 days ago|reply
Also, while this is happening, most developers are getting constantly hammered by operational issues and critical security tasks, because 1) the legacy toolchain imports 6 different language package ecosystems and 2) no one ever pays down tech debt in legacy code until it's a high-severity ticket count in a KPI dashboard visible to senior management.
[+] [-] prakhar897|23 days ago|reply
1. Shipping: deliver tickets or be PIPed.
2. Having fewer comments on their PRs: for some drastically dumb reason, having a PR thoroughly reviewed is a sign of bad quality. L7 and above use this metric to PIP folks.
3. Docs: write docs, get them reviewed to show you're high level.
Without AI, an employee is worse off in all of the above compared to folks who will cheat to get ahead.
I can't see how "requesting" folks to forego their own self-preservation will work, especially when you've spent years pitting people against each other.
[+] [-] sdevonoes|23 days ago|reply
There’s also this implicit imbalance engineers typically don’t like: it takes me 10 min to submit a complete feature thanks to Claude… but for the human reviewing my PR manually, it will take them 10-20 times that.
Edit: in the end, real engineers know that what takes effort is a) knowing what to build and why, and b) verifying that what was built is correct. Currently AI doesn’t help much with either of these two points.
The in-betweens are needed, but they are a byproduct. Senior leadership doesn’t know this, though.
[+] [-] cmiles8|23 days ago|reply
News from the inside makes it sound like things are getting pretty bad.
[+] [-] philip1209|23 days ago|reply
People push AI-reviewed code like they wrote it. In the past, "wrote it" implies "reviewed it." With AI, that's no longer true.
I advocate for GitHub and other code review systems to add a "Require self-review" option, where people must attest that they reviewed and approved their own code. This change might seem symbolic, but it clearly sets workflows and expectations.
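One possible shape for that idea is a CI step that fails unless the PR description contains a checked attestation box. The checkbox wording below is an assumed convention, not an existing GitHub option; a real workflow would read the PR body from the event payload.

```python
import re

# Assumed attestation checkbox text that authors must tick in the
# PR description (a convention, not a built-in GitHub feature).
ATTESTATION = r"- \[x\] I have reviewed and approved this code myself"

def self_review_attested(pr_body: str) -> bool:
    """Return True if the PR description contains the checked box."""
    return re.search(ATTESTATION, pr_body, re.IGNORECASE) is not None

body = "Fixes the cart bug.\n- [x] I have reviewed and approved this code myself"
print(self_review_attested(body))  # True
```

A CI job would call this on the PR body and fail the build when the attestation is missing, turning the symbolic expectation into an enforced step.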
[+] [-] ritlo|23 days ago|reply
They're torn between "we want to fire 80% of you" and "... but if we don't give up quality/reliability, LLMs only save a little time, not a ton, so we can only fire like 5% of you max".
(It's the same in writing: these things are only a huge speed-up if it's OK for the output to be low-quality, but good output using LLMs only saves a little time versus writing entirely by hand. So far, anyway; of course these systems are changing by the day, but this specific limitation has remained true for about four years now, without much improvement.)
[+] [-] petterroea|22 days ago|reply
Mentoring juniors is an important part of the job and a crucial service to the industry, but juniors equipped with LLMs make the deal a bit more sour. Anecdotally, they don't really remember the feedback as well, because they weren't involved in writing the code. It's burnout-inducing to see your hard work and feedback go in one ear and out the other.
I personally know people looking to jump ship because they waste too much time at their current employer on this.
[+] [-] lokar|23 days ago|reply
Code review should not be (primarily) about catching serious errors. If there are always a lot of errors, you can’t catch most of them with review. If there are few, it’s not the best use of time.
The goal is to ensure the team is in sync on design, standards, etc. To train and educate Jr engineers, to spread understanding of the system. To bring more points of view to complex and important decisions.
These goals help you reduce the number of errors going into the review process; that should be the actual goal.
[+] [-] 827a|22 days ago|reply
We love this for Amazon, they're a very strong company making bold decisions.
[+] [-] znpy|23 days ago|reply
First thing that comes to mind: this reminds me of those movies where a dictatorship starts to crumble and the dictator starts being tougher and tougher on his generals, not realizing the whole endeavor is doomed, not just the current implementation.
Then again, as a former Amazon (AWS) engineer: this is just not going to work. Depending on how you define "senior engineer" (L5? L6? L7?), this is less and less feasible.
L5 engineers are already supposed to work pretty much autonomously, maybe with L6 sign-off when changes are a bit large in scope.
L6 engineers already have their own load of work, and a fairly large number of engineers "under" them (anywhere from 5 to 8). Properly reviewing changes from all of them, and taking responsibility for that, is going to be very taxing on such people.
L7 engineers work across teams and they might have anywhere from 12 to 30 engineers (L4/5/6) "under" them (or more). They are already scarce in number and they already pretty much mostly do reviews (which is proving not sufficient, it seems). Mandating sign-off and mandating assumption of responsibility for breaking changes means these people basically only do reviews and will be stricter and stricter[1] with engineers under them.
L8 engineers, they barely do any engineering at all, from what I remember. They mostly review design documents, in my experience not always expressing sound opinions or having proper understanding of the issues being handled.
In all this, considering the low morale (layoffs), the reduced headcount (layoffs) and the rise in expectations (engineers trying harder to stay afloat[2] due to... layoffs)... It's a dire situation.
I'm going to tell you, this stinks A LOT like rotting day 2 mindset.
----
1. keep in mind you can't, in general, determine the absence of bugs
2. Also cranking out WAY more code due to having gen-AI tools at their fingertips...
[+] [-] paxys|23 days ago|reply
[+] [-] Lalabadie|23 days ago|reply
[+] [-] sethops1|23 days ago|reply
So basically, kill the productivity of senior engineers, kill the ability for junior engineers to learn anything, and ensure those senior engineers hate their jobs.
Bold move, we'll see how that goes.
[+] [-] rglover|23 days ago|reply
[+] [-] ndr42|23 days ago|reply
[+] [-] AlotOfReading|23 days ago|reply
I wonder if it's an early step towards an apprenticeship system.
[+] [-] mentos|23 days ago|reply
Feels inevitable that code for aviation will slowly rot from the same forces at play but with lethal results.
[+] [-] AlexeyBrin|23 days ago|reply