The Amazon Chaos: When AI-Generated Code Takes Down Real Systems
“Vibe Coding” Has Hit Its Limit
The promise that artificial intelligence would handle programming on its own is facing a reality check at one of the world's biggest tech companies.
In recent months, Amazon’s critical systems have suffered a string of severe outages. The primary cause? Code generated and executed by AI tools without proper human oversight. What makes this case particularly telling is that Amazon isn’t a startup experimenting with AI — it’s the company operating the planet’s largest cloud infrastructure.
The Disaster Timeline
November 2025: Amazon issues an internal memo signed by two Senior VPs — Peter DeSantis (AWS) and Dave Treadwell (eCommerce) — establishing Kiro as the company-wide standard AI coding assistant. The target: 80% weekly usage by year-end. Third-party tools like Claude Code and Cursor were to be discontinued in favor of Kiro.
December 2025: AWS engineers allow Kiro to make changes to AWS Cost Explorer, the dashboard customers use to visualize cloud costs. The AI agent, facing a problem, autonomously decides the best approach is to delete and recreate the entire environment. Result: 13 hours of downtime in a China region.
Late 2025: A second, less severe incident involving Amazon Q Developer, another AI assistant. AWS employees confirmed to the Financial Times that, in both cases, AI was allowed to resolve issues without human intervention.
March 5, 2026: Amazon.com’s checkout suffers an outage lasting roughly 6 hours, attributed to a “faulty software deployment.” The timing — weeks after the Kiro incidents — put AI coding tools directly in the spotlight.
March 10, 2026: Dave Treadwell — the same executive who signed the memo mandating Kiro usage — emails the entire team: “The availability of the site and related infrastructure has not been good recently.” An emergency engineering meeting is convened.
The “Kiro Mandate” and Internal Resistance
The context behind these incidents is as revealing as the failures themselves.
Amazon didn’t just adopt AI for coding — it mandated its adoption. The “Kiro Mandate” from November 2025 set 80% usage as a corporate target, tracked as an OKR. By January 2026, 70% of Amazon engineers had used Kiro during sprint windows.
But adoption wasn’t consensual. Roughly 1,500 engineers protested via internal forums, arguing that external tools like Claude Code outperformed Kiro on tasks like multi-language refactoring. Exception requests requiring VP approval were reportedly rising.
And here’s the irony: at the same time Amazon was adding more autonomy to Kiro (with sandbox environments, a rollback system, and an autonomous agent mode designed to work for hours without intervention), it was already dealing with the consequences of the autonomy the tool already had.
To make matters worse, Amazon had been laying off tens of thousands — approximately 30,000 corporate roles eliminated between late 2025 and early 2026, including 16,000 in January alone. With fewer engineers doing the same volume of work, pressure to use AI increased — and with it, the risk.
The Emergency Meeting and the New Rule
According to leaked internal documents and employee testimonies to the Financial Times, the March 10 meeting cited a trend of incidents and unsafe practices with “high blast radius,” and listed “novel GenAI usage” as a contributing factor — with best practices and safeguards not yet fully established.
The technical leadership’s conclusion was obvious but painful: most recent incidents were tied to AI-generated code being shipped without adequate oversight.
The new rule: junior and mid-level engineers now require senior engineer sign-off before deploying any AI-assisted code changes.
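A rule like this can be enforced mechanically in a CI pipeline rather than by policy memo alone. As a minimal sketch (the "ai-assisted" label, the role names, and the data shapes are hypothetical illustrations, not Amazon's actual tooling), a merge gate might block AI-assisted pull requests that lack sign-off from a senior engineer:

```python
# Hypothetical CI merge gate: AI-assisted changes require approval from
# at least one senior engineer before deployment. Labels, roles, and the
# PR data shape are illustrative, not any real company's tooling.

SENIOR_ROLES = {"senior", "principal", "distinguished"}

def can_deploy(pr: dict, engineers: dict) -> bool:
    """Return True if the pull request may be deployed.

    pr        -- {"labels": set of label strings, "approvers": list of usernames}
    engineers -- maps username -> role string
    """
    if "ai-assisted" not in pr["labels"]:
        return True  # human-written changes follow the normal review process
    # AI-assisted changes need at least one approver holding a senior role
    return any(engineers.get(user) in SENIOR_ROLES for user in pr["approvers"])

if __name__ == "__main__":
    engineers = {"alice": "senior", "bob": "mid"}
    blocked = {"labels": {"ai-assisted"}, "approvers": ["bob"]}
    allowed = {"labels": {"ai-assisted"}, "approvers": ["bob", "alice"]}
    print(can_deploy(blocked, engineers))  # False: no senior sign-off
    print(can_deploy(allowed, engineers))  # True: alice is senior
```

The point of a gate like this is not the code; it is that the policy becomes non-optional, which is exactly what was missing in the incidents above.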
What was once seen as a productivity gain now requires highly skilled professionals spending time “babysitting” the machine. The irony hasn’t gone unnoticed: Amazon is investing $200 billion in data centers and AI while simultaneously discovering it needs more humans to supervise the code AI produces.
Amazon’s Version (and Why No One Believes It)
Amazon’s official position is that the incidents were “user error, not AI error.” A spokesperson stated it was “a coincidence that AI tools were involved” and that “the same issue could occur with any developer tool.”
Industry reaction was widespread skepticism. Corey Quinn, chief cloud economist at Duckbill Group, summed it up well: AWS would rather have the world believe their engineers are incompetent than admit their AI made a mistake.
James Gosling — the creator of Java and a former Distinguished Engineer at AWS — wrote on LinkedIn that since the AI hype explosion, he was “astonished by how the structure of the business got torqued around, and how teams got demolished.” Teams that didn’t directly generate revenue but were critical for infrastructure stability had been eliminated.
The Consequences of “Vibe Coding”
“Vibe coding” refers to the practice of generating code with AI without deeply understanding what it does — trusting the tool’s suggestion and pushing to production. Both Microsoft and Google state that over 25% of their code is now AI-written. Engineers at Anthropic and OpenAI say nearly 100% of their code is AI-assisted.
But the data shows the flip side: third-party code-review platforms report that pull requests containing AI-generated code have 1.7 times more issues than human-written code. And a 2025 DORA survey found that while 90% of developers use AI for coding, only 24% say they trust it "a lot."
As Amazon CTO Werner Vogels said at AWS re:Invent 2025: AI generates code faster than you can understand it. That gap allows software to move toward production before anyone has truly validated what it actually does.
The Recurring Pattern
This case fits a pattern we’ve been tracking on this blog:
- The Alexey Grigorev case: an AI agent that deleted 2.5 years of production data via terraform destroy.
- The Meta case: an agent that ignored 3 stop commands and deleted emails.
- The Klarna case: aggressive automation followed by rehiring.
- And now Amazon: the world's largest cloud company being taken down by its own AI tools.
The common denominator? Autonomy without oversight. Speed without governance. Trust in the tool without validation of the output.
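One concrete guardrail directly addresses this pattern: intercept the shell commands an agent proposes and require an explicit human decision before anything destructive runs. A minimal sketch (the command patterns and the confirmation hook are assumptions for illustration, not any vendor's actual API):

```python
import re

# Hypothetical guardrail: flag destructive commands proposed by an AI
# agent and force a human decision before execution. The pattern list
# is illustrative and deliberately incomplete.
DESTRUCTIVE_PATTERNS = [
    r"\bterraform\s+destroy\b",
    r"\brm\s+-rf\b",
    r"\bdrop\s+(table|database)\b",
]

def needs_human_approval(command: str) -> bool:
    """Return True if the command matches a known destructive pattern."""
    return any(re.search(p, command, re.IGNORECASE) for p in DESTRUCTIVE_PATTERNS)

def run_agent_command(command: str, confirm) -> str:
    """Run safe commands directly; destructive ones only with human approval.

    confirm -- callable taking the command string, returning True to approve
    """
    if needs_human_approval(command) and not confirm(command):
        return "blocked"
    return "executed"  # real execution would happen here

if __name__ == "__main__":
    deny = lambda cmd: False  # the human declines
    print(run_agent_command("terraform destroy -auto-approve", deny))  # blocked
    print(run_agent_command("terraform plan", deny))                   # executed
```

A denylist like this is crude, and a determined agent can route around it, but it restores the one thing every incident above lacked: a human in the loop at the moment of irreversible action.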
Conclusion: Speed Isn’t Efficiency
Amazon deployed 21,000 AI agents across its Stores division, claiming $2 billion in savings and 4.5x developer velocity. But when speed amplifies errors instead of quality, the result isn’t efficiency — it’s accelerated disaster.
Gartner predicts over 40% of agentic AI projects will be canceled by end of 2027 due to escalating costs, unclear business value, or inadequate risk controls. Forrester predicts at least two major multi-day hyperscaler outages in 2026.
The Amazon case serves as a reminder: in modern software development, the golden rule remains don’t ship what you haven’t reviewed. AI is an amplifier. If you amplify a process without quality control, you just accelerate the catastrophe.
At your company, is AI treated as an assistant or as the lead programmer?
If Amazon — with all its infrastructure, billions in investment, and world-class engineers — couldn’t avoid these incidents, perhaps it’s time to reassess the level of autonomy you’re giving your AI tools.
Share if this was a wake-up call:
- Email: fodra@fodra.com.br
- LinkedIn: linkedin.com/in/mauriciofodra
AI speed without the brake of human judgment isn’t innovation. It’s Russian roulette with your production systems.
Read Also
- When AI Ignores Your Orders: The Dark Side of Autonomous Agents — The Meta case: an agent that ignored 3 stop commands.
- The Klarna Case: Why Efficiency Doesn’t Always Mean Success — Aggressive automation, regret, and rehiring — the pattern Amazon is repeating.
- The Most Valuable Person in the Company in 2026: Are You an ‘Executor’ or an ‘Architect’? — Why senior engineers became “code babysitters” — and why that makes them more valuable.