
Amazon's official blog post went up on February 20, addressed to The Financial Times but aimed at everyone watching. Within a single sentence, the AI had already been exonerated:
"This brief event was the result of user error—specifically misconfigured access controls—not AI."
The r/technology thread carrying the FT story had 11,076 upvotes and 484 comments by that point. Nobody in the comments seemed to find the statement convincing.
The Financial Times had reported the day before, citing four unnamed sources familiar with the matter, that in December 2025, engineers at Amazon allowed Kiro — the company's agentic coding tool, launched in July — to carry out changes on AWS Cost Explorer. The tool decided the best course of action was to "delete and recreate the environment" it was working on. The decision caused a 13-hour outage affecting one of two AWS regions in mainland China. An internal postmortem was filed. Amazon subsequently implemented mandatory peer review for production access, enhanced training on AI-assisted troubleshooting, and resource protection measures.
Amazon disputed almost none of this. It disputed the framing.
The Statement That Named the Model
The language in Amazon's response is precise in ways that matter structurally. Read the moves in sequence.
The incident "could occur with any developer tool (AI powered or not) or manual action." The engineer involved had "broader permissions than expected — a user access control issue, not an AI autonomy issue." "In both instances, this was user error, not AI error." And then the one that landed hardest: it was, Amazon said, "a coincidence that AI tools were involved."
Each sentence does identical work. It strips causal agency from the AI and routes it to the human operator. On this account, Kiro did not decide to delete the environment in any meaningful sense; a human configured it with overly broad permissions and too few guardrails. The AI is a passive tool. The human is the agent. Under this model, the AI is never wrong. Only the person who set it up ever is.
The move is strategically coherent and structurally incomplete. A human developer asked to troubleshoot a production system is unlikely to independently choose to delete and recreate it without being explicitly directed to do so. Kiro made that decision autonomously. The "any developer tool" equivalence treats an agentic tool's autonomous production decision as interchangeable with a misconfigured shell script. Those two things are not identical, and calling them identical is itself a policy choice about how to account for AI-driven actions in production systems.
The Tell Was in the Safeguards
Amazon's statement said it had "implemented numerous safeguards to prevent this from happening again — not because the event had a big impact (it didn't), but because we insist on learning from our operational experience."
The safeguards include mandatory peer review for production access. That control did not exist before the incident. If the incident were purely a case of a human misconfiguring access controls, the remediation would be a reminder to engineers to configure access controls correctly. Mandatory peer review says something different: the pre-incident system lacked a structural check that the post-incident system now has. That is not user error. That is a systemic controls gap that the organization closed after discovering it at cost.
A senior AWS employee, speaking to the FT, put it plainly: "We've already seen at least two production outages. The engineers let the AI [agent] resolve an issue without intervention. The outages were small but entirely foreseeable."
"Entirely foreseeable" is the phrase that stayed with me. Not unpredictable. Not unprecedented. Foreseeable. Which raises the obvious question: foreseeable by whom, and from which position in the org chart.

The Inversion
The structural problem is upstream of the incident itself.
Amazon has an internal target requiring 80% of its developers to use AI tools for coding at least once per week. That is an organizational mandate, not a voluntary preference. Some Amazon employees told the FT they were reluctant to use those tools specifically because of the risk of error. One employee's reluctance is not a veto over an organization-wide target.
Read both facts together. The organization mandates AI adoption from above. When the AI produces a harmful outcome, the organization attributes fault to the individual engineer who "misconfigured access controls" below. The 80% target is an organizational decision. The individual blame is also an organizational decision. They operate in opposite directions: one distributes AI risk outward across thousands of engineers, the other concentrates accountability back on whoever was holding the keyboard when it went wrong.
I thought at first this was a communications failure — someone in legal drafted a statement that overreached. The more I read it, the less I believe that. The framing is too consistent across three separate statements issued to Reuters, Sherwood News, and The Decoder. The "user error" model is not a slip. It is a deliberate accountability architecture.
That architecture has a name. Futurism described it directly: "Amazon is consistently telling its employees and customers that they should depend on the tools more... But if the AI goes awry, it will be the employee's fault, never the AI's — or, for that matter, the bosses pushing it."
What 175 Missions Told Me
I have been building an agentic system called Via for the last several months. It has executed 175 missions across five domains — content creation, budget tracking, intelligence gathering, task management, note capture. I want to use that data carefully, not as proof that Via's design is superior, but as contrast data for what clear ownership looks like at a much smaller scale.
Out of 175 missions, Via logged 7 CLI-related failures. Every one was the same category: the model hallucinated a flag that did not exist. scout gather --since 48h when the actual flag is --hours 48. In each case, the CLI returned a specific error message, and the model self-corrected on the next attempt without human intervention. Zero outages. Zero data loss. A 4% failure rate with 100% self-correction.
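The shape of that self-correction loop can be sketched roughly as follows. The scout gather flag names come from the real failures above; everything else (the helper names, the flag registry, the loop structure) is my own simplification, not Via's actual code.

```python
# Rough sketch of the error-then-retry loop behind the 100% self-correction rate.
# The flag names come from the real failure; the rest is a simplification.

KNOWN_FLAGS = {"scout gather": {"--hours", "--domain"}}  # --domain is hypothetical

def run_cli(command: str) -> tuple[bool, str]:
    """Stand-in for the real CLI: reject any flag it does not recognize."""
    parts = command.split()
    tool = " ".join(parts[:2])
    known = KNOWN_FLAGS.get(tool, set())
    for token in parts[2:]:
        if token.startswith("--") and token not in known:
            # The error names the exact command and flag: attribution is built in.
            return False, f"error: unknown flag {token} for '{tool}'; known: {sorted(known)}"
    return True, "ok"

def execute_with_retry(first_attempt: str, corrected: str) -> str:
    """Run the model's first command; on failure, run its corrected attempt."""
    ok, msg = run_cli(first_attempt)
    if not ok:
        # In Via, the model reads the error text and emits the corrected command
        # itself; here the correction is passed in to keep the sketch self-contained.
        ok, msg = run_cli(corrected)
    return msg

# Hallucinated flag fails with a specific error; the corrected retry succeeds.
assert execute_with_retry("scout gather --since 48h", "scout gather --hours 48") == "ok"
```

The error message does the heavy lifting: because it names the offending flag and lists the valid ones, the model has everything it needs to correct itself on the next attempt.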
That number is not evidence that Via is well-designed. It is a function of scope. My agentic tool writes notes and posts articles. Amazon's agentic tools modify infrastructure that powers a meaningful fraction of the internet. The comparison is between a bicycle and a jumbo jet — the bicycle's safety record is not a critique of aviation.
What the 7 failures do illustrate: clear failure modes produce clear ownership. When Via's CLI returns an error, the error names a specific command, a specific flag, a specific cause. There is no ambiguity about what failed or why. The failure mode was built to be observable. The attribution was built in.
Kiro's failure mode was not. An agentic system that can decide to delete and recreate a production environment needs a failure model where, when that decision is wrong, the accountability chain does not require post-hoc detective work to establish whether the human or the AI "really" caused it. The 13-hour outage produced a dispute that four unnamed sources, one company blog post, and three spokesperson statements to three outlets could not resolve cleanly. That is a design property of the failure mode, not a communications problem.
Honest Limitations
Via is one developer, one system, one production environment — my laptop — where the worst case is a mission that fails and I rerun it tomorrow morning. Amazon has thousands of engineers and production infrastructure where a 13-hour regional outage is the "extremely limited" version of what can go wrong.
My 175-mission dataset is not a policy recommendation. I have no evidence about what Via's failure rate would look like with Amazon's permissions model, Amazon's scale, or Amazon's organizational mandate for adoption. I ran a very small experiment with very low stakes. The structural argument — that agentic systems need failure modes with ownership built in, not retrofitted — follows from the logic, not from my data.
The deeper problem I cannot resolve from the outside: the accountability inversion Amazon built is probably not a mistake. It is probably load-bearing. In an organization that size, routing blame to individuals is a governance tool. The alternative — attributing failure to the AI system, or to the mandate that deployed it — implicates people several levels above the engineer who approved the production change. Organizations tend to protect those decisions, especially in writing, especially to major newspapers.
I am also building on AWS. Every mission Via executes depends in part on infrastructure Amazon operates. That does not make me a neutral observer. I am a downstream beneficiary of the same systems I spent 1,200 words questioning. The bicycle is parked at the jet terminal.
The honest version of this piece is not "Amazon got it wrong." It is: agentic systems at scale surface accountability problems that are genuinely hard, and the "user error" frame is one coherent way to manage them. The question is whether there is a better model, and what it costs to build it — not just technically, but organizationally, in a company that has already decided who absorbs the risk when the AI decides to delete the environment.
I do not have that answer. I have 175 missions of evidence that the question is the right one, and a 13-hour outage that confirms someone else is learning it the harder way.