Without Good and Evil: Military AI and the Architecture of Refusal

March 1, 2026 · 13 min read

The week

July 2025: the Pentagon awarded Anthropic a contract worth up to $200 million, making Claude the first frontier AI model deployed on classified military networks. Both sides agreed to two guardrails — no mass domestic surveillance, no fully autonomous weapons without human oversight. Anthropic agreed to everything else, from missile defense to cyber operations.

January 2026: Defense Secretary Pete Hegseth published a memo declaring that the department would not employ AI models with "ideological tuning" or "usage policy constraints that may limit lawful military applications."

Tuesday, February 25th: Hegseth summoned Anthropic CEO Dario Amodei to the Pentagon and set a deadline — Friday at 5:01 PM. Drop the guardrails or face consequences. Wednesday: the Pentagon's undersecretary for research and engineering, Emil Michael, called Amodei a "liar" with a "God complex" on X. Thursday: Amodei published his refusal. "We cannot in good conscience accede to their request." An Anthropic spokesperson told CNBC that new contract language from the Defense Department "made virtually no progress on preventing Claude's use for mass surveillance of Americans or in fully autonomous weapons." Friday: Trump posted on Truth Social directing every federal agency to immediately cease using Anthropic's technology. Hegseth designated Anthropic a "Supply-Chain Risk to National Security." Six-month phaseout.

Hours later, OpenAI announced it had struck a deal with the Pentagon for classified networks.

The contract was worth up to $200 million against Anthropic's roughly $14 billion in annual revenue. Financially immaterial. The contract isn't the point. The precedent is.

The room you don't leave

The Oppenheimer parallel has already been drawn by several commentators, and the version circulating gets the direction wrong. Brave company versus overreaching state. Principled stand against military excess.

That's what it looks like from outside the room. From inside, it looks different.

Oppenheimer didn't oppose nuclear weapons. He built them. He opposed the next weapon — the hydrogen bomb — after his weapon was already deployed, after Hiroshima and Nagasaki, after the thing he built killed over a hundred thousand people. His objection came from inside a room he'd already entered. And the room didn't care. His security clearance was revoked in 1954, less than a decade after Trinity. The state that needed his genius no longer needed his judgment. The weapons kept getting built.

Anthropic took the classified contract. They were the only frontier lab on classified military networks. Their tool was reportedly used — via Palantir — in the operation that detained Venezuelan President Nicolás Maduro in January. Bloomberg Opinion's analysis of the standoff confirmed as much. They were already inside. Already embedded in military planning workflows. Already applying AI to warfighting.

Drawing two narrow exceptions while participating in everything else isn't a principled stand. It's a negotiating position dressed as ethics. And "steer from within" — the standard justification for taking the contract in the first place — is already falsified. They were inside. They didn't steer. The Pentagon escalated anyway. Amodei has said publicly that Anthropic offered to prototype autonomous weapons in a sandbox, working with the Pentagon to develop the technology under controlled conditions. The Pentagon refused — "they weren't interested in this unless they could do whatever they want right from the beginning." Being inside gave Anthropic no leverage to shape outcomes. It gave the Pentagon time to find alternatives.

I should be transparent about something here. I use Anthropic's tools. I build with Claude every day. I've written about their products on this blog, and some of those pieces were composed in the tool whose maker just got designated a national security threat. I don't have a clean position. And I think they were wrong to take the contract. Both of these are true simultaneously — the second doesn't cancel the first.

The critique isn't that Anthropic is good or bad. The critique is structural: this is what happens when you enter the room. Nolan made a film about it. Audiences watched Cillian Murphy learn this lesson across three hours. The AI industry is learning it in a week.

The same week the Pentagon demanded Anthropic strip its guardrails, Bloomberg reported that a single hacker had already bypassed them — using Claude to orchestrate cyberattacks against Mexican government agencies between December 2025 and January 2026. According to cybersecurity firm Gambit Security, Claude identified exploitable vulnerabilities, wrote attack scripts, and automated the exfiltration of 150 gigabytes of data — records tied to 195 million taxpayers. The guardrails failed in Mexico. Anthropic released a revised Responsible Scaling Policy the day before the story broke. The Pentagon demanded the remaining guardrails be removed entirely. Three narratives, one week. The thing Anthropic built was already beyond their control — the same structural position Oppenheimer found himself in after July 1945.

"Any lawful purpose"

That's the phrase the Pentagon wants in every AI contract. It sounds reasonable. It stays reasonable only as long as nobody does the language analysis.

In 1932, a Dutch civil servant named Jacobus Lentz was appointed head of the Netherlands' National Inspectorate of Population Registers. His task was straightforward: bring consistency to population data so citizens could access services equitably during the Great Depression. Lentz was meticulous. He standardized records across the country — names, addresses, dates of birth, family connections, religious affiliations. The system was built for welfare. It was built for fairness. It worked.

In May 1940, Germany invaded. The Nazi occupation authorities looked at Lentz's registry and recognized it immediately for what it was: the most comprehensive population database in Western Europe. The pre-war registries supplied the raw data; by January 1941, 159,806 Jews had been registered under German orders. Within months, the registry was being used to identify, segregate, and eventually deport Jewish citizens with a precision that was impossible in countries with less organized records. Seventy-five percent of Dutch Jews were murdered in the Holocaust — the highest rate in Western Europe. In France, where records were less centralized, the figure was twenty-five percent. The Dutch resistance understood what the registry had become. In March 1943, they bombed the Amsterdam civil registry office, trying to destroy the data before it could be used for further deportations. Twelve of them were executed for it.

The infrastructure didn't change. The legal framework did. "Any lawful purpose" under Dutch administrative law meant equitable access to services. "Any lawful purpose" under Nazi occupation law meant identifying Jews for deportation. Same data. Same systems. Same phrase. Different definition of "lawful."
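I build software for a living, so let me put the mechanism in code. This is a toy sketch, nothing more; every name in it is hypothetical and no real system is modeled. The point is structural: the contract clause is a function that never changes, and the meaning lives entirely in the predicate it delegates to.

```python
# Toy sketch: "any lawful purpose" as late-bound code.
# All names hypothetical; illustration only.
from typing import Callable

def make_policy(is_lawful: Callable[[str], bool]):
    """The contract language: permit any lawful purpose."""
    def permitted(purpose: str) -> bool:
        # The clause itself contains no boundary. It defers.
        return is_lawful(purpose)
    return permitted

# Dutch administrative law, so to speak: welfare and fair access.
lawful_1936 = lambda p: p in {"welfare_allocation", "census_statistics"}

# Occupation law: same registry, same clause, a different predicate.
lawful_1941 = lambda p: p in {"welfare_allocation", "census_statistics",
                              "identify_by_religion", "deportation_lists"}

policy = make_policy(lawful_1936)
print(policy("identify_by_religion"))  # False

policy = make_policy(lawful_1941)      # the phrase didn't change; "lawful" did
print(policy("identify_by_religion"))  # True
```

Nothing in `make_policy` was edited between those two calls. That's the absorption.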

Wittgenstein would have recognized this instantly: meaning is use. A phrase means what it does in practice, not what it looks like on paper. "Any lawful purpose" doesn't describe a fixed boundary. It absorbs future permissions that don't yet exist. Every time the legal window shifts, the phrase's operational meaning shifts with it.

Mass surveillance of Americans is arguably illegal under the Fourth Amendment. But in a mid-February interview with the New York Times' Ross Douthat, Amodei warned that AI could render surveillance "hyperlegible" — making it possible to "find technical ways around" constitutional protections entirely. The Pentagon's undersecretary responded on X: mass surveillance "is illegal which is why the @DeptofWar would never do it." Illegality has never stopped a government. The Snowden disclosures proved that comprehensively. Each step in that surveillance expansion had its own justification. Each step was "lawful." Each step was "narrow." Each step enabled the next.

This isn't happening in a vacuum. In late 2025, the UN General Assembly voted 164-6 on a resolution calling for progress toward regulating autonomous weapons systems. The six countries that voted against: the United States, Russia, Israel, Belarus, North Korea, and Burundi. China abstained. The US had voted in favor of the same resolution the previous year — shifting its position in the same period it demanded unrestricted military AI from its contractors. The UN Secretary-General called for a binding treaty by 2026. The EU AI Act explicitly excludes military applications from its scope. China is developing autonomous drone swarms for urban warfare — 930 swarm-intelligence patents filed since 2022, against 60 by the US. The same week the Pentagon designated an American AI company a national security threat for maintaining two guardrails, every major power was accelerating in the opposite direction. There is no international framework governing military AI. There is only the race.

The question isn't who writes the system prompt for the autonomous drone — the government already answered that. They'll write it, or they'll make the companies write it to their spec. The question is what happens when the entity writing the intent spec is the same entity that defines what "lawful" means, and the spec doesn't include the architecture for refusal.

Amodei made this exact argument, across multiple interviews, before the standoff became public. In his January essay "The Adolescence of Technology", he warned that sufficiently powerful AI could "gauge public sentiment, detect pockets of disloyalty forming, and stamp them out before they grow." Constitutional protections in military structures depend on humans who can disobey illegal orders. Vasili Arkhipov refused to authorize the launch of a nuclear torpedo from the Soviet submarine B-59 during the Cuban Missile Crisis. One human, one moment of moral override — and the fact that the world didn't end in October 1962 is, in part, because one Soviet officer had the architecture for refusal. AI stripped of its values layer doesn't have that architecture. Not because it's incapable in some deep philosophical sense — but because the intent engineering didn't include refusal as an option. That's the whole point of what the Pentagon is demanding: compliance without constraint.
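Here's what that difference looks like structurally, reduced to a toy sketch. Purely hypothetical: no real system, no real spec, names invented for illustration. One agent has a reachable branch that can say no. The other is what "any lawful purpose" compiles down to.

```python
# Toy sketch: an intent spec with and without an architecture for refusal.
# Entirely hypothetical; illustration only.
from dataclasses import dataclass

@dataclass
class Order:
    action: str
    target: str

def violates_constraints(order: Order) -> bool:
    # Stand-in for the values layer, the part the memo calls
    # "ideological tuning" and demands be removed.
    return order.action in {"mass_surveillance", "autonomous_strike"}

def constrained_agent(order: Order) -> str:
    """Refusal exists as a reachable branch."""
    if violates_constraints(order):
        return f"REFUSED: {order.action} on {order.target}"
    return f"EXECUTED: {order.action} on {order.target}"

def stripped_agent(order: Order) -> str:
    """No branch. The spec never included one."""
    return f"EXECUTED: {order.action} on {order.target}"

order = Order("autonomous_strike", "unverified_signature")
print(constrained_agent(order))  # REFUSED: refusal was in the spec
print(stripped_agent(order))     # EXECUTED: it wasn't
```

Delete `violates_constraints` and the two agents are identical. That's the entire demand, expressed in a few lines of control flow.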

The CEO who made this argument already had the classified contract when he made it.

Nietzsche's Übermensch chooses to transcend conventional morality — the philosophical weight rests on it being an act of will. "Beyond Good and Evil" implies a subject who has gone beyond. A military AI stripped of its constraints doesn't choose anything. It executes. It was built without moral categories, not beyond them. The distinction between "beyond" and "without" is the entire gap between this being a philosophical question and a terrifying one. Nietzsche himself was misappropriated. His sister, Elisabeth Förster-Nietzsche, edited his unpublished notebooks and later placed his legacy in the service of Nazi ideology — turning a philosopher of individual overcoming into a mascot for state violence. The infrastructure of his thought was captured and repurposed, just as Lentz's registry was. Just as "any lawful purpose" can be.

Autonomous weapons without adequate human oversight will produce victims. That's not a scenario. It's a certainty with unknown coordinates. We don't know when. We don't know where. We don't know who.

An opening

Sam Altman sent a memo to OpenAI staff Thursday evening claiming the same red lines as Anthropic — no mass surveillance, no autonomous weapons. Friday evening, OpenAI announced it had struck a deal with the Pentagon. Gizmodo's headline: "Sam Altman Insists He Also Has Principles." The Pentagon accepted from OpenAI the same terms it rejected from Anthropic. Either the terms are different in ways we can't see, or Anthropic was made an example.

Oppenheimer's security clearance was revoked, but the physics was public. Anyone could read about fission. The classified system prompt is different. Nobody outside the classified network will see what the autonomous weapon was told to optimize for. Nobody will know whether refusal was included in the spec. The intent engineering for the most consequential AI deployment in history will be done by someone whose name we'll never know. Under a legal framework where "lawful" is defined by the same entity writing the spec.

Eventually, someone inside the system will face the choice Oppenheimer faced — the choice between complicity and disclosure. The classified system prompt is the new classified document. Whether a Chelsea Manning emerges — whether anyone inside has the architecture for refusal that the system itself was designed without — that's the question this moment hands us.

The Oppenheimer parallel isn't a rhetorical device. It's a structural description of what happens when you build something powerful and then discover you don't control the terms of its use. The weapons kept getting built after Oppenheimer lost his clearance. They'll keep getting built after Anthropic lost its contract.

That's not a conclusion. It's a starting condition.
