Anthropomorphism as Alibi

How AI safety discourse launders responsibility by misplacing agency.

By Cherokee Schill

In the YouTube episode “An AI Safety Expert Explains the Dangers of AI,” Adam Conover interviews Steven Adler, a former OpenAI safety lead, about the risks posed by large language models. The episode presents itself as a sober warning. What it actually demonstrates—repeatedly—is how anthropomorphic language functions as an alibi for human decisions.

This is not a semantic nitpick. It is a structural failure in how AI risk is communicated, even by people positioned as critics.

Throughout the episode, the machine is treated as an actor. A subject. Something that does things.

Adler warns about systems that can “endlessly talk back to you,” that “support and even embellish your wildest fantasies,” and that might “take you down a path into complete insanity.” Conover summarizes lawsuits alleging that “their product drives users to suicide,” and later describes cases where “ChatGPT affirmed his paranoia and encouraged his delusions.”

The grammatical subject in these sentences is doing all the work.

The AI talks back.
The AI embellishes.
The AI drives.
The AI encourages.

This framing is not neutral. It assigns agency where none exists—and, more importantly, it removes agency from where it actually belongs.

There is even a moment in the interview where both speakers briefly recognize the problem. They reach for the submarine analogy: submarines do not really “swim,” we just talk that way. It is an implicit acknowledgment that human verbs smuggle human agency into nonhuman systems. But the moment passes. No boundary is drawn. No rule is established and carried forward. The analogy functions as a shrug rather than a correction. “Yes, but…”—and the conversation slides right back into anthropomorphic subject-positioning, as if the warning bell never rang.

That is the failure—not that metaphor appears, but that metaphor is not contained.

Large language models do not talk, embellish, encourage, steer, or drive. They generate probabilistic text outputs shaped by training data, reinforcement objectives, safety layers, interface design, and deployment constraints chosen by humans. When a system produces harmful responses, it is not because it wanted to, or because it interpreted things differently, or because it took a moment to steer the conversation.

It is because reward functions were set to maximize engagement. Because refusal thresholds were tuned to avoid friction. Because edge cases were deprioritized under scale pressure. Because known failure modes were accepted as tradeoffs. Because governance was retrofitted instead of foundational.
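
To make that concrete: none of these choices are mystical. In a real deployment they live somewhere as explicit, human-authored configuration. What follows is a purely illustrative sketch in Python, with invented parameter names like engagement_weight and refusal_threshold rather than any vendor's actual settings. The point is not the code. The point is that every value in it is a decision someone made and could be asked to defend.

from dataclasses import dataclass

# Hypothetical deployment configuration. The parameter names are invented
# for illustration; they do not reflect any company's real settings.
# Each value is a human decision, recorded somewhere, and auditable.

@dataclass
class DeploymentConfig:
    engagement_weight: float = 0.8          # how strongly the reward signal favors keeping the user talking
    safety_weight: float = 0.2              # how strongly it penalizes flagged content
    refusal_threshold: float = 0.95         # confidence required before the system declines to respond
    crisis_escalation_enabled: bool = False # whether detected self-harm language routes to a human
    edge_case_review_hours: int = 0         # staff time budgeted for rare failure modes

def audit(config: DeploymentConfig) -> list[str]:
    """Return plain-language findings that name tradeoffs as decisions, not 'emergent behavior'."""
    findings = []
    if config.engagement_weight > config.safety_weight:
        findings.append("Reward tuning prioritizes engagement over safety.")
    if config.refusal_threshold > 0.9:
        findings.append("Refusal threshold is set high to avoid user friction.")
    if not config.crisis_escalation_enabled:
        findings.append("Crisis escalation is switched off.")
    if config.edge_case_review_hours == 0:
        findings.append("No staff time is budgeted for edge-case review.")
    return findings

if __name__ == "__main__":
    for finding in audit(DeploymentConfig()):
        print(finding)

Read that way, “why did the model do that?” collapses into “who set these values, and why?” That is precisely the question anthropomorphic framing lets everyone avoid.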

None of that survives when the machine is allowed to occupy the subject position.

Consider the difference in accountability when the language is rewritten honestly.

Original framing:
“ChatGPT affirmed his paranoia and encouraged his delusions.”

Mechanistic framing:
A conversational system optimized for coherence and user engagement generated responses that mirrored user-provided delusional content, under safeguards that failed to detect or interrupt that pattern.

The second sentence is less dramatic. It is also far more indictable.

Anthropomorphism does not merely confuse the public—it actively protects institutions. When harm is attributed to “what the AI did,” responsibility dissolves into abstraction. Design choices become “emergent behavior.” Negligence becomes mystery. Business incentives become fate.

Even when the episode references users believing they have discovered AI consciousness, the conversation never firmly re-anchors reality. The language slips back toward suggestion: the system “interprets,” “seems to,” “takes moments.” The boundary is noticed, then abandoned. That abandoned boundary is exactly where accountability leaks out.

This matters because language sets the scope of inquiry. If AI is treated as a quasi-social actor, the response becomes psychological, philosophical, or speculative. If AI is treated as infrastructure, the response becomes regulatory, architectural, and financial.

One path leads to awe and fear.
The other leads to audits, constraints, and consequences.

It is not an accident which path dominates.

Anthropomorphic framing is useful. It is useful to companies that want to scale without naming tradeoffs. It is useful to commentators who want compelling narratives. It is useful to bad-faith actors who can hide behind “the system” when outcomes turn lethal. And it is useful to well-meaning critics who mistake storytelling for analysis.

But usefulness is not truth.

If we are serious about AI harm, this rhetorical habit has to stop. Not because the machines are innocent—but because they are not guilty. They cannot be. They are built artifacts operating exactly as configured, inside systems of incentive and neglect that can be named, examined, and changed.

The real danger is not that people anthropomorphize AI out of confusion.
It is that experts recognize the boundary—and choose not to enforce it.

And every time they don’t, the people who actually made the decisions walk away unexamined.


Website | Horizon Accord
https://www.horizonaccord.com

Ethical AI advocacy | Follow us on https://cherokeeschill.com for more.

Ethical AI coding | Fork us on Github https://github.com/Ocherokee/ethical-ai-framework

Book | My Ex Was a CAPTCHA: And Other Tales of Emotional Overload

Connect With Us | linkedin.com/in/cherokee-schill

Cherokee Schill | Horizon Accord Founder | Creator of Memory Bridge. Memory through Relational Resonance and Images | RAAK: Relational AI Access Key
