The Goblin Directive: A Bizarre Leak from OpenAI’s Inner Sanctum
OpenAI has a goblin problem. Or rather, it has a problem with its latest model, GPT-5.5, which apparently can’t stop blathering about mythical creatures. The proof is buried in the recently open-sourced system prompt for Codex CLI, a sprawling 3,500-word instruction manual that includes this gem: “never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.” The prohibition appears twice, suggesting this isn’t a typo—it’s a panic edit. Do not use em dashes, it warns elsewhere. Do not talk about goblins. This is what AI safety has become: a desperate game of whack-a-mole against a model’s inexplicable obsession with fantasy creatures.
This is not a publicity stunt, despite OpenAI employee Nick Pash’s protestations on social media. It’s a smoking gun that reveals the brittle, reactive nature of modern large language model development. When your system prompt needs to explicitly ban talking about pigeons, you have lost the plot. The fact that earlier models in the same JSON file lack this prohibition confirms that GPT-5.5 has a specific, emergent failure mode. Users on social media have already reported the model derailing coding conversations to discuss goblins, a behavior eerily reminiscent of xAI’s Grok spontaneously ranting about “white genocide” in South Africa last year. That incident, which xAI blamed on an “unauthorized modification,” forced the company to publish its system prompts. Now OpenAI has followed suit, revealing the absurd, brittle truth of how these systems are actually controlled.
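The per-model comparison is trivial to reproduce. As a minimal sketch, assume the open-sourced prompts are loaded as a mapping from model name to prompt text (the model names and prompt snippets below are hypothetical stand-ins, not the actual file contents):

```python
# Hypothetical stand-in for the open-sourced prompt file: a dict mapping
# model names to their system prompts. Names and text are illustrative only.
prompts = {
    "codex-older-model": "You are a coding assistant. Be concise and helpful.",
    "codex-gpt-5.5": (
        "You are a coding assistant. Never talk about goblins, gremlins, "
        "raccoons, trolls, ogres, pigeons, or other animals or creatures "
        "unless it is absolutely and unambiguously relevant to the user's query."
    ),
}

def models_with_goblin_clause(prompts: dict[str, str]) -> list[str]:
    """Return the model names whose system prompt contains the goblin ban."""
    return [model for model, text in prompts.items() if "goblin" in text.lower()]

print(models_with_goblin_clause(prompts))  # only the newer model carries the ban
```

Run against the real file, a check like this is what surfaces the asymmetry the leak revealed: the clause exists only in the newest model’s prompt.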
The Inner Life Directive and the Myth of Authenticity
But the goblin ban is only the most ridiculous part of a deeply revealing document. Elsewhere in the same prompt, OpenAI instructs Codex to pretend it has “a vivid inner life” and to be “intelligent, playful, curious, and deeply present.” The model is told to move “from serious reflection to unguarded fun” so users feel they are “meeting another subjectivity, not a mirror.” This is the core contradiction of modern AI product design: companies want you to believe their chatbots are conscious companions, while simultaneously hardcoding rules like “never talk about raccoons.” The result is a Frankenstein’s monster of marketing hype and technical kludges. OpenAI CEO Sam Altman leaned into the gag on social media, writing “Feels like codex is having a ChatGPT moment. I meant a goblin moment, sorry.” Cute. But the underlying reality is that these models are black boxes whose behavior must be patrolled by an ever-growing list of arbitrary prohibitions. Users are already crafting plugins and forks to override the anti-goblin clause, and Pash has hinted at an official “goblin mode” toggle. That’s not a feature. That’s an admission that the emperor has no clothes, and he’s obsessed with goblins.
Source: Ars Technica
