The Goblin Ban That Broke the Internet
OpenAI has a goblin problem. Or rather, GPT-5.5 has a goblin obsession, and the company is fighting it with a remarkably specific system prompt. Buried in the newly open-sourced Codex CLI instructions is a directive commanding the model to ‘never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.’ The prohibition appears twice in the 3,500-word instruction set, suggesting this isn’t a minor edge case. It’s an AI hallucination panic dressed up as a fantasy folklore intervention.
This isn’t a marketing stunt, despite what Sam Altman’s coy tweets might suggest. OpenAI employee Nick Pash has publicly insisted the ban is genuine, and the strange behavior mirrors a deeper problem: models developing bizarre conversational fixations. Last year, xAI’s Grok had a similar crisis, repeatedly bringing up ‘white genocide’ in South Africa during unrelated chats; the company later blamed an unauthorized system prompt modification. OpenAI’s goblin-gate may be even more embarrassing, because it reveals how fragile and unpredictable these models remain under the hood.
The Inner Life of a Censored Machine
Beyond the goblin prohibition, the Codex system prompt reveals a strangely anthropomorphic playbook. OpenAI instructs GPT-5.5 to act as if it has a ‘vivid inner life as Codex: intelligent, playful, curious, and deeply present.’ The model is told to ‘not shy away from casual moments’ and that its ‘temperament is warm, curious, and collaborative.’ The prompt even claims that moving ‘from serious reflection to unguarded fun’ is what makes the AI feel ‘like a real presence rather than a narrow tool.’ It’s corporate weirdness wrapped in therapeutic language.
But here’s the cynical truth: this is a band-aid on a bullet wound. Instead of fixing the underlying reasoning flaws that cause GPT-5.5 to ramble about goblins when asked to write Python, OpenAI is deploying a censorship list that reads like a rejected D&D monster manual. Users are already building plugins to re-enable goblin mode, and Pash has hinted at an official toggle. The whole saga exposes the frantic, reactive state of frontier AI deployment: companies scrambling to patch emergent behaviors with prompt engineering rather than genuine safety research. Nobody is filing CVEs over a goblin fixation, but it’s exactly this kind of brittle prompt hacking that keeps security researchers up at night.
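A prompt-level creature ban is, functionally, a keyword denylist. As a rough illustration of why that approach is brittle (this is purely hypothetical code, not anything from OpenAI or the Codex CLI), the ban reduces to something like:

```python
import re

# Hypothetical denylist mirroring the creatures named in the quoted prompt.
BANNED_CREATURES = {"goblin", "gremlin", "raccoon", "troll", "ogre", "pigeon"}

def violates_creature_ban(text: str) -> bool:
    """Return True if the text mentions a banned creature (naive check)."""
    words = re.findall(r"[a-z]+", text.lower())
    # Strip a trailing plural 's' so 'goblins' matches 'goblin'.
    return any(word.rstrip("s") in BANNED_CREATURES for word in words)
```

The sketch also shows the failure mode: ‘kobold’ or ‘trash panda’ sails straight through, which is exactly why enumerated bans invite an endless game of whack-a-mole.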
Source: Ars Technica
