The Token Tax: What Changes and Why
GitHub is finally admitting what’s been obvious to anyone watching the AI-economics trainwreck: unlimited coding assistance for a flat monthly fee was never sustainable. Starting June 1, Copilot subscribers will operate under an “AI Credits” system that ties their bill directly to actual token consumption. Your quick autocomplete suggestion stays free, but that multi-hour agentic coding session where you let GPT-5.5 run wild? You’ll be paying API rates of up to $30 per million output tokens for the privilege. The company’s handwavy “premium requests” bucket is dead, replaced by cold, hard math based on input, output, and cached tokens per model.
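The per-token math is simple enough to sketch in a few lines. In the snippet below, only the $30-per-million output rate comes from the reporting; the input and cached-token rates, and the `session_cost` helper itself, are hypothetical illustrations of how metered billing of this shape works:

```python
# Sketch of token-metered billing with separate rates for input, output,
# and cached tokens. Only the output rate is from the article; the input
# and cached rates are made-up placeholders for illustration.

RATES_PER_MILLION = {   # USD per 1M tokens
    "input": 10.00,     # hypothetical
    "output": 30.00,    # the "up to $30 per million output tokens" figure
    "cached": 2.50,     # hypothetical
}

def session_cost(input_tokens: int, output_tokens: int, cached_tokens: int) -> float:
    """Estimate the bill for one session under token-metered pricing."""
    usage = {"input": input_tokens, "output": output_tokens, "cached": cached_tokens}
    return sum(tokens / 1_000_000 * RATES_PER_MILLION[kind]
               for kind, tokens in usage.items())

# A long agentic session can easily move millions of tokens:
print(f"${session_cost(4_000_000, 1_500_000, 2_000_000):,.2f}")  # → $90.00
```

The point the sketch makes concrete: under flat-fee pricing that $90 session cost the user nothing extra, which is exactly the exposure GitHub is now closing off.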
The Agentic Apocalypse: Why This Was Inevitable
Let’s call this what it is: a direct consequence of the agentic-AI boom. Leaked internal documents cited by Ed Zitron reportedly show Copilot’s week-over-week inference costs nearly doubling since January, perfectly timed with the rise of always-on assistant architectures like Openclaw that can burn through millions of tokens in a single session. GitHub claims the change “reduces the need to gate heavy users,” which is PR-speak for “we can no longer let freeloading agentic workflows bankrupt our margins.” The company already paused new signups last week, tightened limits, and yanked Claude Opus from Pro plans. This pricing shift isn’t a choice. It’s a survival mechanism.
The Coming Subscription Apocalypse
GitHub isn’t alone in this reckoning. Anthropic has started charging enterprise Claude customers for full compute costs and briefly tested removing Claude Code from the $20 Pro tier. The era of subsidized AI usage is collapsing under its own weight. Every major AI lab faces the same math: GPU scarcity plus effectively infinite demand equals zero margin on flat-rate subscriptions. Expect every AI assistant, from ChatGPT Enterprise to Google’s Gemini, to roll out similar usage caps or token-based billing within 18 months. The party is over. The meter is running.
Source: Ars Technica
