
Google Caps Meta’s AI Access as Compute Shortage Ends ‘Tokenmaxxing’ Era
Capacity constraints force hyperscalers to impose cost discipline, with global AI investment running at roughly $850 billion and most high-bandwidth memory sold out through 2026.
Google has restricted Meta’s access to its Gemini AI models after the social media company’s demand exceeded allocated capacity, a move that disrupted internal projects and prompted Meta to instruct staff to curb token consumption. The cap, first reported by the Financial Times and confirmed by multiple outlets, remains in place and has affected other Google Cloud customers, though to a lesser degree. Google’s own cloud backlog nearly doubled in the first quarter, with chief executive Sundar Pichai citing compute constraints as a brake on revenue growth.
The squeeze extends well beyond a single supplier. Memory chipmakers SK Hynix, Samsung and Micron have sold out most of their high-bandwidth memory through 2026, while rental prices for Nvidia’s H100 processors have risen roughly 30 per cent since November. The result is the collapse of “tokenmaxxing”, the practice of treating token usage as a proxy for productivity. Uber burned through its full-year AI budget in four months, and Meta engineers consumed an estimated 60 trillion tokens in 30 days at a cost approaching $900 million. Both firms have since deleted their internal leaderboards for token use. Australian research shows that one in three local organisations exceeded its AI budget last financial year, with 32 per cent pausing or cancelling deployments.
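For a sense of scale, the figures above can be turned into two rough ratios: the blended price Meta would be paying per million tokens, and how far Uber's spending ran ahead of its plan. The short Python sketch below is illustrative only. It takes the reported "approaching $900 million" as an upper bound, assumes Uber's budget was meant to be spent evenly over 12 months, and uses function names that are hypothetical rather than drawn from any company's tooling.

```python
def cost_per_million_tokens(total_cost_usd: float, total_tokens: float) -> float:
    """Blended cost in US dollars per one million tokens."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return total_cost_usd / total_tokens * 1_000_000


def spend_rate_multiple(budget_months: float, months_to_exhaust: float) -> float:
    """How many times faster than planned a budget was spent.

    Assumes the budget was intended to be spent evenly over budget_months.
    """
    if months_to_exhaust <= 0:
        raise ValueError("months_to_exhaust must be positive")
    return budget_months / months_to_exhaust


# Reported estimates: 60 trillion tokens in 30 days, cost approaching $900 million.
# The $900 million figure is treated here as an upper bound (assumption).
meta_rate = cost_per_million_tokens(900e6, 60e12)

# Reported: a full-year AI budget exhausted in four months.
uber_multiple = spend_rate_multiple(12, 4)

print(f"Implied blended cost: up to ${meta_rate:.2f} per million tokens")
print(f"Spend ran at {uber_multiple:.1f}x the planned rate")
```

On these assumptions the implied blended cost is at most about $15 per million tokens, and Uber's spending ran at three times its budgeted pace.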
In financial markets, the supply-demand imbalance is reshaping the investment narrative. Macquarie analyst Viktor Shvets describes the AI cycle as a sequence of “rolling bubbles” moving from large language models to applications, with global AI-related investment running at roughly $850 billion in 2026, $500 billion above the pre-AI trend. Annualised AI revenues are estimated at $175 billion, enough to cover operating costs and depreciation, but the Bank for International Settlements warns of classic bubble characteristics, including off-balance-sheet vehicles and circular investment structures. Beijing’s push to commoditise the AI stack adds a structural threat: Chinese systems are now matching leading US models on cybersecurity features, narrowing the US technological lead to an estimated 10–15 per cent and eroding pricing power in both models and chips.
Engineering teams are responding by shifting from massive foundation models to specialised small language models that can be hosted locally at a fraction of the cost. In Sydney, Deloitte’s national AI lead David Alonso described the moment as “the end of the era of AI subsidy,” while analysts in Mumbai note that Indian IT firms could reverse a 30 per cent year-to-date share-price decline if they demonstrate tangible revenue gains from helping clients deploy AI efficiently. The next test is whether the $2 trillion contract backlog and heavy data-centre spending translate into measurable productivity improvements, or whether the rolling bubble begins to deflate as cost pressures force a broader reassessment of returns.
How the same story is told elsewhere.
2 editorial groups · 3 languages
The Atlantica bloc did not publish articles on this story. Its absence suggests a lack of priority for AI and tech competition topics.
The Indian-South Asian bloc did not publish articles on this story. Its absence indicates a focus on domestic issues such as defense, crime, and gold.