The story technology tells about itself is almost always the same story: it begins expensive and becomes cheap, and as it becomes cheap, it spreads. The transistor. The personal computer. Mobile data. Cloud storage. Each wave confirmed the pattern. Prices fell. Access broadened. The digital divide remained real but its direction was clear, and time was on the side of the latecomers. Patience was the cost of entry.

Artificial intelligence is being told through the same story, and the numbers seem to support it. The cost of running a large language model has fallen by a factor of roughly 1,000 in four years. Research from Andreessen Horowitz and Epoch AI independently documents this: in November 2021, GPT-3 class performance cost $60 per million tokens; by late 2025, equivalent capability could be obtained for around $0.06. A reduction of three orders of magnitude in four years, faster than any comparable decline in the history of computing.

It is a genuinely remarkable technical and economic achievement. And it is completely insufficient as a guide to whether AI will broaden or narrow global economic opportunity. Because the cost per token is not the cost of being productively competitive with AI. To understand that cost, you have to look at what happened to consumption.

I   The Regression

The Price That Fell and the Bill That Rose

Here is the price history of frontier AI performance, measured in cost per million tokens at equivalent benchmark capability:

Cost per million tokens at equivalent frontier performance level, 2021 to 2026

  Nov 2021     GPT-3, state of the art at public launch           $60.00 /M
  Mar 2023     GPT-4 launch: new frontier, premium pricing        $30–60 /M
  Nov 2023     GPT-4 Turbo: 50% input token price reduction       $10–15 /M
  May 2024     GPT-4o: multimodal, halved again vs Turbo          $4–5 /M
  Jan 2025     DeepSeek R1: 90% below incumbent pricing           $0.55 /M
  Early 2026   GPT-4-equivalent capability, cheapest provider     $0.04–0.40 /M

Epoch AI's research finds prices declining between 9x and 900x per year depending on the performance benchmark, with a median of 50x per year, accelerating after 2024 to a median rate of 200x per year. On a per-unit basis, AI has never been cheaper. Now look at what happened to consumption over the same period.

Weekly token volume growth, 2024 to 2025: 3,800%
Year-over-year increase in tracked token consumption measured by OpenRouter. Growth was steady through 2024, then accelerated sharply in January 2025.

Google AI monthly token processing, Oct 2025: 1.3 quadrillion
Tokens per month across all Google AI surfaces, up from 480 trillion in May 2025: more than doubling in five months.

Microsoft Azure AI, Q3 2025: 100 trillion
Tokens processed in the quarter, 5x year-over-year, including a record 50 trillion in a single month, per Satya Nadella's earnings commentary.

Average tokens per request: 4x
Average input token length per request grew roughly fourfold between early 2024 and late 2025. Reasoning models add a further 100x multiplier for complex tasks.

Token prices have collapsed. Token consumption has exploded. And the volume of tokens required per unit of work has grown sharply, as reasoning models, agentic workflows, and longer context windows all multiply the tokens consumed per productive session. The total cost of operating at the AI frontier has not tracked the per-token price curve. For many serious users, it has moved in the opposite direction.
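
A back-of-envelope model makes the divergence concrete. The rates in the Python sketch below are illustrative assumptions in the spirit of the figures above, not sourced data; the point is the ratio between them.

    # Illustrative only: total spend when per-token price falls 10x per
    # year but per-seat consumption grows 40x per year. Both rates and
    # the starting figures are assumptions for the sketch.

    price_per_m = 10.00         # USD per million tokens, year 0 (assumed)
    tokens_m_per_month = 5.0    # million tokens/month, heavy user (assumed)

    PRICE_DECLINE = 10          # per-token price divides by this each year
    USAGE_GROWTH = 40           # tokens consumed multiply by this each year

    for year in range(4):
        spend = price_per_m * tokens_m_per_month
        print(f"year {year}: ${price_per_m:,.4f}/M x "
              f"{tokens_m_per_month:,.0f}M tokens = ${spend:,.2f}/month")
        price_per_m /= PRICE_DECLINE
        tokens_m_per_month *= USAGE_GROWTH

    # Net effect: monthly spend multiplies by USAGE_GROWTH / PRICE_DECLINE
    # = 4x per year, even though each token is 10x cheaper every year.

Whatever the true rates turn out to be, the structure holds: total spend tracks the ratio of consumption growth to price decline, not the price curve alone.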

II   The Floor

The Subscription Floor and the Global Wage Ceiling

For individual professionals, the economics of AI access converge not on API pricing but on subscription tiers. Those tiers carry a structural property that the falling token price curve does not: they are denominated in US dollars and priced globally at the same rate.

Anthropic's Claude Max plan, currently $100 per month for the 5x usage tier and $200 for the 20x tier, is designed for knowledge workers who use AI as a primary professional tool. At $200 per month, real-world developer testing has documented months where equivalent API consumption would have cost over $5,600. For heavy daily users, the subscription floor is the only commercially rational option. You choose it not because it is cheap, but because the pay-as-you-go alternative is considerably more expensive.

For a US knowledge worker, $100 per month sits at a familiar and comfortable price point. It is roughly what Americans spent on a standalone cable TV subscription before the cord-cutting era, a cost that millions absorbed into household budgets without much deliberation. It is the price of a modest dinner for two at a mid-range restaurant. In the context of a US junior developer salary of around $75,980 per year, it is 1.6% of gross monthly income: the definition of immaterial.

The table below shows what that same fixed cost represents against entry-level developer salaries across a range of economies:

Claude Max 5x plan ($100 / month USD) vs. entry-level software developer monthly salary

  Country          Monthly salary (USD equiv.)   Max as % of salary   Purchasing power equivalent
  United States    ~$6,330 / mo                  1.6%                 A modest dinner for two
  United Kingdom   ~$3,220 / mo                  3.1%                 A standard professional subscription
  Singapore        ~$4,140 / mo                  2.4%                 Within normal range for work tooling
  India            ~$644 / mo                    15.5%                US equivalent: ~$980 / month
  Philippines      ~$661 / mo                    15.1%                US equivalent: ~$955 / month
  Vietnam          ~$614 / mo                    16.3%                US equivalent: ~$1,030 / month
  Nigeria          ~$605 / mo                    16.5%                US equivalent: ~$1,045 / month
Sources: Gini Talent Global Software Engineer Salary Guide 2025, TalentJDI Developer Salary Comparison 2025, US Bureau of Labor Statistics 2025. Monthly figures from annual averages. Purchasing power equivalents proportional to US junior developer annual salary of $75,980 (BLS).

For an entry-level developer in Nigeria, Vietnam, the Philippines, or India, that same $100 represents between fifteen and seventeen percent of their entire monthly income. The purchasing power equivalent, mapped against US salary levels, sits between $955 and $1,045 per month. No professional tooling subscription in the US market at that price point is marketed as a standard productivity upgrade. That is the cost range of specialised legal research platforms and advanced trading terminals. Not an AI reasoning assistant.
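
The purchasing power mapping in the table is simple proportional arithmetic. A minimal sketch, using the salary figures from the table and the BLS baseline cited in the sources note:

    # Purchasing power equivalent of a flat $100/month subscription,
    # scaled proportionally against the US junior developer baseline.

    US_BASELINE_MONTHLY = 75_980 / 12   # BLS junior developer annual / 12
    SUBSCRIPTION = 100.0                # Claude Max 5x tier, USD/month

    monthly_salaries = {                # USD equivalents from the table above
        "India": 644,
        "Philippines": 661,
        "Vietnam": 614,
        "Nigeria": 605,
    }

    for country, salary in monthly_salaries.items():
        share = SUBSCRIPTION / salary              # fraction of local income
        us_equiv = share * US_BASELINE_MONTHLY     # same share of US income
        print(f"{country}: {share:.1%} of salary, "
              f"~${us_equiv:,.0f}/month in US terms")

The outputs reproduce the table's right-hand column to within rounding.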

For a developer in San Francisco, Claude Max costs less than a restaurant dinner. For a developer in Lagos, Hanoi, or Manila, the purchasing power equivalent is over $1,000 a month. The tool is identical. The economics are not in the same world.

III   The Arbitrage Inversion

When Tokens Cost More Than the Labour They Replace

The global technology industry has been built, in significant part, on labour arbitrage: the ability to hire skilled technical workers in lower-cost economies and deploy their output in higher-value markets. The salary differentials in the table above are not a side effect of globalisation. They are its mechanism. A developer in Bangalore or Lagos providing services to a company in London or New York generates margin from the gap between the two wage levels. This has been the structural logic of the offshore software industry for three decades.

Token economics introduce a direct challenge to this logic, and it operates in a direction that is not yet widely discussed. When an entry-level developer in India earns approximately $644 per month, their effective daily labour cost, inclusive of overheads, is in the range of $25 to $35. A developer using AI tools intensively, running extended reasoning chains, multi-file code analysis, and iterative generation across a working day, can consume tokens whose cost at API-equivalent rates approaches or exceeds that daily labour cost. At the subscription tier, the monthly cost of AI tooling for that developer represents 15% of their salary: a cost their employer in a high-income market would expense without registering, but one that radically alters the unit economics of the offshore model when multiplied across a team.
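
A rough sketch of where that crossover sits. Every figure below except the salary is an assumption for illustration; actual agentic token volumes vary enormously by workload.

    # Crossover check: daily token spend vs. daily labour cost for an
    # intensively AI-augmented developer. Assumed figures throughout,
    # except the salary, which is the table's entry-level India figure.

    MONTHLY_SALARY = 644       # USD/month
    WORKING_DAYS = 21
    OVERHEAD_LOADING = 1.10    # assumed multiplier for employment overheads

    daily_labour = MONTHLY_SALARY / WORKING_DAYS * OVERHEAD_LOADING

    # A heavy agentic day: long-context inputs plus reasoning-model output.
    tokens_m_per_day = 15      # million tokens/day, API-equivalent (assumed)
    blended_rate = 2.00        # USD per million tokens, blended (assumed)

    daily_tokens = tokens_m_per_day * blended_rate

    print(f"labour ~${daily_labour:.0f}/day  vs  tokens ~${daily_tokens:.0f}/day")
    # Under these assumptions the token bill (~$30/day) already sits inside
    # the $25-35 labour cost band; heavier agentic use pushes it past.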

The inversion is not yet universal, and token prices continue to fall. But the consumption ceiling does not hold still. As AI capability advances and agentic workflows consume tokens at volumes that dwarf today's usage, there is a realistic near-term scenario where the token cost of a fully augmented developer in a low-wage market rivals, or exceeds, the cost of the labour being augmented. At that point, the economic logic of arbitrage collapses from below: you are not hiring cheaper labour to perform expensive work, you are buying expensive cognition to deploy through cheaper hands.

This creates a structural pressure that will force a reckoning with how token budgets are allocated. And that reckoning reveals the more profound economic shift: the question of which work deserves token spend, and which does not, is rapidly becoming the most important operational decision in knowledge work.

IV   The Allocation Problem

Labour Rises When It Learns to Direct the Machine

As token costs become a material line item, the value of human judgement about where to deploy those tokens rises in step. This is not a consolation for displaced workers. It is a genuine structural shift in what human labour is for in an AI-augmented economy, and it carries significant implications for who captures value in the new order.

The crude version of the AI labour story is substitution: machines replace humans for cognitive tasks, wages fall for those whose work is automated. The more accurate version is stratification. The workers who understand which problems deserve intensive token spend, which decisions require multi-step reasoning, which outputs justify the cost of the most capable models, and which tasks should be routed to cheaper inference or handled by humans entirely, are performing a function that AI cannot efficiently perform on itself at scale. They are optimising the allocation of cognitive resources. That is, structurally, what management has always been. AI does not eliminate the need for that judgement. It elevates its economic importance and concentrates its reward.

The implication for global labour markets is pointed. The ability to develop token allocation judgement requires exposure to frontier AI tools operating at significant depth and volume. You cannot learn to optimise what you cannot afford to consume. Workers in markets where the cost of full AI access represents a material fraction of income are not merely disadvantaged in productivity terms. They are disadvantaged in developing the specific meta-skill that will command premium value in the next decade of knowledge work. The access gap compounds into a capability gap, and the capability gap compounds into a wage gap that is structural rather than cyclical.

Meanwhile, the optimisation pressure itself will drive price differentiation. Enterprises in high-income markets will route complex, high-stakes work to the most capable and expensive models, and standardise routine tasks on cheaper inference. The model they develop for intelligent token routing will itself become a competitive asset. The organisations without the scale or capital to develop that routing intelligence will default to one-size consumption and pay a structural premium relative to their output. The efficiency gains of cheaper tokens will accrue unevenly, weighted toward the institutions sophisticated enough to exploit them.
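
In schematic form, that routing asset looks something like the sketch below. The tier names, prices, and thresholds are invented for illustration; a production router would learn these boundaries from outcome data rather than hard-code them.

    # Schematic cost-aware model router. Tiers, prices, and thresholds
    # are illustrative inventions, not any vendor's actual pricing.

    from dataclasses import dataclass

    @dataclass
    class Tier:
        name: str
        price_per_m: float   # USD per million tokens (illustrative)

    FRONTIER = Tier("frontier-reasoning", 15.00)
    MIDRANGE = Tier("mid-capability", 1.00)
    BUDGET = Tier("small-model", 0.06)

    def route(stakes: float, complexity: float) -> Tier:
        """Route by expected value: high-stakes, complex work earns
        expensive tokens; routine work goes to cheap inference."""
        if stakes * complexity > 0.5:
            return FRONTIER
        if complexity > 0.3:
            return MIDRANGE
        return BUDGET

    # A contract-review task vs. a boilerplate-formatting task:
    for stakes, complexity in [(0.9, 0.8), (0.2, 0.1)]:
        tier = route(stakes, complexity)
        print(f"stakes={stakes}, complexity={complexity} -> "
              f"{tier.name} (${tier.price_per_m}/M tokens)")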

V   The Shape

A Divide That Compounds With Every Capability Improvement

The falling cost of a token is not the same as the falling cost of AI advantage. One measures the price of a unit of computation. The other measures what it costs, month after month, to operate at the productivity level the current frontier makes possible. Those two numbers have diverged, and the divergence is structural rather than temporary. Token consumption grows faster than token costs fall. The Jevons mechanism, in which efficiency gains expand total consumption rather than shrinking total spend, is not a phase. It is the operating condition of a technology whose applications expand continuously with its capability.

Each advance in model capability lifts the access floor alongside the capability ceiling. The minimum viable level of AI access for competitive knowledge work rises with the frontier, because the disadvantage of inferior tooling compounds continuously against peers who are not resource-constrained. This is categorically different from previous technology cycles, where progress was additive: the frontier moved while the affordable minimum held. With AI, the two move together.

What is taking shape is not a binary divide between those who have AI and those who do not. It is a stratified spectrum of access, with total productive capability correlating with the financial capacity to sustain high and growing token consumption. That spectrum will map, with uncomfortable precision, onto existing economic geography. And unlike the digital divide of the 1990s, which was a divide between connected and unconnected, this is a divide between richly consuming and partially consuming, at a moment when the distance between those positions is expanding rather than closing.

The history of infrastructure economics offers partial precedents. Electricity, telecommunications, and broadband all began as commercial products and became public goods, over decades and through contested transitions. The token economy is at an earlier and more fluid stage than any of those transitions had reached at the equivalent point in their development. Its final topology is not fixed. But the forces shaping it are already visible, already compounding, and already producing outcomes that become harder to reverse with each passing quarter of widening distance.

The PC era asked developing economies to cross a hardware threshold and then stay there. Once crossed, the tools were owned. The token economy asks them to sustain a recurring cost that rises with ambition, scales with competitiveness, and is priced in a currency entirely uncoupled from local wages. That is a different ask. And the labour arbitrage model that has underpinned decades of global technology development sits directly in the path of a mechanism whose consequences have not yet been priced into expectations of how that model will evolve.

Recognising the paradox of falling prices and rising access costs is the first step. The next one is considerably harder, and there is not yet much sign that the institutions capable of addressing it have begun to take it seriously.