Xiaomi launched its MiMo model in April 2026 at $0.10 per million input tokens. OpenAI charges roughly $3 per million input tokens for its frontier models. That is a 30x price gap for what many enterprises treat as a roughly interchangeable service — text generation, summarization, code assistance, translation.

The sticker shock is real, but the more interesting story is structural. Chinese AI companies are not discounting from a position of weakness. They are engineering a new cost floor through subsidized electricity, aggressive model architecture choices, and a deliberate strategy to commoditize inference before Western companies can establish pricing power. The question is whether this strategy destroys more value for Chinese companies than it captures.

The Price Table That Started the Conversation

Current inference pricing for leading Chinese and American models tells a clear story:

Model | Input ($/M tokens) | Output ($/M tokens) | Notes
Xiaomi MiMo | $0.10 | (not listed) | Lowest reported price in the market
ByteDance Doubao | $0.11 | $0.275 | 62.7% below industry average
DeepSeek V4 Flash | $0.14 | $0.28 | Budget tier
DeepSeek V4 Pro | $0.435 | $0.87 | Premium tier, cache miss pricing
MiniMax M2.5 | $0.30 | $1.20 | 25.4% gross margin
OpenAI GPT-class | ~$3.00 | ~$15.00 | US frontier pricing
Anthropic Claude-class | ~$3.00 | ~$15.00 | US frontier pricing

Jefferies estimates Chinese models average one-sixth the US price per token across the board. On OpenRouter, the open inference marketplace, Chinese models now account for 4.12 trillion weekly tokens versus 2.94 trillion for American models. That is roughly 58% of the combined Chinese and American volume, a share that has shifted dramatically since late 2025.
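The ratios behind these comparisons are simple to reproduce. The sketch below uses only the input prices and OpenRouter volumes quoted above; the variable and function names are illustrative, and the share is computed over Chinese and American volume only, which is an assumption about how the comparison is framed.

```python
# Input prices in USD per million tokens, as quoted in the price table.
INPUT_PRICE_USD_PER_M = {
    "Xiaomi MiMo": 0.10,
    "ByteDance Doubao": 0.11,
    "DeepSeek V4 Flash": 0.14,
    "MiniMax M2.5": 0.30,
    "DeepSeek V4 Pro": 0.435,
}
US_FRONTIER_INPUT = 3.00  # approximate figure quoted for OpenAI/Anthropic


def price_multiple(us_price: float, other_price: float) -> float:
    """Return how many times higher the US price is than the other price."""
    return us_price / other_price


def share(part: float, *others: float) -> float:
    """Return part as a fraction of part plus all other values."""
    return part / (part + sum(others))


multiples = {
    name: price_multiple(US_FRONTIER_INPUT, price)
    for name, price in INPUT_PRICE_USD_PER_M.items()
}
# Weekly OpenRouter tokens in trillions: Chinese models vs. American models.
openrouter_cn_share = share(4.12, 2.94)

for name, multiple in multiples.items():
    print(f"{name}: US frontier input price is {multiple:.1f}x higher")
print(f"Chinese share of combined volume: {openrouter_cn_share:.1%}")
```

On these inputs the multiples run from about 7x (DeepSeek V4 Pro) to 30x (Xiaomi MiMo), which brackets the one-sixth average Jefferies cites.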

The volume gap is widening fast. China's daily AI token consumption hit 140 trillion in March 2026, up from 100 billion in early 2024, a roughly 1,400-fold increase in about two years. As our analysis of China's 140 trillion daily tokens documented, this growth is driven by broad-based adoption across manufacturing, government, and small business, not just tech startups.
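To put that growth in perspective, the two data points can be turned into an implied compound monthly rate. This is a back-of-the-envelope sketch that assumes smooth exponential growth over a 24-month window, which real adoption curves rarely follow.

```python
import math

start_tokens_per_day = 100e9   # early 2024, as quoted
end_tokens_per_day = 140e12    # March 2026, as quoted
months = 24                    # assumed length of the period

growth_multiple = end_tokens_per_day / start_tokens_per_day
# Constant monthly rate that would produce the same overall multiple.
monthly_growth = growth_multiple ** (1 / months) - 1
# Months needed to double at that constant rate.
doubling_months = math.log(2) / math.log(1 + monthly_growth)

print(f"Overall multiple: {growth_multiple:,.0f}x")
print(f"Implied monthly growth: {monthly_growth:.1%}")
print(f"Implied doubling time: {doubling_months:.1f} months")
```

Under that assumption, consumption would have grown by about 35% per month, doubling a little more often than once a quarter.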

Three Structural Enablers of China's Cost Advantage

The price gap is not arbitrary. Three distinct structural factors make it possible, and understanding them is essential for predicting where pricing goes next.

Government-subsidized electricity. Data centers using domestically produced chips, specifically Huawei Ascend and Cambricon processors, receive up to 50% reductions in electricity costs, according to Reuters reporting. China generates roughly twice the electricity of the United States, and Goldman Sachs projects the country will have approximately 400 GW of spare power capacity by 2030, roughly three times global data center demand. Some Chinese data centers pay less than half of what their American counterparts pay for power, the single largest operational cost for inference workloads. The subsidy is not subtle: it is a deliberate industrial policy designed to make domestic chip-based inference economically viable even when those chips underperform NVIDIA equivalents.

Architectural efficiency. Chinese model developers have pursued Mixture-of-Experts architectures more aggressively than their Western counterparts. DeepSeek V4 uses a sparse MoE design that activates only a fraction of total parameters for any given query, dramatically reducing compute per token. Xiaomi's MiMo follows a similar approach. The result: comparable output quality at a fraction of the per-token compute cost. This architectural choice was not just a performance optimization — it was a survival strategy. With limited access to the most advanced NVIDIA GPUs, Chinese developers had to extract more inference efficiency from less capable hardware. The constraint produced an innovation that now gives them a cost edge even where hardware is not the bottleneck.
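A rough sketch of why sparse activation lowers per-token compute. It relies on the common rule of thumb that a forward pass costs about 2 FLOPs per active parameter per token. The parameter counts are hypothetical placeholders chosen for illustration, not published figures for DeepSeek V4 or MiMo.

```python
def forward_flops_per_token(active_params: float) -> float:
    """Approximate forward-pass FLOPs per token (about 2 per active parameter)."""
    return 2.0 * active_params


# Hypothetical model sizes, for illustration only.
TOTAL_PARAMS = 600e9   # all experts combined
ACTIVE_PARAMS = 30e9   # parameters actually used for a single token

dense_flops = forward_flops_per_token(TOTAL_PARAMS)    # if every parameter ran
sparse_flops = forward_flops_per_token(ACTIVE_PARAMS)  # MoE: only routed experts run
compute_ratio = sparse_flops / dense_flops

print(f"Active fraction: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")
print(f"Per-token compute vs. an equally large dense model: {compute_ratio:.1%}")
```

The point of the sketch is the proportionality: per-token compute tracks the active parameter count, not the total, so a model that routes each token to a small subset of experts pays for only that subset.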

Strategic below-cost pricing. Multiple industry sources confirm that several Chinese providers price inference below their fully loaded cost. The logic is captured in what one ByteDance executive reportedly called "the model is the bait" — inference is a loss-leader designed to pull developers onto cloud platforms where the real revenue comes from storage, compute, and higher-margin services. One analysis estimates Chinese AI inference costs run roughly 90% below US equivalents when accounting for all-in infrastructure costs. Only Volcano Engine, ByteDance's cloud arm, has been identified as operating at positive gross margin on model-as-a-service.

[Figure: Bar chart comparing three structural cost factors (electricity, compute per token, and token pricing) between Chinese and US AI providers. Data sources: Reuters, Goldman Sachs, Jefferies research.]

The Contradictions: Cracks in the Foundation

The race-to-the-bottom narrative has significant contradictions that complicate the story and suggest the current pricing regime is not stable.

Alibaba raised prices 34%. On March 18, 2026, Alibaba increased AI computing prices by up to 34%, citing surging demand that outstripped infrastructure capacity. This is the opposite of what you expect in a price war — it suggests that even China's largest cloud providers are finding the current pricing unsustainable at scale. When the company with the most data center capacity in China decides to charge more, not less, the market is sending a signal about cost structure.

Zhipu raised prices 83%. Zhipu AI, backed by Alibaba and Tencent, raised its API prices by 83% in early 2026. In a surprising twist, call volumes rose after the price increase, suggesting that at least some developers value reliability and quality over rock-bottom pricing. This mirrors patterns seen in cloud infrastructure markets where the cheapest provider rarely captures the most enterprise revenue. Zhipu has also stated that the price war will spread internationally — a signal that below-cost pricing is viewed as a competitive weapon, not just a domestic dynamic.

A Tencent executive called tokens "non-sticky." Li Qiang, a vice president at Tencent, described token sales as a "non-sticky business" — meaning customers switch providers freely based on price, making it difficult to build durable competitive advantages. This is an unusually candid assessment from a major platform player, and it undercuts the narrative that below-cost pricing builds lasting market position.

Three providers raised prices in February 2026 alone. That pattern points to pricing the providers could not sustain, not to confidence in the price war.

The Counter-Narrative: Tokens Are Not Fungible

Reuters Breakingviews argued in April 2026 that the "token obsession may be misguided" — and the data supports this view more than the headline pricing suggests.

Quality still commands a premium. Anthropic's Mythos model, accessible exclusively through Project Glasswing to JPMorgan, Amazon, and Microsoft, operates at capability levels that Chinese models have not matched. The exclusivity arrangement exists precisely because some AI work product is not interchangeable — a bank's risk model or a legal document analysis has different requirements than a consumer chatbot. When Anthropic can charge premium prices and restrict access to select clients, the "commoditized inference" thesis hits a meaningful limit.

Hallucination rates diverge significantly. Chinese models hallucinate at rates 3 to 5 times higher than comparable US frontier models, according to benchmark testing. US frontier models hallucinate at under 1% on factual tasks while Chinese leading models cluster at 3 to 5%. For consumer applications — chatbots, content generation, translation — this may be tolerable. For enterprise deployments in healthcare, finance, or legal domains, it is a dealbreaker. Price per token matters less than accuracy per token, and on accuracy, the gap between $0.10 tokens and $3 tokens is real.
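One way to make "accuracy per token" concrete is to ask how costly an error must be before the premium model becomes the cheaper choice in expectation. The sketch below is a simplified model under stated assumptions: it counts input tokens only, takes 1% as the US error rate (the text says "under 1%") and 4% as the midpoint of the 3 to 5% range, and uses a hypothetical task size of 2,000 input tokens.

```python
def expected_cost_per_task(
    input_tokens: int, price_per_m: float, error_rate: float, cost_per_error: float
) -> float:
    """Token cost plus the expected cost of a wrong answer, in USD."""
    token_cost = input_tokens / 1_000_000 * price_per_m
    return token_cost + error_rate * cost_per_error


def break_even_error_cost(
    input_tokens: int,
    cheap_price: float,
    cheap_rate: float,
    premium_price: float,
    premium_rate: float,
) -> float:
    """Cost per error at which both models have the same expected cost."""
    extra_token_cost = input_tokens / 1_000_000 * (premium_price - cheap_price)
    extra_error_rate = cheap_rate - premium_rate
    return extra_token_cost / extra_error_rate


INPUT_TOKENS = 2_000  # hypothetical task size
CHEAP = {"price": 0.10, "rate": 0.04}    # $/M input tokens, assumed midpoint rate
PREMIUM = {"price": 3.00, "rate": 0.01}  # $/M input tokens, assumed upper bound

threshold = break_even_error_cost(
    INPUT_TOKENS, CHEAP["price"], CHEAP["rate"], PREMIUM["price"], PREMIUM["rate"]
)
cheap_at_threshold = expected_cost_per_task(
    INPUT_TOKENS, CHEAP["price"], CHEAP["rate"], threshold
)
premium_at_threshold = expected_cost_per_task(
    INPUT_TOKENS, PREMIUM["price"], PREMIUM["rate"], threshold
)

print(f"Break-even cost per error: ${threshold:.2f}")
```

Under these assumptions the break-even sits at roughly 19 cents per error. A consumer chatbot may never reach that threshold; a mispriced loan or a misread contract clears it by orders of magnitude.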

Chip limitations constrain ceiling performance. Huawei's Ascend chips, while improving, still struggle to match NVIDIA's H200 — and NVIDIA is already two generations ahead with its next platform. The electricity subsidies that reduce costs for data centers using domestic chips simultaneously cap the maximum performance those data centers can achieve. China's cost advantage comes partly from running less capable hardware at lower utilization rates optimized for cost rather than quality. As one analysis noted, China's open-source AI strategy is partly about working around hardware constraints — if you cannot build the best model on the best chips, you build the most efficient model on available chips.

The distinction matters: Chinese models are cheaper for tasks where "good enough" is sufficient. American models remain significantly better for tasks where correctness is non-negotiable. These are different markets with different economics — and the companies burning the most cash are simultaneously the ones most exposed to this quality gap.

Who Survives: Financial Reality Check

The financial data from China's AI startups paints a stark picture of an industry burning cash to buy market share.

Zhipu AI reported 724 million yuan in revenue against 4.7 billion yuan in losses, roughly $680 million in annual losses. Even with significant venture funding, this burn rate is difficult to sustain without a clear path to profitability. At that ratio, Zhipu loses roughly 6.5 yuan for every yuan of revenue it earns.

MiniMax generated $79 million in revenue with a $250 million adjusted net loss. Its M2.5 model operates at a 25.4% gross margin, which is positive but thin — and gross margin does not account for R&D spending, which for frontier model development runs into hundreds of millions annually. MiniMax's position is fragile: positive unit economics at the model level, deeply negative at the company level.
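The Zhipu and MiniMax figures can be put on the same footing by expressing losses per unit of revenue. This is a minimal sketch using only the numbers quoted above; treating spend as revenue plus loss is a simplification, since reported and adjusted net losses include items beyond operating costs.

```python
def loss_per_unit_revenue(revenue: float, loss: float) -> float:
    """Currency units lost for each unit of revenue earned."""
    return loss / revenue


def implied_spend_per_unit_revenue(revenue: float, loss: float) -> float:
    """Simplified spend per unit of revenue, assuming spend = revenue + loss."""
    return (revenue + loss) / revenue


zhipu_loss_ratio = loss_per_unit_revenue(revenue=724e6, loss=4.7e9)    # yuan
minimax_loss_ratio = loss_per_unit_revenue(revenue=79e6, loss=250e6)   # USD
zhipu_spend_ratio = implied_spend_per_unit_revenue(revenue=724e6, loss=4.7e9)

print(f"Zhipu loses {zhipu_loss_ratio:.1f} yuan per yuan of revenue")
print(f"MiniMax loses {minimax_loss_ratio:.1f} dollars per dollar of revenue")
print(f"Zhipu implied spend per yuan of revenue: {zhipu_spend_ratio:.1f}")
```

By this measure MiniMax, at about 3.2 dollars lost per dollar earned, is burning at roughly half Zhipu's relative rate.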

DeepSeek is reportedly raising $3 billion to $4 billion at a $45 billion to $50 billion valuation. The company, covered in our DeepSeek profile (deepseek-ai-profile), has become the symbolic leader of China's AI pricing strategy. The valuation implies investor confidence in long-term platform value, but the fundraising itself signals that current revenue does not cover costs. DeepSeek's pricing (V4 Pro at $0.435/M input tokens and V4 Flash at $0.14/M input tokens) is aggressive but not the most aggressive in the market, suggesting some pricing discipline.

[Figure: Financial sustainability chart showing revenue versus net losses for Zhipu AI, MiniMax, and DeepSeek, with break-even threshold. Data sources: Bloomberg, Reuters, Fortune, Jefferies.]

The consolidation signals are already visible. Smaller providers without cloud platform backing (and thus without the "model as bait" subsidy model) are being squeezed out. The survivors will likely be those attached to major cloud platforms: ByteDance's Volcano Engine, Alibaba Cloud, Tencent Cloud, and possibly Huawei Cloud. Independent model providers without platform revenue will struggle to compete on price while funding continued R&D. Fortune's reporting on the Chinese token economy notes that the startup bloodletting is accelerating consolidation toward the big four platforms.

What This Means for Global AI Economics

The strategic framework behind Chinese pricing is not new. It is the "commoditize your complement" playbook that Microsoft used against Netscape in the 1990s browser wars and that Google used against Apple by making Android free. If inference becomes cheap enough, the value shifts to adjacent layers: cloud infrastructure, application development, data pipelines, and domain-specific fine-tuning. ByteDance's Doubao at $0.11/M input tokens is not priced to make money on tokens — it is priced to make money on everything around the tokens.

MiniMax's CEO has stated that inference costs could fall another order of magnitude in the next one to two years. If that happens, $0.01 per million input tokens becomes the new floor, and the distinction between Chinese and American pricing becomes irrelevant — both approach zero. But here is the paradox for the "model as bait" strategy: when the bait is nearly free, does it still attract anyone to the hook? If inference costs approach zero, the value proposition shifts entirely to platform reliability, data governance, fine-tuning quality, and compliance guarantees — areas where American providers retain structural advantages.

The real question is sustainability. US AI revenue reached approximately $22 billion in 2025, compared to China's $1.8 billion — a 12:1 ratio in America's favor. American companies are building profitable businesses at higher price points. Chinese companies are building unprofitable businesses at lower price points, betting that scale and platform lock-in will eventually generate returns.

Both bets could prove correct. Or both could prove wrong. What is clear is that the current divergence cannot persist indefinitely. Either Chinese companies find a path to profitability at these prices, or prices rise to sustainable levels — in which case the cost advantage narrows significantly. The first scenario requires massive volume growth. The second scenario undermines the entire strategic rationale for below-cost pricing.

Methodology Note

Pricing data from provider documentation as of May 2026. Financial data from Bloomberg, Reuters, Fortune, and Jefferies. All USD figures use approximate exchange rates.

Frequently Asked Questions

Why are Chinese AI models so much cheaper than OpenAI?

Three structural factors drive the gap: government-subsidized electricity for data centers using domestic chips (up to 50% cost reduction), more efficient model architectures using Mixture-of-Experts designs that require less compute per token, and strategic below-cost pricing where inference serves as a loss-leader for cloud platform services. Jefferies estimates the average Chinese model costs one-sixth the US equivalent per token.

Is Chinese AI quality comparable to OpenAI and Anthropic?

For general-purpose tasks (text generation, summarization, translation, basic code assistance), top Chinese models like DeepSeek V4 and Alibaba Qwen approach frontier performance. However, Chinese models hallucinate at rates 3 to 5 times higher than US frontier models in benchmark testing, and the hardware running them (primarily Huawei Ascend chips) lags NVIDIA's current generation by two steps. Quality is "good enough" for consumer applications, but a meaningful gap remains for enterprise-critical deployments.

Will the AI price war continue through 2026?

Signals are mixed. Three Chinese providers raised prices in February 2026 alone, Alibaba followed on March 18 with increases of up to 34%, and Zhipu raised its API prices by 83% in early 2026. These moves suggest that at least some providers are finding current pricing unsustainable. However, new entrants like Xiaomi continue to push prices lower, and ByteDance's Doubao remains at $0.11/M input tokens. The most likely outcome is bifurcation: budget models stay cheap while premium tiers gradually converge toward Western pricing.

What does AI commoditization mean for developers?

Lower inference costs benefit developers directly — applications that were too expensive to run at $15/M output tokens become viable at $0.28/M. The risk is vendor lock-in through platform integration. When the model is cheap but the surrounding cloud services are proprietary, the real cost shifts to data egress, storage, and compute for fine-tuning. Developers should evaluate total cost of ownership, not just token price.
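A simple total-cost comparison shows how quickly platform charges can eat into token savings. The output prices come from the figures above; the monthly volume and every platform line item are hypothetical placeholders that a team would replace with its own quotes.

```python
def monthly_token_cost(output_tokens_m: float, price_per_m: float) -> float:
    """Monthly spend on output tokens in USD (volume given in millions of tokens)."""
    return output_tokens_m * price_per_m


MONTHLY_OUTPUT_TOKENS_M = 500  # hypothetical workload: 500M output tokens per month

premium_cost = monthly_token_cost(MONTHLY_OUTPUT_TOKENS_M, 15.00)
budget_cost = monthly_token_cost(MONTHLY_OUTPUT_TOKENS_M, 0.28)
token_savings = premium_cost - budget_cost

# Hypothetical extra monthly platform charges tied to the budget provider (USD).
platform_extras = {
    "data egress": 900.0,
    "storage": 400.0,
    "fine-tuning compute": 2500.0,
}
net_savings = token_savings - sum(platform_extras.values())

print(f"Token savings: ${token_savings:,.0f} per month")
print(f"Net savings after platform extras: ${net_savings:,.0f} per month")
```

With these placeholder numbers, about $7,360 in monthly token savings shrinks to about $3,560 once the platform charges are counted. The token price alone overstates the benefit by roughly a factor of two.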

Can Chinese AI companies survive at these price points?

The financial data is concerning. Zhipu lost $680 million against modest revenue, and MiniMax lost $250 million. Only ByteDance's Volcano Engine has been identified as operating at positive gross margin on model-as-a-service. Survival will depend on whether companies can transition from selling tokens to selling higher-margin platform services — the "model as bait" strategy. Companies without cloud platform backing face the hardest path.

Should developers switch to Chinese AI models?

For non-critical workloads — content generation, summarization, prototyping, translation — Chinese models like DeepSeek V4 and ByteDance Doubao offer significant cost savings at acceptable quality. For enterprise deployments requiring high accuracy in finance, legal, or healthcare contexts, the 3 to 5x higher hallucination rates and limited long-context reliability remain material risks. Evaluate total cost of ownership (data egress, fine-tuning, compliance) rather than token price alone.


By China Made & Tech Team. Independent publication covering Chinese manufacturing and technology innovation for global audiences.

Related Entries

  • china-ai-robotics-guide — China's AI and robotics industry: the complete picture
  • deepseek-ai-profile — DeepSeek: the Chinese AI lab that shook Silicon Valley
  • china-ai-token-usage-scale — China's 140 trillion daily AI tokens and the infrastructure behind them
  • chinese-ai-models-compared — Chinese AI models compared: DeepSeek vs Qwen vs Yi vs Baichuan