Chinese model providers now sit well below US flagships on price. We measured the gap across 521 live models (US 377 / China 144).
On a typical 3:1 output-to-input blend, US models average about $8.32 per 1M tokens versus roughly $1.43 for Chinese models — about 5.8× apart. These figures come from OpenRouter's live pricing, updated daily; they are not estimates.
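As a rough illustration of how a blended figure like this can be computed, here is a minimal sketch. It assumes the blend is a simple weighted average with three parts output price to one part input price; the source does not spell out the exact formula. The function name `blended_price` and the example per-model prices are hypothetical. Only the two averages ($8.32 and $1.43) come from the text above.

```python
def blended_price(input_price: float, output_price: float,
                  output_weight: float = 3.0, input_weight: float = 1.0) -> float:
    """Weighted average price per 1M tokens (assumed 3:1 output-to-input weighting)."""
    total_weight = input_weight + output_weight
    return (input_weight * input_price + output_weight * output_price) / total_weight


# Hypothetical model priced at $1.00 input and $4.00 output per 1M tokens.
example = blended_price(1.00, 4.00)  # (1 * 1.00 + 3 * 4.00) / 4

# Regional averages quoted in the text, in USD per 1M blended tokens.
us_avg = 8.32
cn_avg = 1.43
ratio = us_avg / cn_avg  # how many times more expensive the US average is

print(f"example blend: ${example:.2f} per 1M tokens")
print(f"US / China ratio: {ratio:.1f}x")
```

Changing `output_weight` and `input_weight` to match your own traffic mix shows how sensitive the comparison is to the assumed blend.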
Chinese providers (DeepSeek, Alibaba's Qwen, Zhipu, Kimi, MiniMax and others) lean into open weights and aggressive pricing. Fierce domestic competition and a focus on inference efficiency push their prices lower still. US flagships charge a premium for capability and ecosystem. For the same class of task, the model you pick can change cost by an order of magnitude.
If your workload is price-sensitive, Chinese models (input as low as $0.01/1M) are often the more economical starting point — but cheaper isn't automatically better. Weigh it against your quality bar, latency and compliance needs. In the price-vs-efficiency map below, up-and-to-the-left is better value; color marks the region.
Each model is plotted by input price (log scale) and composite efficiency. Up and to the left means better value per dollar.
Each point is a model · color = region · click a point to open it.
Prices are live (via OpenRouter, updated daily). This is market analysis, not investment or procurement advice.