LIVE

DEEPSEEK-V4-FL$0.20▼ 26.3%

DEEPSEEK-V4-PR$0.87▲ 4.6%

QWEN3.6-FLASH$1.13▼ 24.9%

NEMOTRON-3-SUP$0.45▼ 23.9%

LLAMA-4-MAVERI$0.60▼ 11.9%

LLAMA-4-SCOUT$0.30▲ 5.8%

GEMINI-3.1-FLA$1.50▼ 10.0%

GEMINI-2.5-FLA$0.40▼ 0.8%

MINIMAX-01$1.10▲ 8.0%

MIMO-V2.5$0.28▼ 5.8%

MIMO-V2.5-PRO$0.87▲ 2.6%

MINIMAX-M3$1.20▼ 6.1%

QWEN3.5-PLUS-2$1.80▼ 6.0%

NOVA-2-LITE-V1$2.50▼ 31.4%

GEMINI-2.5-FLA$2.50▼ 9.5%

GROK-4.3$2.50▼ 8.7%

QWEN3.6-PLUS$1.95▼ 31.6%

NEMOTRON-3-ULT$2.50▼ 25.9%

QWEN3.7-PLUS$1.60▼ 21.4%

MINIMAX-M1$2.20▲ 11.9%

PALMYRA-X5$6.00▼ 26.9%

QWEN3.7-MAX$3.75▼ 3.1%

GEMINI-3.5-FLA$9.00▼ 8.3%

GEMINI-2.5-PRO$10.00▲ 7.2%

GPT-5.4-NANO$1.25▲ 10.0%

NOVA-LITE-V1$0.24▼ 28.9%

KIMI-K2.5$1.90▲ 9.8%

MINISTRAL-14B-$0.20▼ 17.7%

R1 0528 vs R1 Distill Llama 70B

Higher efficiency

Metric              R1 0528      R1 Distill Llama 70B
Input price         $0.50        $0.70
Output price        $2.15        $0.80
Context window      164K         131K
Throughput          162 tok/s    152 tok/s
Availability        99.9%        100.0%
Cost / task         $0.002       $0.002
Efficiency score    89           89

Estimated monthly cost by workload

Workload              DEEPSEEK-R1-05    DEEPSEEK-R1-DI
Chat assistant        $408.00           $306.00
RAG / long context    $793.50           $912.00
Agent / tool use      $1,134            $792.00
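The monthly estimates are consistent with a simple linear model: input tokens times input price plus output tokens times output price. The sketch below is illustrative only. The page does not state the token volumes behind each workload, so the `WORKLOADS` values are assumptions, back-solved so that they reproduce the figures above. It also assumes the output prices are quoted per 1M tokens, as the input prices are.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, taken from the comparison table.
# ASSUMPTION: output prices use the same per-1M-token unit as input prices.
PRICES = {
    "R1 0528": {"input": Decimal("0.50"), "output": Decimal("2.15")},
    "R1 Distill Llama 70B": {"input": Decimal("0.70"), "output": Decimal("0.80")},
}

# ASSUMPTION: monthly volumes in millions of tokens as (input, output).
# These are not given by the source; they were chosen because they
# reproduce the monthly cost figures exactly.
WORKLOADS = {
    "Chat assistant": (300, 120),
    "RAG / long context": (1200, 90),
    "Agent / tool use": (720, 360),
}


def monthly_cost(model: str, workload: str) -> Decimal:
    """Linear cost: input volume * input price + output volume * output price."""
    price = PRICES[model]
    input_m, output_m = WORKLOADS[workload]
    return price["input"] * input_m + price["output"] * output_m


def cheaper_model(workload: str) -> str:
    """Name of the model with the lower estimated monthly cost."""
    return min(PRICES, key=lambda model: monthly_cost(model, workload))


if __name__ == "__main__":
    for workload in WORKLOADS:
        costs = ", ".join(
            f"{model}: ${monthly_cost(model, workload):,.2f}" for model in PRICES
        )
        print(f"{workload}: {costs} (cheaper: {cheaper_model(workload)})")
```

Under these assumed volumes, R1 0528 comes out cheaper only for the input-heavy RAG / long context workload. `Decimal` is used instead of floats so the dollar amounts compare exactly.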

Efficiency score: R1 0528

Across price, speed and reliability, R1 0528 offers the stronger overall balance for most workloads — but the right pick depends on your exact mix of input, output and latency needs.

Figures are illustrative demo data, not financial advice.

Frequently asked questions

Is R1 0528 or R1 Distill Llama 70B cheaper?

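It depends on the workload. R1 0528 has the lower input price ($0.50 vs $0.70 per 1M tokens), but its output price is higher ($2.15 vs $0.80). In the monthly estimates above, R1 0528 is cheaper for RAG / long context ($793.50 vs $912.00), while R1 Distill Llama 70B is cheaper for chat assistant ($306.00 vs $408.00) and agent / tool use ($792.00 vs $1,134). Figures are illustrative demo data.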
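One way to make the input/output trade-off concrete is a break-even ratio: the number of output tokens per input token at which both models cost the same. The sketch below is a rough illustration, assuming linear per-token pricing and that the output prices are per 1M tokens like the input prices. The function name is hypothetical and not part of any pricing API.

```python
from fractions import Fraction

# (input price, output price) in USD per 1M tokens, from the comparison table.
# ASSUMPTION: output prices share the per-1M-token unit of the input prices.
R1_0528 = (Fraction("0.50"), Fraction("2.15"))
R1_DISTILL = (Fraction("0.70"), Fraction("0.80"))


def break_even_output_ratio(a, b):
    """Output/input token ratio at which models a and b cost the same.

    Expects a to have the lower input price and the higher output price.
    Below the returned ratio a is cheaper; above it b is cheaper.
    """
    (a_in, a_out), (b_in, b_out) = a, b
    # Solve a_in*I + a_out*O == b_in*I + b_out*O for O/I.
    return (b_in - a_in) / (a_out - b_out)


if __name__ == "__main__":
    ratio = break_even_output_ratio(R1_0528, R1_DISTILL)
    print(f"break-even output/input ratio: {ratio} (about {float(ratio):.3f})")
```

With these prices the ratio is 4/27, roughly 0.148. R1 0528 is the cheaper option only when a workload produces fewer than about 15 output tokens per 100 input tokens.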

Which should I choose, R1 0528 or R1 Distill Llama 70B?

Across price, speed and reliability, R1 0528 offers the stronger overall balance for most workloads — but the right pick depends on your exact mix of input, output and latency needs.