Granite 4.1 8B vs GLM 4 32B

Metric                         Granite 4.1 8B    GLM 4 32B
Input price (per 1M tokens)    $0.05             $0.10
Output price                   $0.10             $0.10
Context window                 131K              128K
Throughput                     157 tok/s         162 tok/s
Availability                   100.0%            100.0%
Cost / task                    $0.000            $0.000
Efficiency score               89                89

Estimated monthly cost by workload

Workload              Granite 4.1 8B    GLM 4 32B
Chat assistant        $27.00            $42.00
RAG / long context    $69.00            $129.00
Agent / tool use      $72.00            $108.00
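The monthly estimates above are consistent with a simple linear model: input tokens times input price plus output tokens times output price. The sketch below illustrates that arithmetic. It assumes both prices are in USD per 1M tokens (the page states this unit only for the input price), and the monthly token volumes are hypothetical values back-solved from the published figures, not numbers the page provides.

```python
# Listed prices from the comparison, assumed to be USD per 1M tokens
# for both input and output (the unit is stated only for input).
PRICES = {
    "Granite 4.1 8B": {"input": 0.05, "output": 0.10},
    "GLM 4 32B": {"input": 0.10, "output": 0.10},
}

# Hypothetical monthly volumes in millions of tokens: (input, output).
# These are NOT published on the page; they are back-solved so that a
# linear price model reproduces the monthly estimates.
WORKLOADS = {
    "Chat assistant": (300, 120),
    "RAG / long context": (1200, 90),
    "Agent / tool use": (720, 360),
}


def monthly_cost(model, input_mtok, output_mtok):
    """Linear monthly cost in USD, rounded to cents."""
    price = PRICES[model]
    total = price["input"] * input_mtok + price["output"] * output_mtok
    return round(total, 2)


for workload, (input_mtok, output_mtok) in WORKLOADS.items():
    costs = {m: monthly_cost(m, input_mtok, output_mtok) for m in PRICES}
    print(workload, costs)
```

Under these assumed volumes the model returns $27 / $42, $69 / $129 and $72 / $108, matching the estimates. Because the output price is the same for both models, the gap between them comes entirely from input tokens.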

Efficiency score: Granite 4.1 8B

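Granite 4.1 8B offers the stronger overall balance for most workloads: its input price is half that of GLM 4 32B ($0.05 vs $0.10 per 1M tokens) and its estimated monthly cost is lower for all three workloads. GLM 4 32B is slightly faster (162 vs 157 tok/s), while availability and efficiency score are tied. The right pick still depends on your exact mix of input, output and latency needs.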

Figures are illustrative demo data, not financial advice.

Frequently asked questions

Is Granite 4.1 8B or GLM 4 32B cheaper?

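Granite 4.1 8B has the lower input price ($0.05 vs $0.10 per 1M tokens), while the listed output price is the same for both ($0.10). For any workload that sends input tokens it is therefore the cheaper of the two, and the saving grows with the share of input tokens. Figures are illustrative demo data.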
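To see how much the input-price gap matters for a given traffic mix, the relative saving can be computed directly. This is a minimal sketch, assuming both prices are per 1M tokens; the function name and the token ratios are hypothetical and chosen only for illustration.

```python
def relative_saving(input_mtok, output_mtok,
                    cheap_in=0.05, dear_in=0.10, out_price=0.10):
    """Fraction of spend saved by picking the lower input price.

    Volumes are in millions of tokens; prices are assumed to be
    USD per 1M tokens, with the same output price for both models.
    """
    cheap = cheap_in * input_mtok + out_price * output_mtok
    dear = dear_in * input_mtok + out_price * output_mtok
    return (dear - cheap) / dear


# Input-heavy traffic (hypothetical 10:1 input-to-output ratio)
print(f"{relative_saving(1000, 100):.1%}")   # 45.5%
# Output-heavy traffic (hypothetical 1:10 ratio)
print(f"{relative_saving(100, 1000):.1%}")   # 4.5%
```

With equal output prices the saving approaches 50% for input-dominated workloads and shrinks toward zero as output tokens dominate.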

Which should I choose, Granite 4.1 8B or GLM 4 32B?

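Granite 4.1 8B is the better default for most workloads: it costs less on input, has a slightly larger context window (131K vs 128K), and matches GLM 4 32B on availability. GLM 4 32B has marginally higher throughput (162 vs 157 tok/s), so it may suit cases where speed matters more than cost. The right pick depends on your exact mix of input, output and latency needs.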