
@cf/meta/llama-3.3-70b-instruct-fp8-fast vs granite-4.1-8b

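| Metric | @cf/meta/llama-3.3-70b-instruct-fp8-fast | granite-4.1-8b |
| --- | --- | --- |
| Input price | $0.00 | $0.05 |
| Output price | $0.00 | $0.10 |
| Context window | 128K | 131K |
| Throughput | 138 tok/s | 157 tok/s |
| Availability | 97.9% | 100.0% |
| Cost / task | $0.000 | $0.000 |
| Efficiency score | 89 | 89 |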

Estimated monthly cost by workload

| Workload | @cf/meta/llama-3.3-70b-instruct-fp8-fast | granite-4.1-8b |
| --- | --- | --- |
| Chat assistant | $0.00 | $27.00 |
| RAG / long context | $0.00 | $69.00 |
| Agent / tool use | $0.00 | $72.00 |
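The monthly estimates above can be approximated as token volume times per-token price. The page does not publish the volumes behind its figures, so the sketch below uses hypothetical monthly volumes chosen only because they happen to reproduce the listed granite-4.1-8b costs; they are assumptions, not the page's actual inputs. It also assumes both input and output prices are quoted per 1M tokens, which the page states only for the input price. The names `Pricing`, `monthly_cost` and `WORKLOADS` are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens (unit assumed)


def monthly_cost(pricing: Pricing, input_millions: float, output_millions: float) -> float:
    """Return the estimated monthly cost in USD, rounded to cents."""
    cost = (input_millions * pricing.input_per_million
            + output_millions * pricing.output_per_million)
    return round(cost, 2)


# Prices taken from the comparison on this page (illustrative demo data).
LLAMA = Pricing(input_per_million=0.00, output_per_million=0.00)
GRANITE = Pricing(input_per_million=0.05, output_per_million=0.10)

# HYPOTHETICAL monthly volumes in millions of tokens: (input, output).
# These are assumptions made for this example, not values from the page.
WORKLOADS = {
    "Chat assistant": (300, 120),
    "RAG / long context": (1200, 90),
    "Agent / tool use": (900, 270),
}

if __name__ == "__main__":
    for name, (inp, out) in WORKLOADS.items():
        llama = monthly_cost(LLAMA, inp, out)
        granite = monthly_cost(GRANITE, inp, out)
        print(f"{name}: llama ${llama:.2f}, granite ${granite:.2f}")
```

Swapping in your own measured token volumes gives a more useful estimate than any preset workload, since the input-to-output ratio differs sharply between chat, retrieval and agent traffic.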

Efficiency score: @cf/meta/llama-3.3-70b-instruct-fp8-fast

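@cf/meta/llama-3.3-70b-instruct-fp8-fast offers the stronger overall balance for most workloads, mainly on price: $0.00 vs $0.05 for input and $0.00 vs $0.10 for output. granite-4.1-8b is ahead on throughput (157 vs 138 tok/s) and availability (100.0% vs 97.9%), and both models have an efficiency score of 89, so the right pick depends on your exact mix of input, output and latency needs.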

Figures are illustrative demo data, not financial advice.

Frequently asked questions

Is @cf/meta/llama-3.3-70b-instruct-fp8-fast or granite-4.1-8b cheaper?

@cf/meta/llama-3.3-70b-instruct-fp8-fast has the lower input price ($0.00 vs $0.05 per 1M tokens) and the lower output price ($0.00 vs $0.10), so for most blended workloads it is the more cost-effective of the two. Figures are illustrative demo data.
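One way to make "blended" concrete is a single weighted price per 1M tokens. The sketch below is a minimal illustration, assuming a workload where 75% of tokens are input and 25% are output; that split and the function name `blended_price` are assumptions for the example, not something the page defines.

```python
def blended_price(input_price: float, output_price: float, input_share: float = 0.75) -> float:
    """Weighted average price per 1M tokens for a given share of input tokens.

    input_share is the fraction of all tokens that are input tokens (0.0 to 1.0).
    The 0.75 default is an assumed split, not a figure from the page.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0.0 and 1.0")
    return input_share * input_price + (1.0 - input_share) * output_price


if __name__ == "__main__":
    # Prices from the comparison on this page (illustrative demo data).
    llama = blended_price(0.00, 0.00)
    granite = blended_price(0.05, 0.10)
    print(f"llama blended:   ${llama:.4f} per 1M tokens")
    print(f"granite blended: ${granite:.4f} per 1M tokens")
```

Output-heavy workloads push the blended figure toward the output price, so the gap between the two models widens as `input_share` falls.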

Which should I choose, @cf/meta/llama-3.3-70b-instruct-fp8-fast or granite-4.1-8b?

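For most workloads, @cf/meta/llama-3.3-70b-instruct-fp8-fast offers the stronger overall balance because it is cheaper on both input and output. If throughput or availability matters more than cost, granite-4.1-8b leads on both (157 vs 138 tok/s and 100.0% vs 97.9%). The right pick depends on your exact mix of input, output and latency needs.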