Full LLM API Price Comparison

Side-by-side pricing for every major AI model. Sort by cost, filter by provider, adjust for your usage.


Which LLM API is Cheapest?

For high-volume production workloads, Gemini 1.5 Flash ($0.075 input / $0.30 output per 1M tokens) and Mistral Small ($0.10 / $0.30) are the cheapest capable models. From OpenAI, GPT-4o-mini ($0.15 / $0.60) offers excellent quality for the price. Claude 3.5 Haiku ($0.80 / $4.00) is the cheapest Anthropic option.
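To see how these per-token prices translate into a monthly bill, here is a minimal Python sketch. The prices are copied from the paragraph above; the function names (`monthly_cost`, `rank_by_cost`) and the example traffic of 5M input and 1M output tokens per day are illustrative assumptions, not part of any provider's API.

```python
# Prices in USD per 1M tokens as (input, output), taken from the text above.
# They change often, so treat them as a snapshot rather than a reference.
PRICES = {
    "Gemini 1.5 Flash": (0.075, 0.30),
    "Mistral Small": (0.10, 0.30),
    "GPT-4o-mini": (0.15, 0.60),
    "Claude 3.5 Haiku": (0.80, 4.00),
}


def monthly_cost(model, input_tokens_per_day, output_tokens_per_day, days=30):
    """Return the list-price cost in USD over `days` for a steady daily volume."""
    in_price, out_price = PRICES[model]
    daily = (input_tokens_per_day * in_price
             + output_tokens_per_day * out_price) / 1_000_000
    return daily * days


def rank_by_cost(input_tokens_per_day, output_tokens_per_day, days=30):
    """Return (model, cost) pairs sorted from cheapest to most expensive."""
    costs = {
        model: monthly_cost(model, input_tokens_per_day, output_tokens_per_day, days)
        for model in PRICES
    }
    return sorted(costs.items(), key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical workload: 5M input and 1M output tokens per day.
    for model, cost in rank_by_cost(5_000_000, 1_000_000):
        print(f"{model:<18} ${cost:,.2f}")
```

Because output tokens usually cost several times more than input tokens, the ranking can shift with your input/output mix, so it is worth plugging in your own ratio.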

Price Trends

LLM prices have dropped ~80% per year since 2023. GPT-4-level quality that cost $100/1M tokens in 2023 is now available for $2-3/1M. Expect continued price cuts, especially as Gemini and open-source models (Llama) put competitive pressure on OpenAI and Anthropic.

Total Cost of Ownership

Per-token price is not the only cost. Also consider latency (slower models increase infrastructure costs), accuracy (cheap models may need more retries), rate limits (higher tiers cost more), and network egress. A model that is half the price per token but needs 1.5× as many requests costs 0.5 × 1.5 = 0.75 of the original, so you save only 25%, not 50%.
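The retry arithmetic can be written as a small helper. This is only a sketch: it assumes every request costs about the same and ignores latency, rate-limit tiers, and egress. The names `effective_savings` and `break_even_request_ratio` are made up for illustration.

```python
def effective_savings(price_ratio, request_ratio):
    """Fraction of spend saved by switching models.

    price_ratio:   new per-request price / old per-request price (0.5 = half price)
    request_ratio: new request count / old request count (1.5 = 50% more requests)
    """
    return 1.0 - price_ratio * request_ratio


def break_even_request_ratio(price_ratio):
    """Request multiplier at which the cheaper model stops saving money."""
    return 1.0 / price_ratio


if __name__ == "__main__":
    # The example from the text: half the price, 1.5x as many requests.
    print(f"Savings: {effective_savings(0.5, 1.5):.0%}")
    print(f"Break-even request ratio: {break_even_request_ratio(0.5):.1f}x")
```

A negative result means the "cheaper" model ends up costing more once the extra requests are counted.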

FAQ

Is GPT-4o or Claude 3.5 Sonnet better value?

They're priced similarly: $2.50 vs $3.00 per 1M input tokens and $10 vs $15 per 1M output tokens, with GPT-4o the cheaper of the two on both. GPT-4o has a slight edge on coding and multimodal tasks; Claude 3.5 Sonnet is preferred for long-document analysis and nuanced writing. For most applications performance is comparable, so test on your specific task before committing.

Are open-source models like Llama free to use?

The weights are free to download, but you pay for the compute to run them. Through hosted inference providers (Together.ai, Fireworks, Groq), Llama 3.1 70B costs roughly $0.59 input / $0.79 output per 1M tokens. Running it yourself requires a multi-GPU server (~$1-3/hour on AWS), which you pay for whether or not it is busy. Below about 10M tokens/day, hosted APIs are almost always cheaper than self-hosting.
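A rough break-even estimate shows why low volumes favor hosted APIs. The sketch below uses the figures from the answer above and makes several simplifying assumptions: the server runs 24 hours a day, traffic is split 50/50 between input and output tokens, and the server can actually sustain the required throughput. Engineering and operations time are not counted. The function names are hypothetical.

```python
def blended_price(in_price, out_price, input_share=0.5):
    """Average USD per 1M tokens for a given share of input tokens (assumed 50%)."""
    return in_price * input_share + out_price * (1.0 - input_share)


def break_even_tokens_per_day(server_usd_per_hour, price_per_million, hours_per_day=24):
    """Daily token volume at which an always-on server costs the same as a hosted API."""
    daily_server_cost = server_usd_per_hour * hours_per_day
    return daily_server_cost / price_per_million * 1_000_000


if __name__ == "__main__":
    hosted = blended_price(0.59, 0.79)  # hosted Llama 3.1 70B prices from the text
    for hourly in (1.0, 3.0):           # the ~$1-3/hour server range from the text
        tokens = break_even_tokens_per_day(hourly, hosted)
        print(f"${hourly:.0f}/hour server breaks even at ~{tokens / 1e6:.0f}M tokens/day")
```

Under these assumptions even the cheapest server in that range breaks even well above 10M tokens/day, which is consistent with the rule of thumb in the answer.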

Why does this table show different costs than provider websites?

We show standard list prices, with no volume discounts, prompt caching, or batch pricing applied. Enterprise customers, high-volume users, and batch API users pay less. Always verify current prices on the official provider pages, because they change frequently.