Calculate AI API Costs
Across All Major LLMs

Enter your usage — get instant monthly cost estimates for GPT-4o, Claude, Gemini, Mistral, and more.

✓ GPT-4o & GPT-4o-mini ✓ Claude 3.5 & Haiku ✓ Gemini 1.5 Pro & Flash ✓ Mistral & Llama

How AI API Pricing Works

Most LLM APIs charge per token, a unit roughly equal to 4 characters or 0.75 words of English text. You pay separately for input tokens (your prompt + context) and output tokens (the model's reply). Prices are quoted per 1 million tokens.
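The 4-characters-per-token rule of thumb can be turned into a quick estimator. This is a minimal sketch, not a real tokenizer: the function name estimate_tokens and the constant CHARS_PER_TOKEN are illustrative, and actual counts depend on the model's tokenizer and on the language of the text.

```python
import math

# Rule of thumb from the text: about 4 characters per token.
# Assumption: English-like text; real tokenizers will differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count only."""
    if not text:
        return 0
    return math.ceil(len(text) / CHARS_PER_TOKEN)


prompt = "Summarize the following support ticket in two sentences."
print(f"Estimated tokens: {estimate_tokens(prompt)}")
```

For billing-grade numbers, count tokens with the provider's own tokenizer; the estimate above is only useful for ballpark planning.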

Current Pricing Reference (June 2026)

Model                Input $/1M    Output $/1M
GPT-4o               $2.50         $10.00
GPT-4o-mini          $0.15         $0.60
Claude 3.5 Sonnet    $3.00         $15.00
Claude 3.5 Haiku     $0.80         $4.00
Gemini 1.5 Pro       $1.25         $5.00
Gemini 1.5 Flash     $0.075        $0.30
Mistral Large        $2.00         $6.00
Mistral Small        $0.10         $0.30

Input vs Output: Which Costs More?

For the models in the pricing reference above, output tokens cost 3–5× more than input tokens. For chatbots and other workloads with short prompts and long replies, outputs dominate cost. For classification or extraction tasks where you pass long documents and get short answers, input tokens dominate. This calculator accounts for both separately.
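To see which side dominates for a given workload, you can compute the share of a request's cost that comes from output tokens. The sketch below uses the GPT-4o prices from the pricing reference; the function name output_share and the two example workloads are illustrative assumptions, not measurements.

```python
def output_share(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Return the fraction of one request's cost that comes from output tokens."""
    input_cost = input_tokens * input_price_per_1m / 1_000_000
    output_cost = output_tokens * output_price_per_1m / 1_000_000
    total = input_cost + output_cost
    return output_cost / total if total else 0.0


# GPT-4o prices from the pricing reference ($ per 1M tokens).
GPT4O_IN, GPT4O_OUT = 2.50, 10.00

# Hypothetical workloads: a short-prompt chat turn vs. a long-document extraction.
chat = output_share(200, 400, GPT4O_IN, GPT4O_OUT)
extraction = output_share(6000, 100, GPT4O_IN, GPT4O_OUT)

print(f"Chat turn: {chat:.0%} of cost is output")
print(f"Extraction: {extraction:.0%} of cost is output")
```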

Reducing API Costs

The biggest levers: (1) Use a smaller model for simple tasks: GPT-4o-mini at $0.15/1M input vs GPT-4o at $2.50/1M is roughly a 17× price difference, and the smaller model is often good enough for simple tasks. (2) Shorten your system prompts, since every token in every request costs money. (3) Use caching: Anthropic and OpenAI both offer prompt caching that can cut the cost of repeated input context by roughly 50-90%.
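The effect of prompt caching on input cost can be estimated with a small helper. This is a sketch under stated assumptions: cached_discount is a free parameter (the fraction taken off the input price for cached tokens), the function names are illustrative, and cache-write surcharges or expiry rules that some providers apply are not modeled.

```python
def input_cost(cached_tokens, fresh_tokens, input_price_per_1m, cached_discount=0.0):
    """Input cost of one request when some tokens are billed at a cached rate.

    cached_discount is the fraction removed from the price of cached tokens
    (0.5 means half price). Assumption: no extra fee for writing the cache.
    """
    billable = cached_tokens * (1 - cached_discount) + fresh_tokens
    return billable * input_price_per_1m / 1_000_000


def caching_savings(cached_tokens, fresh_tokens, input_price_per_1m, cached_discount):
    """Fraction of input cost saved by caching, relative to no caching."""
    baseline = input_cost(cached_tokens, fresh_tokens, input_price_per_1m)
    with_cache = input_cost(cached_tokens, fresh_tokens, input_price_per_1m, cached_discount)
    return 1 - with_cache / baseline if baseline else 0.0


# Hypothetical request: 2,000-token repeated system prompt plus 100 new tokens,
# priced at $3.00 per 1M input tokens.
for discount in (0.5, 0.9):
    saved = caching_savings(2000, 100, 3.00, discount)
    print(f"Discount {discount:.0%}: input cost reduced by {saved:.1%}")
```

Note that the overall saving is always smaller than the discount itself, because the fresh tokens and all output tokens are still billed at the full rate.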

Frequently Asked Questions

How is LLM API cost calculated?

Cost = (input_tokens × input_price_per_1M / 1,000,000) + (output_tokens × output_price_per_1M / 1,000,000). This calculator multiplies by your daily request count and the selected period (day/month/year).
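The formula above translates directly into code. The sketch below is one possible implementation, not the calculator's actual source: the function names and the PERIOD_DAYS table (a 30-day month and a 365-day year) are assumptions made for illustration.

```python
# Assumption: a "month" is 30 days and a "year" is 365 days.
PERIOD_DAYS = {"day": 1, "month": 30, "year": 365}


def request_cost(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Cost in dollars of a single request, per the formula in the text."""
    return (input_tokens * input_price_per_1m / 1_000_000
            + output_tokens * output_price_per_1m / 1_000_000)


def period_cost(input_tokens, output_tokens, input_price_per_1m,
                output_price_per_1m, requests_per_day, period="month"):
    """Scale the per-request cost by daily volume and the chosen period."""
    per_request = request_cost(input_tokens, output_tokens,
                               input_price_per_1m, output_price_per_1m)
    return per_request * requests_per_day * PERIOD_DAYS[period]


# Hypothetical workload: 500 input + 300 output tokens per request,
# 1,000 requests per day, at GPT-4o prices ($2.50 / $10.00 per 1M).
monthly = period_cost(500, 300, 2.50, 10.00, requests_per_day=1000, period="month")
print(f"Estimated monthly cost: ${monthly:,.2f}")
```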

Which LLM API is cheapest in 2026?

For high-volume applications: Gemini 1.5 Flash ($0.075 input / $0.30 output per 1M) and Mistral Small ($0.10/$0.30) are the cheapest capable models. For quality-critical use cases: GPT-4o-mini offers excellent quality at $0.15/$0.60 per 1M tokens.
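Which model is cheapest depends on your input/output mix, so it can help to rank models for a specific workload. The sketch below hard-codes the prices from the pricing reference on this page; the PRICES dictionary and the rank_models function are illustrative names, and the example workload is hypothetical.

```python
# Prices in $ per 1M tokens (input, output), copied from the pricing reference.
PRICES = {
    "GPT-4o": (2.50, 10.00),
    "GPT-4o-mini": (0.15, 0.60),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Claude 3.5 Haiku": (0.80, 4.00),
    "Gemini 1.5 Pro": (1.25, 5.00),
    "Gemini 1.5 Flash": (0.075, 0.30),
    "Mistral Large": (2.00, 6.00),
    "Mistral Small": (0.10, 0.30),
}


def rank_models(input_tokens, output_tokens, prices=PRICES):
    """Return (model, cost_per_request) pairs sorted from cheapest to priciest."""
    costs = {
        model: (input_tokens * p_in + output_tokens * p_out) / 1_000_000
        for model, (p_in, p_out) in prices.items()
    }
    return sorted(costs.items(), key=lambda item: item[1])


# Hypothetical workload: 1,000 input tokens and 500 output tokens per request.
for model, cost in rank_models(1000, 500):
    print(f"{model:<20} ${cost:.6f} per request")
```

Price is only one axis; the ranking says nothing about output quality, so treat it as a shortlist rather than a recommendation.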

How many tokens is a typical ChatGPT message?

A short user message: ~50-150 tokens. A detailed prompt with instructions: 300-800 tokens. A long document for analysis: 2,000-8,000 tokens. A typical response: 200-500 tokens. Use our token counter to measure your specific prompts.

Does context window size affect cost?

Yes. Every token in the conversation history counts as input tokens on each request. A chatbot that carries 10 messages of context before each reply might send 2,000 tokens of history + 100 new tokens — the history dominates the cost. This is why context management is critical for production apps.
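The way history inflates input cost can be made concrete by summing the input tokens sent over a whole conversation. This is a simplified sketch: it assumes every turn resends the full history with no truncation, summarization, or caching, and the function name and per-message token counts are illustrative.

```python
def cumulative_input_tokens(turns, user_tokens, reply_tokens):
    """Total input tokens sent over a conversation that resends full history.

    Assumption: each request contains all earlier user messages and replies
    plus the new user message, with no trimming or summarization.
    """
    total = 0
    history = 0
    for _ in range(turns):
        total += history + user_tokens          # this request's input
        history += user_tokens + reply_tokens   # history grows after the reply
    return total


# Hypothetical chat: 100-token user messages and 200-token replies.
for turns in (1, 5, 10, 20):
    sent = cumulative_input_tokens(turns, user_tokens=100, reply_tokens=200)
    print(f"{turns:>2} turns: {sent:,} input tokens in total")
```

Total input grows roughly with the square of the number of turns, which is why trimming or summarizing old messages pays off quickly in long conversations.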

Are these prices up to date?

We update pricing when providers announce changes. LLM prices have been dropping roughly 80% per year since 2023. Check provider pricing pages for the absolute latest: OpenAI, Anthropic, Google.