OpenAI API Pricing (2026)
| Model | Input $/1M tokens | Output $/1M tokens | Best for |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | High-quality reasoning, vision |
| GPT-4o-mini | $0.15 | $0.60 | High-volume, cost-sensitive |
| GPT-4-turbo | $10.00 | $30.00 | Legacy high-quality tasks |
| GPT-3.5-turbo | $0.50 | $1.50 | Simple tasks, fast responses |
| o1-preview | $15.00 | $60.00 | Complex multi-step reasoning |
| o1-mini | $3.00 | $12.00 | Reasoning at moderate cost |
GPT-4o vs GPT-4o-mini: Which Should You Use?
GPT-4o-mini is 16× cheaper than GPT-4o on input and 17× on output. For most production use cases — classification, summarization, structured extraction, basic Q&A — GPT-4o-mini matches or exceeds GPT-4o quality at a fraction of the cost. Use GPT-4o for complex reasoning, coding, multimodal (vision) tasks.
Cost Optimization Tips
Prompt caching: OpenAI caches repeated prompt prefixes, reducing input costs by up to 50% for requests sharing a system prompt. Batching: The Batch API reduces costs by 50% with a 24-hour turnaround. Context management: Truncate conversation history — every message in history is billed as input tokens.
FAQ
How does OpenAI count tokens?
OpenAI uses the tiktoken tokenizer. Roughly 1 token = 4 characters or 0.75 words in English. You can count tokens exactly using our token counter tool or the tiktoken Python library: tiktoken.encoding_for_model("gpt-4o").
Are there free tiers for OpenAI API?
OpenAI doesn't offer a permanent free tier. New accounts receive a small credit ($5-18 depending on region) that expires after a few months. After that, all usage is billed per token. Unlike the ChatGPT web product, the API requires a paid plan.
Does the system prompt count as input tokens?
Yes. The system prompt is sent with every API call and billed as input tokens each time. A 1,000-token system prompt on 10,000 daily requests = 10M extra input tokens per day. Keeping system prompts concise or using prompt caching significantly reduces costs.