Understanding LLM Tokens
LLMs don't process text character by character — they use tokens, which are chunks of text. The exact size varies: common words are usually 1 token, rare or long words might be 2-4 tokens, and special characters/code often tokenize differently.
Token Estimation Rules of Thumb
English prose: ~4 characters per token or ~0.75 words per token
Code: ~3-4 characters per token (symbols and brackets are often individual tokens)
Numbers: Each digit is often 1 token; large numbers can be 2-4 tokens
Non-English: CJK characters are often 2-3 tokens each; European languages similar to English
Context Window Limits
Most current models support 128K-200K token context windows. Claude supports 200K. GPT-4o supports 128K. You pay for every token in your context window on each request — long conversations get expensive quickly.
Reducing Token Usage
Practical optimizations: (1) Be concise in system prompts — every word costs money at scale. (2) Use structured output formats (JSON) rather than verbose descriptions. (3) Truncate conversation history — only keep the last N turns. (4) Compress retrieved documents before including them in context.
FAQ
How accurate is this token counter?
This tool estimates tokens using a 4-characters-per-token heuristic, which is accurate to within ~10% for typical English text. For exact counts, use the official tokenizers: tiktoken for OpenAI models, anthropic.tokenizer for Claude. The heuristic is less accurate for code, non-English text, and text with many special characters.
Does context window size affect cost?
Yes. Every token in your context window (system prompt + conversation history + current message) is billed as input tokens on every request. A 200-token system prompt costs 200 tokens × number of daily requests. For 10,000 daily requests, that's 2M extra input tokens per day.
What's the maximum context window for each model?
GPT-4o: 128,000 tokens. GPT-4o-mini: 128,000 tokens. Claude 3.5 Sonnet: 200,000 tokens. Gemini 1.5 Pro: 2,000,000 tokens. Gemini 1.5 Flash: 1,000,000 tokens. Mistral Large: 32,000 tokens. Llama 3.1 70B: 128,000 tokens.