Reference guide for all supported LLM providers and their current rates.
The Token Validator tokenizes entirely client-side, so counts are fast and work offline. All costs are calculated with the cl100k_base tokenizer as a proxy: most providers (OpenAI, Anthropic, DeepSeek, etc.) use this or a similar byte-pair-encoding scheme, but tokenizers do differ between vendors, so treat these figures as estimates for budget planning rather than exact billing amounts.
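The cost arithmetic itself is simple: tokens in each direction times the per-million-token rate. Below is a minimal, dependency-free sketch of that calculation. A rough ~4-characters-per-token heuristic stands in for the real cl100k_base tokenizer (which would normally come from a library such as tiktoken), and the rates and model names are illustrative placeholders, not actual provider pricing.

```python
# Sketch of per-request cost estimation (illustrative only).
# A ~4 chars/token heuristic replaces cl100k_base so this runs with no
# dependencies; the rate table below is hypothetical, NOT real pricing.

ILLUSTRATIVE_RATES = {
    # model name: (input USD per 1M tokens, output USD per 1M tokens)
    "example-model-a": (3.00, 15.00),
    "example-model-b": (0.50, 1.50),
}

def approx_tokens(text: str) -> int:
    """Very rough token estimate (~4 characters per token for English)."""
    return max(1, len(text) // 4)

def estimate_cost(model: str, prompt: str, expected_output_tokens: int) -> float:
    """Estimate request cost in USD from the prompt and expected output size."""
    rate_in, rate_out = ILLUSTRATIVE_RATES[model]
    input_tokens = approx_tokens(prompt)
    return (input_tokens * rate_in
            + expected_output_tokens * rate_out) / 1_000_000

# Example: a 1300-character prompt (~325 tokens) with ~500 output tokens.
cost = estimate_cost("example-model-a", "Hello, world!" * 100, 500)
print(f"${cost:.6f}")  # → $0.008475
```

Because output tokens are typically several times more expensive than input tokens, the expected-output estimate usually dominates the result.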
| Provider | Primary Model Tiers | Pricing Page |
|---|---|---|
| Anthropic | Claude 4.5, 3.7 Sonnet, 3.5 Haiku | Official Pricing ↗ |
| OpenAI | GPT-5, GPT-4o, o1-preview | Official Pricing ↗ |
| DeepSeek | V3, Reasoner, Chat | Official Pricing ↗ |
| xAI (Grok) | Grok-4, Grok-3-mini | Official Pricing ↗ |
| Google Gemini | Gemini 2.5 Pro, 2.5 Flash | Official Pricing ↗ |
| OpenRouter (Aggregator) | Meta, Qwen, Mistral, 01.AI | Official Pricing ↗ |
| AWS Bedrock | Claude 3.x, Llama 4, Titan | Official Pricing ↗ |
| Vertex AI | Gemini 1.5, 2.0, 2.5 | Official Pricing ↗ |
| Fireworks AI | Llama 4, Qwen 2.5 (Fast) | Official Pricing ↗ |
| Together AI | Llama 3.1, Mixtral, Qwen | Official Pricing ↗ |
| Cerebras (Ultra-Fast) | Llama 3.1 70B, 8B | Official Pricing ↗ |
| Perplexity | Sonar Pro, Sonar | Official Pricing ↗ |
| Groq | Llama 3.3, Mixtral 8x7B | Official Pricing ↗ |
| Mistral | Mistral Large 2, Pixtral | Official Pricing ↗ |
| Cohere | Command R+, Command R7B | Official Pricing ↗ |
| Alibaba DashScope | Qwen-Max, Qwen-Plus | Official Pricing ↗ |
| Reka AI | Reka Core, Reka Flash | Official Pricing ↗ |
| Upstage | Solar Pro 2 | Official Pricing ↗ |
| Lambda Labs | Hermes 3, Llama 3.x | Official Pricing ↗ |
| Meta (Llama) | Llama 4, 3.1 | Official Pricing ↗ |
| NVIDIA | Nemotron 4, NVLM 1.0 | Official Pricing ↗ |