Calculate and compare the cost of using large language model APIs. Enter your token usage, request volume, and caching preferences to estimate daily, monthly, and annual spend across major providers.
| Metric | Value |
|---|---|
| Cost per request | $0.006000 |
| Daily cost | $0.60 |
| Monthly cost | $18.00 |
| Annual cost | $219.00 |
| Requests per $1 | 166 |
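The figures above can be reproduced with simple arithmetic. The sketch below assumes a usage profile of 1,000 input and 500 output tokens per request at 100 requests per day, priced at GPT-4.1's listed rates ($2.00 / $8.00 per 1M tokens); these inputs are assumptions chosen to match the displayed totals, not values stated in the text.

```python
INPUT_RATE = 2.00   # $ per 1M input tokens (GPT-4.1, from the table below)
OUTPUT_RATE = 8.00  # $ per 1M output tokens

def cost_per_request(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request: each direction is billed per million tokens."""
    return (input_tokens / 1_000_000) * INPUT_RATE \
         + (output_tokens / 1_000_000) * OUTPUT_RATE

per_request = cost_per_request(1_000, 500)   # 0.006
daily = per_request * 100                    # 0.60 at 100 requests/day
monthly = daily * 30                         # 18.00 (30-day month)
annual = daily * 365                         # 219.00
requests_per_dollar = int(1 / per_request)   # 166

print(f"${per_request:.6f}/request, ${monthly:.2f}/month, ${annual:.2f}/year")
```

Changing any of the assumed inputs (token counts, request volume, or model rates) scales every downstream figure proportionally.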
Chart: Monthly cost for the same usage across each provider's top model
Chart: Input vs output token costs per request
| Model | Input/1M | Output/1M | Context | Monthly |
|---|---|---|---|---|
| GPT-4.1 | $2.00 | $8.00 | 1M | $18.00 |
| GPT-4o | $2.50 | $10.00 | 128K | $22.50 |
| o3 | $2.00 | $8.00 | 200K | $18.00 |
| o4-mini | $0.55 | $2.20 | 200K | $4.95 |
| GPT-4.1-mini | $0.40 | $1.60 | 1M | $3.60 |
| GPT-4.1-nano | $0.10 | $0.40 | 1M | $0.90 |
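The "Monthly" column can be recomputed from the listed per-1M rates by holding the usage profile constant across models. The profile below (1,000 input + 500 output tokens per request, 100 requests/day, 30-day month) is an assumption that reproduces the table's numbers:

```python
MODELS = {  # (input $/1M, output $/1M), as listed in the table above
    "GPT-4.1":      (2.00, 8.00),
    "GPT-4o":       (2.50, 10.00),
    "o3":           (2.00, 8.00),
    "o4-mini":      (0.55, 2.20),
    "GPT-4.1-mini": (0.40, 1.60),
    "GPT-4.1-nano": (0.10, 0.40),
}

REQUESTS_PER_MONTH = 100 * 30
IN_TOK, OUT_TOK = 1_000, 500

def monthly_cost(in_rate: float, out_rate: float) -> float:
    """Monthly spend for the fixed usage profile at the given per-1M rates."""
    per_request = (IN_TOK / 1e6) * in_rate + (OUT_TOK / 1e6) * out_rate
    return per_request * REQUESTS_PER_MONTH

for name, (in_rate, out_rate) in MODELS.items():
    print(f"{name:>13}: ${monthly_cost(in_rate, out_rate):.2f}/month")
```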
LLM API costs are based on token usage — the number of input tokens (your prompts) and output tokens (model responses) per request. Prices vary significantly across providers and models, with larger models costing more per token but often producing higher quality results. Caching and batch processing can substantially reduce costs for high-volume applications.
Pricing data is approximate and may not reflect the latest changes. Always verify current rates on provider pricing pages before making purchasing decisions.
Input tokens are the text you send to the API (your prompt), while output tokens are the text the model generates in response. Pricing differs for each direction — output tokens are typically 3-5x more expensive than input tokens.
Prompt caching stores frequently used prompt prefixes so you don't pay full price to resend them. Cached input tokens cost 50-90% less than regular input tokens, making it ideal for applications with repeated system prompts or context.
The Batch API is ideal for non-time-sensitive workloads like data processing, content generation, or evaluation tasks. It typically offers a 50% discount in exchange for longer processing times (up to 24 hours).
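The caching and batch savings described above can be sketched as blended rates. The specific numbers here are assumptions for illustration: a 75% cached-token discount (within the 50-90% range quoted) and the typical 50% batch discount.

```python
def effective_input_rate(base_rate: float, cached_fraction: float,
                         cache_discount: float = 0.75) -> float:
    """Blend full-price and cached input pricing by the fraction of tokens cached."""
    cached_rate = base_rate * (1 - cache_discount)
    return cached_fraction * cached_rate + (1 - cached_fraction) * base_rate

# A request whose cached system prompt makes up 80% of its input tokens:
blended = effective_input_rate(2.00, cached_fraction=0.8)  # $0.80 per 1M vs $2.00

# Routing the same workload through a batch endpoint halves the remaining cost:
batch_rate = blended * 0.5  # $0.40 per 1M
```

Stacking both discounts, the effective input rate in this example drops to a fifth of the list price, which is why high-volume applications lean heavily on cached prefixes and batch queues.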
A rough rule of thumb is that 1 token is approximately 4 characters or 0.75 words in English. A 500-word prompt is roughly 670 tokens. Most API providers offer tokenizer tools to get exact counts.
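The rule of thumb above reduces to one line of arithmetic. This is only an estimate; exact counts require the provider's tokenizer:

```python
def estimate_tokens_from_words(words: int) -> int:
    """Rough English token estimate: ~0.75 words per token."""
    return round(words / 0.75)

estimate_tokens_from_words(500)  # ≈ 667, matching the ~670 quoted above
```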
AI Image Generation Cost Calculator
Compare the cost of generating images with DALL-E, Midjourney, Flux, Stability AI, Google Imagen, Ideogram, and Adobe Firefly. Calculate per-image and monthly costs across providers.
AI Scaling Cost Estimator
Project how AI generation costs scale as your volume grows. Estimate monthly costs at different volume levels and plan your budget for scaling AI content production.