Anthropic Claude API Pricing (March 2026): Every Model, Every Tier

Opus 4.6 costs $5/$25 per million tokens. Sonnet 4.6 costs $3/$15. Haiku 3 costs $0.25/$1.25. Full breakdown of batch, caching, long-context, and fast mode pricing.

March 5, 2026 · 1 min read

Current Model Pricing (March 2026)

All prices per million tokens (MTok). Verified against Anthropic's official pricing page on March 5, 2026.

$5 / $25
Opus 4.6 (input / output per MTok)
$3 / $15
Sonnet 4.6 (input / output per MTok)
$1 / $5
Haiku 4.5 (input / output per MTok)
$0.25 / $1.25
Haiku 3 (input / output per MTok)
ModelInput (per MTok)Output (per MTok)Status
Claude Opus 4.6$5.00$25.00Current flagship
Claude Opus 4.5$5.00$25.00Active
Claude Opus 4.1$15.00$75.00Active (legacy pricing)
Claude Sonnet 4.6$3.00$15.00Current mid-tier
Claude Sonnet 4.5$3.00$15.00Active
Claude Sonnet 4$3.00$15.00Active
Claude Haiku 4.5$1.00$5.00Current budget
Claude Haiku 3.5$0.80$4.00Active
Claude Haiku 3$0.25$1.25Active (cheapest)

Legacy Pricing Gap

Opus 4.1 and earlier Opus models cost $15/$75 per MTok, which is 3x the input and 3x the output price of Opus 4.6. If you are still using Opus 4.1 or Opus 4 for production workloads, switching to Opus 4.6 cuts your bill by 67% with no capability regression.

The Sonnet line has been price-stable since Claude 3: every version from Sonnet 3.7 through Sonnet 4.6 costs $3/$15. Haiku pricing varies more. Haiku 3 at $0.25/$1.25 is 4x cheaper than Haiku 4.5 at $1/$5, but Haiku 4.5 handles substantially more complex tasks.

Batch Processing Discount

The Batch API provides a flat 50% discount on all token costs. You submit requests asynchronously, and Anthropic processes them within a 24-hour window.

ModelBatch Input (per MTok)Batch Output (per MTok)
Claude Opus 4.6$2.50$12.50
Claude Sonnet 4.6$1.50$7.50
Claude Haiku 4.5$0.50$2.50
Claude Haiku 3$0.125$0.625

Haiku 3 via Batch API costs $0.125 per million input tokens. That is 40x cheaper than standard Opus 4.6 input pricing. For workloads that tolerate 24-hour turnaround (data processing, bulk classification, offline analysis), batch Haiku is the cheapest way to use Claude.

Prompt Caching

Prompt caching stores repeated portions of your prompt so subsequent requests read from cache at 10% of the base input price. Two cache durations exist.

OperationMultiplierDurationBreak-even
5-minute cache write1.25x base input5 minutes1 cache hit
1-hour cache write2x base input1 hour2 cache hits
Cache hit (read)0.1x base inputSame as writeImmediate savings

Concrete example: a 10,000-token system prompt on Opus 4.6. Standard cost per request: $0.05 input. With 5-minute caching, the first request costs $0.0625 (write), and every subsequent request within 5 minutes costs $0.005 (hit). After one cache hit, you are saving money. After 10 requests, you have spent $0.1075 instead of $0.50.

Automatic Caching

Add a single cache_control field at the top level of your request, and Anthropic automatically manages cache breakpoints as conversations grow. For fine-grained control, place cache_control on individual content blocks.

Long Context Pricing

Claude Opus 4.6, Sonnet 4.6, Sonnet 4.5, and Sonnet 4 support a 1M token context window (beta, Tier 4+ organizations). Requests exceeding 200K input tokens are charged at premium rates.

ModelStandard (≤200K)Long Context (>200K)
Opus 4.6 Input$5.00 / MTok$10.00 / MTok
Opus 4.6 Output$25.00 / MTok$37.50 / MTok
Sonnet 4.6/4.5/4 Input$3.00 / MTok$6.00 / MTok
Sonnet 4.6/4.5/4 Output$15.00 / MTok$22.50 / MTok

The 200K threshold is based on total input tokens, including cache reads and writes. Output token count does not affect which tier applies. If your total input hits 200,001 tokens, the entire request uses premium pricing, not just the overflow.

Check Your API Response

Sum input_tokens + cache_creation_input_tokens + cache_read_input_tokens in the response usage object. If the total exceeds 200,000, you were billed at long-context rates.

Fast Mode (Opus 4.6 Only)

$30 / MTok
Fast Mode Input (6x standard)
$150 / MTok
Fast Mode Output (6x standard)

Fast mode is a research preview that provides significantly faster output from Opus 4.6 at 6x the standard rate. The tradeoff is straightforward: $30/$150 instead of $5/$25. Fast mode includes the full 1M context window with no additional long-context surcharge.

Fast mode stacks with prompt caching and data residency multipliers, but is not available with the Batch API. If you need both speed and cost efficiency, standard Sonnet 4.6 at $3/$15 may be a better fit than Opus fast mode at $30/$150.

Third-Party Platform Pricing

Claude models are available on AWS Bedrock, Google Vertex AI, and Microsoft Foundry. Starting with Claude Sonnet 4.5 and Haiku 4.5, these platforms offer two endpoint types:

Endpoint TypeRoutingPrice Premium
GlobalDynamic routing across regionsStandard pricing
RegionalData stays in specific region+10% premium

The Claude API (direct from Anthropic) is global-only and uses standard pricing. The 10% regional premium only applies on third-party platforms. Earlier models (Sonnet 4, Opus 4, and prior) keep their existing pricing structure on all platforms.

Cost Optimization Guide

The four discount mechanisms stack multiplicatively. Here is how to layer them for maximum savings.

Batch + Prompt Caching on Haiku 3: $0.0125 per million cached input tokens.

ConfigurationOpus 4.6Sonnet 4.6Haiku 3
Standard$5.00$3.00$0.25
Batch only$2.50$1.50$0.125
Cache hit only$0.50$0.30$0.025
Batch + cache hit$0.25$0.15$0.0125

Opus 4.6 with batch processing and prompt caching costs $0.25 per million cached input tokens. That is the same base price as standard Haiku 3 input. For offline workloads with repeated context (document analysis, code review across a repository), this combination makes Opus-level reasoning accessible at Haiku-level pricing.

Right-size Your Model

Use Haiku for classification, extraction, and simple generation. Use Sonnet for code generation, summarization, and multi-step reasoning. Use Opus only for tasks where Sonnet measurably underperforms. Morph routes between models automatically based on task complexity, so you pay Haiku prices for easy tasks and Opus prices only when the task demands it.

FAQ

How much does the Claude API cost?

Opus 4.6: $5/$25 per MTok. Sonnet 4.6: $3/$15. Haiku 4.5: $1/$5. Haiku 3: $0.25/$1.25. All prices are per million tokens. Batch processing gives 50% off across all models.

What is the cheapest Claude API model?

Claude Haiku 3 at $0.25 input / $1.25 output per million tokens. With batch processing, that drops to $0.125/$0.625. Haiku 3 handles classification, extraction, and simple generation well.

How does prompt caching save money?

Cache hits cost 10% of the standard input price. A 5-minute cache write costs 1.25x, so after just one cache hit you break even. For applications with consistent system prompts, caching cuts input costs by up to 90%.

What is Claude fast mode?

Fast mode for Opus 4.6 costs $30/$150 per MTok (6x standard). It provides faster output and includes the full 1M context window with no long-context surcharge. Available as a research preview.

Does Claude charge more for long context?

Yes. Requests exceeding 200K input tokens use premium pricing: 2x input, 1.5x output. This applies to Opus 4.6, Sonnet 4.6, 4.5, and 4. Requests under 200K use standard rates even with 1M context enabled.

Is Claude cheaper than OpenAI?

Per-token, OpenAI's GPT-5 ($1.25/$10) and o3 ($2/$8) are cheaper than Claude Sonnet 4.6 ($3/$15) and Opus 4.6 ($5/$25). But effective cost depends on tokens consumed per task, and Claude often completes coding tasks in fewer turns. See our full pricing comparison.

Do discounts stack?

Yes. Batch (50% off), prompt caching (90% off cache hits), long-context premium, and data residency all stack multiplicatively. The cheapest possible Opus input cost is $0.25 per MTok (batch + cache hit).

What are token limits for each model?

Opus 4.6 and Sonnet 4.6: 200K default context, 1M beta. Haiku 4.5: 200K. Haiku 3: 200K. Output limits vary by model. Maximum output for Opus 4.6 is 128K tokens per response.

Route to the right Claude model automatically

Morph selects between Opus, Sonnet, and Haiku based on task complexity. You pay Haiku prices for easy tasks, Opus prices only when the problem demands it.