Developers building on the Anthropic API face the same decision on every new feature: Sonnet or Haiku? Sonnet 4.5 is 3.75x more expensive than Haiku 4.5, but handles tasks Haiku cannot. Most teams hardcode one model and accept the tradeoff. The teams with the best cost-to-quality ratio use both and route each prompt to the right one automatically.
The Models
Claude Sonnet 4.5 and Haiku 4.5 are the two most-used Claude models in production. They serve different roles in the Anthropic lineup.
Sonnet 4.5 is the workhorse. It handles multi-step reasoning, code generation, complex analysis, and tasks that require holding multiple constraints in working memory. Most teams treat Sonnet as their default model because it covers the widest range of use cases at acceptable cost.
Haiku 4.5 is the speed model. It processes simple tasks at roughly 3x Sonnet's speed and 3.75x lower cost. For classification, extraction, formatting, and routing decisions, Haiku produces equivalent results to Sonnet because these tasks don't exercise the reasoning capacity that separates the models.
Opus 4.5 sits at the top of the lineup. It exists for problems that Sonnet gets wrong: novel problem solving, architecture decisions that require reasoning through many constraints simultaneously, and multi-file refactoring across tightly coupled systems. Most teams never route more than 5-10% of their prompts to Opus.
The model hierarchy
Think of the Claude models as a cost-quality ladder. Haiku covers the bottom 60% of tasks (simple, fast, cheap). Sonnet covers the next 30% (complex, balanced). Opus covers the top 10% (hardest problems only). Picking one model for everything means overpaying on easy tasks or underperforming on hard ones.
Pricing Comparison
The pricing difference between models is consistent: each tier costs roughly 3.75-5x more than the one below it.
| Model | Input price | Output price | Relative cost |
|---|---|---|---|
| Haiku 4.5 | $0.80 | $4.00 | 1x (baseline) |
| Sonnet 4.5 | $3.00 | $15.00 | 3.75x Haiku |
| Opus 4.5 | $15.00 | $75.00 | 18.75x Haiku / 5x Sonnet |
Consider a concrete example: a coding agent session with 50 prompts, each averaging 2,000 input tokens and 1,000 output tokens. Running everything on Sonnet costs approximately $0.30 for input and $0.75 for output, totaling $1.05. The same session on Haiku costs $0.08 input and $0.20 output, totaling $0.28. That is a 73% reduction, but only if Haiku can handle every prompt without quality loss. It can't. Some prompts need Sonnet. The question is which ones.
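The arithmetic behind those figures is easy to check. A small sketch (`PRICES` comes from the pricing table above; `sessionCost` is a throwaway helper, not part of any SDK):

```typescript
// Per-million-token prices from the pricing table above
const PRICES = {
  haiku: { input: 0.8, output: 4.0 },
  sonnet: { input: 3.0, output: 15.0 },
};

// Dollar cost of a session, rounded to cents
function sessionCost(
  prompts: number,
  inputTokens: number,
  outputTokens: number,
  price: { input: number; output: number },
): number {
  const cost =
    (prompts * inputTokens * price.input) / 1_000_000 +
    (prompts * outputTokens * price.output) / 1_000_000;
  return Math.round(cost * 100) / 100;
}

const sonnetCost = sessionCost(50, 2000, 1000, PRICES.sonnet); // 1.05
const haikuCost = sessionCost(50, 2000, 1000, PRICES.haiku);   // 0.28
const savings = Math.round((1 - haikuCost / sonnetCost) * 100); // 73
```

The same helper works for any prompt mix, which is useful when estimating what a routed split would cost before committing to it.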
When to Use Haiku
Haiku wins on tasks where the bottleneck is speed or cost, not reasoning depth. These tasks have clear inputs, well-defined outputs, and don't require the model to reason through multiple steps or hold complex state.
Classification
Sentiment analysis, intent detection, category assignment. Haiku classifies as accurately as Sonnet because these tasks depend on pattern matching, not reasoning chains.
Simple extraction
Pull a name, date, email, or phone number from text. Extract a JSON field from a document. Parse structured data from semi-structured input.
Routing decisions
Yes/no, pick-one, binary classification. 'Is this a support ticket or a sales inquiry?' Haiku answers these instantly at a fraction of Sonnet's cost.
Formatting and templating
Convert JSON to markdown. Apply a template to structured data. Reformat output for a different consumer. These are mechanical transformations, not reasoning tasks.
High-volume pipelines
Processing 10,000 documents per hour for metadata extraction. At $0.80/M input tokens, Haiku makes high-volume processing economically viable.
Latency-sensitive paths
User-facing responses where sub-second latency matters. Haiku's speed advantage over Sonnet is roughly 3x, which compounds in multi-turn interactions.
When to Use Sonnet
Sonnet is the right choice when the task requires the model to reason through multiple steps, hold context across a complex problem, or produce output where errors have real consequences.
Multi-step reasoning
Problems that require chaining 3+ logical steps. 'Read this code, understand the data flow, identify the race condition, and propose a fix.' Haiku loses coherence on chains this long.
Code generation and review
Writing functions, classes, or modules from a spec. Reviewing code for bugs, security issues, or architecture problems. Sonnet's code output is measurably more correct than Haiku's.
Complex analysis
Analyzing a codebase to understand architecture. Comparing multiple approaches with tradeoffs. Synthesizing information from multiple sources into a coherent recommendation.
Creative writing with nuance
Documentation that needs to be precise and clear. API descriptions. Technical blog posts. Sonnet produces more accurate, better-structured prose than Haiku on complex topics.
Error-expensive tasks
Any task where a wrong answer costs more than the price difference between models. A billing calculation, a security review, a database migration plan. Spend the extra tokens on Sonnet.
Long-context reasoning
Tasks that require attending to information spread across a large input. Haiku handles the context window mechanically but doesn't reason about distant relationships as well as Sonnet.
When to Use Opus
Opus 4.5 costs 5x Sonnet and 18.75x Haiku. It is not a default model for anything. It is the model you reach for when Sonnet fails or when the stakes justify the cost premium.
Novel problem solving
Problems the model hasn't seen patterns for. Unusual edge cases, domain-specific reasoning, or tasks that require genuine creative problem solving rather than pattern application.
Multi-file refactoring
Refactoring across 10+ tightly coupled files where changes in one file cascade to others. Opus holds the full dependency graph in working memory better than Sonnet.
Architecture design
Designing a system from scratch with multiple competing constraints: performance, cost, maintainability, security. Opus reasons through tradeoffs more thoroughly.
When Sonnet gets it wrong
The most practical trigger. If Sonnet produces an incorrect or shallow result, retry with Opus. This is cheaper than always using Opus and catches the cases where Sonnet's reasoning falls short.
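That retry trigger is straightforward to wire up. A minimal sketch of the escalation pattern, assuming the model IDs used elsewhere in this article; `runTask` and `verify` are hypothetical stand-ins for your actual API call and output validation:

```typescript
// Escalation sketch: try Sonnet first, and pay the Opus premium only
// when the cheaper attempt fails a caller-supplied check.
// `runTask` and `verify` are stand-ins, not a real SDK surface.
async function withEscalation(
  runTask: (model: string) => Promise<string>,
  verify: (output: string) => boolean,
): Promise<{ model: string; output: string }> {
  const sonnet = "claude-sonnet-4-5-20250929";
  const opus = "claude-opus-4-5-20250929";

  const first = await runTask(sonnet);
  if (verify(first)) return { model: sonnet, output: first };

  // Sonnet's answer failed the check → retry on Opus
  const second = await runTask(opus);
  return { model: opus, output: second };
}
```

Because Sonnet passes the check on most prompts, the Opus premium is paid only on the failures, which keeps the blended cost close to all-Sonnet.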
The Real Question: Why Choose Manually?
A coding agent session might send 50 prompts. Roughly 30 are simple: add a comment, rename a variable, run a test, format output. About 15 are medium: write a function, fix a bug, refactor a method. And 5 are hard: design a system, debug a race condition, plan a migration.
Picking the model per-prompt manually is impractical. No developer wants to evaluate each prompt's complexity and switch models mid-session. The result is that most teams hardcode one model.
Hardcoding Sonnet wastes money. You pay 3.75x the Haiku price on 60% of prompts that Haiku would handle identically. For teams sending thousands of API calls per day, this adds up to significant unnecessary spend.
Hardcoding Haiku sacrifices quality. The 40% of prompts that need Sonnet's reasoning produce noticeably worse results. Code has more bugs. Analysis is shallower. Multi-step tasks lose coherence. The cost savings come at the expense of output quality on every non-trivial task.
This is what a model router solves: automatic per-prompt model selection based on task complexity.
Automatic Routing with Morph Router
Morph Router classifies each prompt by complexity and routes it to the appropriate Claude model. The classification takes approximately 430ms and costs $0.001 per request. It has been trained on millions of coding prompts.
| Complexity | Routed to | Example tasks |
|---|---|---|
| Easy | Haiku 4.5 | Add a comment, rename a variable, format output, simple extraction |
| Medium | Sonnet 4.5 | Write a function, fix a bug, refactor a method, code review |
| Hard | Sonnet 4.5 / Opus 4.5 | System design, race condition debugging, multi-file refactoring |
| Needs info | Asks for clarification | Ambiguous prompts where the right model depends on missing context |
Two routing modes are available. Balanced mode optimizes for cost savings while preserving quality on medium and hard tasks. Aggressive mode routes more prompts to Haiku, maximizing savings at the risk of slightly lower quality on borderline medium-complexity tasks.
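The classifier itself is a trained model, but the decision it feeds can be pictured as a lookup from complexity and mode to a model ID. A sketch of that mapping — the tables below are illustrative, not Morph's actual internals, and the model IDs follow this article's examples:

```typescript
type Complexity = "easy" | "medium" | "hard" | "needs_info";
type Mode = "balanced" | "aggressive";

const HAIKU = "claude-haiku-4-5-20250929";
const SONNET = "claude-sonnet-4-5-20250929";
const OPUS = "claude-opus-4-5-20250929";

// Illustrative mapping only. Aggressive mode pushes borderline
// medium prompts down to Haiku; `null` means "ask the user for
// clarification" instead of calling a model.
const ROUTES: Record<Mode, Record<Complexity, string | null>> = {
  balanced:   { easy: HAIKU, medium: SONNET, hard: SONNET, needs_info: null },
  aggressive: { easy: HAIKU, medium: HAIKU,  hard: OPUS,   needs_info: null },
};

function pickModel(complexity: Complexity, mode: Mode): string | null {
  return ROUTES[mode][complexity];
}
```

Seen this way, the hard part of routing is the classification, not the dispatch: once a prompt is labeled, model selection is a constant-time lookup.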
Cost Savings Breakdown
Take a typical 50-prompt coding agent session. Each prompt averages 2,000 input tokens and 1,000 output tokens.
| Strategy | Input cost | Output cost | Routing cost | Total |
|---|---|---|---|---|
| All Sonnet | $0.30 | $0.75 | $0 | $1.05 |
| All Haiku | $0.08 | $0.20 | $0 | $0.28 |
| Routed (balanced) | $0.17 | $0.42 | $0.05 | $0.64 |
The routed session costs $0.64 compared to $1.05 for all-Sonnet, a 39% reduction. The 30 easy prompts run on Haiku ($0.048 input + $0.12 output). The 15 medium prompts run on Sonnet ($0.09 input + $0.225 output). The 5 hard prompts run on Sonnet ($0.03 input + $0.075 output). Routing overhead adds $0.05 (50 requests at $0.001 each).
The quality difference on the 30 easy prompts is zero: Haiku and Sonnet produce identical results for classification, formatting, and simple extraction. The quality on the 20 medium and hard prompts is preserved because they still run on Sonnet.
At scale, this matters. A team running 1,000 sessions per day saves roughly $410/day, or about $12,300/month, by switching from all-Sonnet to routed. The routing cost at that volume is $50/day.
Code Example
Integration takes a few lines. Instead of hardcoding a model name, call the router before each API request.
Using Morph Router with the Anthropic SDK
```typescript
import Anthropic from "@anthropic-ai/sdk";
import Morph from "morphllm";

const anthropic = new Anthropic();
const morph = new Morph({ apiKey: process.env.MORPH_API_KEY });

async function chat(userQuery: string) {
  // Router picks the model based on task complexity:
  //   easy   → Haiku (fast, cheap)
  //   medium → Sonnet (balanced)
  //   hard   → Sonnet/Opus (full power)
  const { model } = await morph.routers.anthropic.selectModel({
    input: userQuery,
    mode: "balanced",
  });

  // Use the selected model for the actual API call
  const response = await anthropic.messages.create({
    model, // e.g. "claude-haiku-4-5-20250929" or "claude-sonnet-4-5-20250929"
    max_tokens: 4096,
    messages: [{ role: "user", content: userQuery }],
  });
  return response;
}

// Simple task → router picks Haiku
await chat("Rename the variable 'x' to 'userId'");

// Complex task → router picks Sonnet
await chat("Debug this race condition in the connection pool");

// Very hard task → router picks Opus (in aggressive mode)
await chat("Design a distributed task queue with exactly-once delivery");
```

The router returns a model identifier compatible with the Anthropic SDK. No changes to the rest of your code are needed. You can also access the classification directly if you want to log it or apply custom routing logic.
Accessing the classification directly
```typescript
const result = await morph.routers.anthropic.selectModel({
  input: userQuery,
  mode: "balanced",
});

console.log(result.model);      // e.g. "claude-haiku-4-5-20250929"
console.log(result.complexity); // "easy" | "medium" | "hard" | "needs_info"
console.log(result.confidence); // 0.0 - 1.0

// Custom routing logic
if (result.complexity === "hard" && sensitiveOperation) {
  // Override to Opus for high-stakes tasks
  result.model = "claude-opus-4-5-20250929";
}
```

Frequently Asked Questions
What is the difference between Claude Sonnet and Claude Haiku?
Sonnet 4.5 is Anthropic's balanced model for complex tasks: multi-step reasoning, code generation, and analysis. It costs $3 input / $15 output per million tokens. Haiku 4.5 is the speed model for simple tasks: classification, extraction, formatting. It costs $0.80 input / $4 output per million tokens. Sonnet is 3.75x more expensive but handles tasks Haiku cannot.
When should I use Haiku instead of Sonnet?
Use Haiku for classification, simple extraction (pull a name from text), routing decisions (yes/no, pick-one), formatting and templating, high-volume low-complexity pipelines, and any task where latency matters more than reasoning depth. Haiku handles these at the same quality as Sonnet at 3.75x lower cost.
When should I use Sonnet instead of Haiku?
Use Sonnet for multi-step reasoning, code generation and review, complex analysis, creative writing that needs nuance, and any task where errors are expensive. Sonnet's additional reasoning capacity produces measurably better results on these tasks.
How much cheaper is Haiku than Sonnet?
Haiku is 3.75x cheaper than Sonnet on both input and output tokens. For a coding agent session with 50 prompts, routing 60% of them to Haiku saves 40-60% compared to running everything on Sonnet.
What is a model router and how does it help?
A model router classifies each prompt by complexity and routes it to the appropriate model automatically. Morph Router classifies prompts as easy, medium, hard, or needs_info in approximately 430ms at $0.001 per request. Easy prompts go to Haiku. Medium prompts go to Sonnet. Hard prompts go to Sonnet or Opus. This saves 40-60% compared to all-Sonnet with no quality loss on complex tasks.
When should I use Claude Opus instead of Sonnet?
Use Opus for problems Sonnet gets wrong: novel problem solving, multi-file refactoring across tightly coupled codebases, and architecture decisions requiring many simultaneous constraints. Opus costs $15 input / $75 output per million tokens, which is 5x Sonnet. Most teams need Opus for fewer than 10% of their prompts.
Can I use both Sonnet and Haiku in the same application?
Yes. The most cost-effective approach uses both models with a router that picks per-prompt. In a typical coding agent session, 60% of prompts are simple (Haiku), 30% are medium (Sonnet), and 10% are hard (Sonnet/Opus). Morph Router automates this selection at $0.001 per request.
Related Resources
Stop Hardcoding Model Names
Morph Router automatically picks Haiku, Sonnet, or Opus for each prompt based on task complexity. ~430ms classification, $0.001/request, 40-60% cost savings. Trained on millions of coding prompts.