Compacting Agent History
Most compaction methods trade accuracy for compression. Task-aware compaction breaks the tradeoff.
LongCodeBench · Long Code QA · 108 repos · contexts up to 1M tokens
How much can you compress without losing accuracy?
[Chart: accuracy vs. compression ratio on LongCodeBench. Flash-Compact reaches 14.84× compression at 58.71% accuracy, above the full-context baseline. Comparison methods: Full Context, RAG, Selective-Ctx, LongCodeZip, LLMLingua2, Claude Code, Codex.]
Compacting Tool Calls
Agents solve problems faster when they operate on cleaner context.
Claude Opus 4.6 · SWE-bench Pro · same pass rate
Fewer tokens per task, fewer rounds to solve
[Chart: total tokens per task and agent rounds, with vs. without Flash Compact.]
Does Compaction Make Sense For You?
Plug in your agent's numbers. The math does the rest.
Tokens per step (agent response + user input + tool calls): 1K–20K
Compaction trigger: 10–90% of the context window (50% = 100K)
Fixed: 200K context window · 5× compression · 100 steps
No Compaction: $52.13 (500K final context)
With Compaction: $23.98 (via Gemini Flash)
Saved: $28.14 (54% reduction)
[Charts: context size over steps and cumulative cost over steps, with vs. without compaction.]
Assumptions
Claude Opus 4.6: $15/M input (fresh), $1.5/M input (cached), $75/M output
Gemini Flash: $0.15/M input, $0.6/M output
5× compression ratio · 1000 output tokens/turn
200K context window · 100 fixed steps
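Under these assumptions, the savings estimate can be reproduced with a simple cost model. This is a sketch, not the calculator's published logic: the per-step token mix (4K input + 1K output) and the cache-invalidation behavior after compaction are our assumptions.

```python
# Simplified cost model for the calculator above. Pricing constants come
# from the stated assumptions; the token-growth and caching model is a guess.

OPUS_IN_FRESH = 15 / 1e6    # $/token, uncached input
OPUS_IN_CACHED = 1.5 / 1e6  # $/token, cached input
OPUS_OUT = 75 / 1e6         # $/token, output
FLASH_IN = 0.15 / 1e6       # Gemini Flash input
FLASH_OUT = 0.6 / 1e6       # Gemini Flash output

def run(steps=100, new_input=4000, output=1000,
        window=200_000, trigger=0.5, ratio=5, compact=True):
    context, cost = 0, 0.0
    for _ in range(steps):
        if compact and context >= trigger * window:
            # Gemini Flash reads the full context, emits a 5x-smaller summary.
            cost += context * FLASH_IN + (context // ratio) * FLASH_OUT
            context //= ratio
            # Assumption: compaction invalidates the prompt cache,
            # so the next Opus call pays fresh-input rates on the summary.
            cost += context * OPUS_IN_FRESH
        else:
            cost += context * OPUS_IN_CACHED  # prior context hits the cache
        cost += new_input * OPUS_IN_FRESH + output * OPUS_OUT
        context += new_input + output
    return context, cost

final_ctx, base = run(compact=False)
_, compacted = run(compact=True)
print(f"no compaction: {final_ctx:,} tokens, ${base:.2f}")
print(f"with compaction: ${compacted:.2f} (saved {1 - compacted/base:.0%})")
```

With these defaults the model lands in the same ballpark as the numbers above (~$50 uncompacted, 500K final context); the residual gap comes from the guessed token mix.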
Cut your agent's token bill
Morph Compaction is available as an API. Drop it into any agent framework in under 10 lines.
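The integration point looks roughly like this. Everything here is a hypothetical sketch: `compact()` is a local stand-in for what would be a call to the Morph Compaction API (whose real endpoint and schema are not shown in this document), and the token count is a crude chars/4 estimate.

```python
# Hypothetical agent loop showing where compaction slots in.
# `compact()` is a stand-in, NOT the real Morph API surface.

def compact(messages):
    """Stand-in: replace old messages with one compacted summary message.
    A real integration would call the Morph Compaction API here."""
    summary = {"role": "user",
               "content": f"[compacted summary of {len(messages)} messages]"}
    return [messages[0], summary]  # keep the system prompt, swap in summary

def agent_loop(llm, user_goal, max_tokens=100_000, steps=10):
    messages = [{"role": "system", "content": "You are a coding agent."},
                {"role": "user", "content": user_goal}]
    for _ in range(steps):
        # Rough token estimate: ~4 chars per token.
        if sum(len(m["content"]) // 4 for m in messages) > max_tokens:
            messages = compact(messages)  # the only line you add
        messages.append({"role": "assistant", "content": llm(messages)})
    return messages
```

The loop stays unchanged except for the threshold check and the single `compact()` call; the model keeps seeing a short, clean history instead of the full transcript.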