TL;DR: Cursor removed its native apply model. We built an MCP server that brings FastApply (10,500 tok/sec) and Warp-Grep back to Cursor. Result: ~35% faster end-to-end edits compared to search & replace, higher accuracy, lower token costs.
One config file. No workflow changes.
The Problem
Cursor shipped with a fast apply model. Then they removed it.
Now every code edit goes through the main model—slow, expensive, prone to hallucination on large files. Without a specialized apply model, edits take longer and token costs spike when the model rewrites entire files for single-line changes.
The core issue: No specialized apply model means using a reasoning model for a merging task. It's like using GPT-4 to run git merge.
The Decision: MCP + FastApply
Model Context Protocol lets you plug external tools into Cursor. We built an MCP server that:
- Intercepts file edits before they hit Cursor's default flow
- Routes to FastApply (our 10,500 tok/sec merge model)
- Returns merged code in 1-3 seconds for most files
Plus: a Warp-Grep sub-agent for semantic code search. Think Cognition's SWE-grep, but available to everyone.
Why Warp-Grep matters:
- Cursor's default grep misses context across file boundaries
- Warp-Grep uses a small planning model to execute multi-step searches
- Returns semantically ranked results, not just text matches
Installation (60 Seconds)
Add to ~/.cursor/mcp.json:
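A sketch of the entry (the package name and env var are assumptions; check Morph's MCP docs for the exact values):

```json
{
  "mcpServers": {
    "morph-mcp": {
      "command": "npx",
      "args": ["@morph-llm/morph-fast-apply"],
      "env": { "MORPH_API_KEY": "sk-your-key-here" }
    }
  }
}
```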
Restart Cursor. Done.
Get your API key: morphllm.com/dashboard/api-keys
Results: Measured Performance
Based on our benchmarks comparing Morph vs search & replace across real codebases.
End-to-End Performance
| Metric | Result |
|---|---|
| Apply latency | 1-3 seconds |
| End-to-end speed improvement | ~35% faster |
| Average speed improvement | ~46% vs search & replace |
Why FastApply Wins
Search & Replace requires a separate tool call for each chunk being edited. Multiple edits = multiple round trips.
FastApply handles all edits to a file in a single call. The model describes what it wants to change, and FastApply merges everything at once.
This means fewer tool calls, less coordination overhead, and faster end-to-end completion.
Warp-Grep Impact
Warp-Grep combines semantic search with ripgrep for better context retrieval.
How it works:
- Parse query into search plan
- Execute multi-step ripgrep commands
- Rerank results by semantic relevance
- Return top-k with surrounding context
This reduces irrelevant files loaded into context, helping agents find the right code faster.
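The pipeline above can be sketched in miniature. This is a pure-Python toy over an in-memory corpus; the real Warp-Grep shells out to ripgrep and uses a small planning model instead of the naive keyword extraction here:

```python
import re

def warp_grep(query, files, k=5):
    """Toy sketch of the Warp-Grep pipeline: plan -> grep -> rerank -> top-k."""
    # 1. Parse the query into search terms (naive stand-in for the planning model).
    terms = [t for t in re.findall(r"\w+", query.lower()) if len(t) > 3]
    # 2. "Grep": collect matching lines across the corpus (path -> file text).
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), 1):
            if any(t in line.lower() for t in terms):
                hits.append((path, lineno, line))
    # 3. Rerank: lines matching more distinct query terms score higher.
    hits.sort(key=lambda h: -sum(t in h[2].lower() for t in terms))
    # 4. Return the top-k results.
    return hits[:k]
```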
How FastApply Works Under the Hood
Unlike full file rewrites or brittle diffs, FastApply uses a specialized merge model.
Input: the original file plus an update snippet that uses `// ... existing code ...` markers.
Output: the full merged file, streamed at 10,500 tok/sec.
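A minimal illustration (the snippet contents are hypothetical; the marker string is the documented literal):

```javascript
// --- update snippet sent to FastApply ---
// ... existing code ...
function getUser(id) {
  return db.users.findById(id) ?? null; // changed: add null fallback
}
// ... existing code ...

// --- output: the full merged file, identical to the original
// --- except for the rewritten getUser function
```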
The model understands:
- `// ... existing code ...` markers mean "keep this section"
- Structural context (indentation, scope, imports)
- Variable renaming across the file
Why this approach works:
- Smaller model trained specifically on merging
- No reasoning overhead—just pattern matching + structural understanding
- Speculative decoding using the original file as a prior
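To make the marker semantics concrete, here is a deliberately naive rule-based merge in Python. The real FastApply replaces these heuristics with a trained model; the anchor-matching assumptions are noted in the docstring:

```python
def naive_apply(original: str, snippet: str,
                marker: str = "// ... existing code ...") -> str:
    """Toy marker-based merge. Assumes each edited block's first line appears
    verbatim in the original, and that a block extends down to the next
    top-level (unindented, non-blank) line."""
    orig = original.splitlines()
    out, cursor = [], 0
    for part in snippet.split(marker):
        lines = part.strip("\n").splitlines()
        if not any(l.strip() for l in lines):
            continue  # empty segment between adjacent markers
        anchor = lines[0].strip()
        # Locate the anchor line in the untouched remainder of the original.
        start = next(i for i in range(cursor, len(orig))
                     if orig[i].strip() == anchor)
        out += orig[cursor:start]   # keep code covered by the marker
        out += lines                # emit the rewritten block
        # Skip the replaced original block: anchor plus indented/blank lines.
        end = start + 1
        while end < len(orig) and (not orig[end].strip()
                                   or orig[end][0] in " \t"):
            end += 1
        cursor = end
    out += orig[cursor:]            # keep the trailing marker region
    return "\n".join(out)
```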
Warp-Grep: Semantic Code Search
Traditional grep: regex match, return all lines.
Warp-Grep: sub-agent workflow that combines grep with semantic understanding.
Example query: "Where do we handle rate limiting?"
Agent plan:
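An example plan the sub-agent might produce (illustrative only; real plans are model-generated):

```
1. rg -l -i "rate.?limit"            -> find candidate files
2. rg -n "429|Retry-After|throttle"  -> widen with related symbols
3. Rerank matches by semantic relevance to the query
4. Return top-k hits with surrounding context
```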
Warp-Grep pricing: $0.30 per 1M input tokens, $0.30 per 1M output tokens.
Configuration Options
Edit-Only Mode
Just want fast edits, no extra filesystem access?
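One way this could look (the `ALL_TOOLS` flag name is an assumption; consult the MCP docs for the real toggle):

```json
{
  "mcpServers": {
    "morph-mcp": {
      "command": "npx",
      "args": ["@morph-llm/morph-fast-apply"],
      "env": {
        "MORPH_API_KEY": "sk-your-key-here",
        "ALL_TOOLS": "false"
      }
    }
  }
}
```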
Workspace Mode (Default)
Automatically detects project root by looking for .git, package.json, etc.
No need to hardcode paths. Works across all your projects.
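Root detection can be sketched like this (the marker-file list is an assumption; the actual server may check different files):

```python
from pathlib import Path

def find_project_root(start: str) -> Path:
    """Walk up from `start` until a directory containing a marker file
    (.git, package.json, ...) is found; fall back to the start directory."""
    markers = {".git", "package.json", "pyproject.toml"}
    p = Path(start).resolve()
    for candidate in [p, *p.parents]:
        if any((candidate / m).exists() for m in markers):
            return candidate
    return p  # no marker found anywhere above
```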
Per-Project Config
Lock the MCP server to a specific directory:
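A sketch of a per-project entry (passing the workspace path as an argument is an assumption; see Morph's MCP docs for the supported option):

```json
{
  "mcpServers": {
    "morph-mcp": {
      "command": "npx",
      "args": ["@morph-llm/morph-fast-apply", "/absolute/path/to/project"],
      "env": { "MORPH_API_KEY": "sk-your-key-here" }
    }
  }
}
```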
Real-World Use Cases
1. Multiple Edits Per File
FastApply handles all edits to a file in one call. Search & replace needs a separate tool call for each chunk.
2. Multi-File Context
Warp-Grep fetches semantically relevant context in one shot instead of manual grep + copy/paste.
3. Agent Workflows
If you're building agents on Cursor: FastApply's higher accuracy means fewer retries, which speeds up end-to-end task completion.
4. Consistent Edits
Apply suggested changes without worrying about exact string matching. The model understands code structure.
Comparison to Cursor Native Apply
| Feature | Cursor Native (removed) | Morph MCP |
|---|---|---|
| Throughput | ~1000 tok/sec | 10,500 tok/sec |
| Max file size | ~400 lines | 1500+ lines |
| API access | No | Yes |
| Semantic search | No | Warp-Grep |
| Cost | Bundled | $0.80/1M in, $1.20/1M out |
Cursor's original apply model was Llama-3-70B. We use a smaller model trained specifically on code merging with speculative decoding.
Limitations
What FastApply doesn't do:
- Won't write your code from scratch (use Claude/GPT for that)
- No reasoning or planning (pure merge task)
- Requires `// ... existing code ...` markers in snippets
Warp-Grep limitations:
- Requires `ripgrep` installed locally (when using the local provider)
- Best on structured languages (Python, TS, Go)
- Sub-agent calls add latency (~300ms overhead)
Current max file size: 1,500 lines. A model with a 2,500-line context is in training now.
What You Get
Install Morph MCP:
- 10,500 tok/sec throughput
- 1-3 second apply latency
- ~35% faster end-to-end vs search & replace
- Warp-Grep semantic search
- Higher accuracy means fewer retries
One config file. Works across all Cursor projects.
Get API key • Full MCP docs • Warp-Grep docs
Pricing: $0.80/1M input, $1.20/1M output for FastApply. $0.30/1M tokens for Warp-Grep.
Free tier: 100k tokens/month to try it out.
Why This Matters
The best coding tools separate reasoning from execution.
Cursor's main models (Claude, GPT-4) should think about what to change.
FastApply should merge those changes. Fast.
Warp-Grep should find context. Semantically.
This separation of concerns is how modern coding agents achieve better accuracy and speed.
Now it's available to everyone using Cursor.

