Bringing FastApply Back to Cursor with MCP

How Morph MCP Server restores 10,500 tok/sec edits to Cursor and adds Warp-Grep for better context retrieval

Tejas Bhakta
November 24, 2025 · 6 min read

TL;DR: Cursor removed its native apply model. We built an MCP server that brings FastApply (10,500 tok/sec) and Warp-Grep back to Cursor. Result: ~35% faster end-to-end edits compared to search & replace, higher accuracy, lower token costs.

One config file. No workflow changes.


The Problem

Cursor shipped with a fast apply model. Then they removed it.

Now every code edit goes through the main model—slow, expensive, prone to hallucination on large files. Without a specialized apply model, edits take longer and token costs spike when the model rewrites entire files for single-line changes.

The core issue: without a specialized apply model, you're using a reasoning model for a merging task. It's like using GPT-4 to run git merge.


The Decision: MCP + FastApply

Model Context Protocol lets you plug external tools into Cursor. We built an MCP server that:

  1. Intercepts file edits before they hit Cursor's default flow
  2. Routes to FastApply (our 10,500 tok/sec merge model)
  3. Returns merged code in 1-3 seconds for most files

Plus: Warp-Grep sub-agent for semantic code search. Think Cognition's SWE-grep, but available to everyone.

Why Warp-Grep matters:

  • Cursor's default grep misses context across file boundaries
  • Warp-Grep uses a small planning model to execute multi-step searches
  • Returns semantically ranked results, not just text matches

Installation (60 Seconds)

Add to ~/.cursor/mcp.json:

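A minimal config in the standard MCP shape (the package name `@morphllm/morph-mcp` is illustrative; use the exact command and key name from Morph's docs):

```json
{
  "mcpServers": {
    "morph": {
      "command": "npx",
      "args": ["-y", "@morphllm/morph-mcp"],
      "env": {
        "MORPH_API_KEY": "your-api-key"
      }
    }
  }
}
```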

Restart Cursor. Done.

Get your API key: morphllm.com/dashboard/api-keys


Results: Measured Performance

The numbers below come from our benchmarks comparing Morph against search & replace across real codebases.

End-to-End Performance

Metric | Result
Apply latency | 1-3 seconds
End-to-end speed improvement | ~35% faster
Average speed improvement | ~46% vs search & replace

Why FastApply Wins

Search & Replace requires a separate tool call for each chunk being edited. Multiple edits = multiple round trips.

FastApply handles all edits to a file in a single call. The model describes what it wants to change, and FastApply merges everything at once.

This means fewer tool calls, less coordination overhead, and faster end-to-end completion.

Warp-Grep Impact

Warp-Grep combines semantic search with ripgrep for better context retrieval.

How it works:

  1. Parse query into search plan
  2. Execute multi-step ripgrep commands
  3. Rerank results by semantic relevance
  4. Return top-k with surrounding context

This reduces irrelevant files loaded into context, helping agents find the right code faster.
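The four steps above can be sketched as a minimal pipeline. This is an illustrative reimplementation, not Morph's code: the planner and reranker here are trivial keyword stand-ins for the small planning model, and `execute` stands in for ripgrep over an in-memory corpus.

```python
import re

def plan(query: str) -> list[re.Pattern]:
    """Step 1: turn a natural-language query into search patterns.
    A real planner is a small LLM; here we just extract keywords."""
    keywords = [w for w in re.findall(r"\w+", query.lower()) if len(w) > 3]
    return [re.compile(w) for w in keywords]

def execute(patterns: list[re.Pattern], files: dict[str, str]) -> list[tuple]:
    """Step 2: run every pattern over the corpus, scoring each line
    (stand-in for multi-step ripgrep commands)."""
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            score = sum(bool(p.search(line.lower())) for p in patterns)
            if score:
                hits.append((score, path, lineno, line))
    return hits

def rerank(hits: list[tuple], k: int = 3) -> list[tuple]:
    """Steps 3-4: rank hits by relevance score, return the top-k."""
    return sorted(hits, reverse=True)[:k]

files = {
    "middleware/rate_limit.py": "def check_rate_limit(req):\n    return bucket.allow(req)",
    "app.py": "app.run()",
}
top = rerank(execute(plan("Where do we handle rate limiting?"), files))
```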


How FastApply Works Under the Hood

Unlike full file rewrites or brittle diffs, FastApply uses a specialized merge model.

Input format:

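An illustrative update snippet (identifiers like `cache` and `db.fetchUser` are invented; see Morph's API docs for the exact request shape). The agent sends only the changed lines, with markers standing in for everything else:

```typescript
// ... existing code ...
function getUser(id: string) {
  // ... existing code ...
  const user = cache.get(id) ?? db.fetchUser(id); // edited line
  return user;
}
// ... existing code ...
```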

Output (streamed at 10,500 tok/sec):

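The response is the complete merged file with every marker resolved against the original. For instance, if the edit added a cache lookup to `getUser`, the stream returns the whole file (identifiers invented for illustration):

```typescript
import { cache } from "./cache";
import { db } from "./db";

function getUser(id: string) {
  const user = cache.get(id) ?? db.fetchUser(id); // edited line
  return user;
}
```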

The model understands:

  • // ... existing code ... markers mean "keep this section"
  • Structural context (indentation, scope, imports)
  • Variable renaming across the file

Why this approach works:

  1. Smaller model trained specifically on merging
  2. No reasoning overhead—just pattern matching + structural understanding
  3. Speculative decoding using the original file as a prior
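Point 3 is the key speed trick: most of a merged file is token-identical to the original, so original-file tokens can serve as draft tokens the model merely verifies. A toy sketch of the idea (substitutions only; not Morph's actual decoder):

```python
def speculative_merge(original, target):
    """Toy model of speculative decoding with the original file as the
    draft. Tokens identical to the original are accepted nearly for
    free; only edited spans cost a full model step. Handles simple
    substitutions only -- a deliberate simplification."""
    out, i, slow_steps = [], 0, 0
    for tok in target:
        if i < len(original) and original[i] == tok:
            out.append(tok)      # draft token accepted: near-free
        else:
            out.append(tok)      # divergence: one "slow" decode step
            slow_steps += 1
        i += 1
    return out, slow_steps

original = "def f ( x ) : return x + 1".split()
edited   = "def f ( x ) : return x + 2".split()
merged, steps = speculative_merge(original, edited)
```

Because `steps` counts only the edited span, throughput scales with the size of the edit, not the size of the file.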

Warp-Grep: Semantic Code Search

Traditional grep: regex match, return all lines.

Warp-Grep: sub-agent workflow that combines grep with semantic understanding.

Example query: "Where do we handle rate limiting?"

Agent plan:

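An illustrative plan (patterns and paths are invented for this example):

```bash
# Steps 1-2: the planner expands the query into targeted ripgrep passes
rg -n -i "rate.?limit" src/                      # direct mentions
rg -n -i "429|too many requests" src/            # HTTP responses that imply limiting
rg -n "tokenBucket|slidingWindow" src/middleware/  # common implementations
# Steps 3-4: hits are reranked semantically and returned with context
```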

Pricing: $0.30 per 1M input tokens, $0.30 per 1M output tokens.


Configuration Options

Edit-Only Mode

Just want fast edits, no extra filesystem access?

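Assuming the server exposes a toggle for this (the `ALL_TOOLS` variable here is an assumption; check the MCP docs for the real flag name), the config might look like:

```json
{
  "mcpServers": {
    "morph": {
      "command": "npx",
      "args": ["-y", "@morphllm/morph-mcp"],
      "env": {
        "MORPH_API_KEY": "your-api-key",
        "ALL_TOOLS": "false"
      }
    }
  }
}
```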

Workspace Mode (Default)

Automatically detects project root by looking for .git, package.json, etc.

No need to hardcode paths. Works across all your projects.

Per-Project Config

Lock MCP to specific directory:

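One common pattern is passing the project directory as a server argument (whether Morph takes it as an argument or an env var is an assumption here; the path is a placeholder):

```json
{
  "mcpServers": {
    "morph": {
      "command": "npx",
      "args": ["-y", "@morphllm/morph-mcp", "/path/to/your/project"],
      "env": {
        "MORPH_API_KEY": "your-api-key"
      }
    }
  }
}
```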

Real-World Use Cases

1. Multiple Edits Per File

FastApply handles all edits to a file in one call. Search & replace needs a separate tool call for each chunk.

2. Multi-File Context

Warp-Grep fetches semantically relevant context in one shot instead of manual grep + copy/paste.

3. Agent Workflows

If you're building agents on Cursor: FastApply's higher accuracy means fewer retries, which speeds up end-to-end task completion.

4. Consistent Edits

Apply suggested changes without worrying about exact string matching. The model understands code structure.


Comparison to Cursor Native Apply

Feature | Cursor Native (removed) | Morph MCP
Throughput | ~1,000 tok/sec | 10,500 tok/sec
Max file size | ~400 lines | 1,500+ lines
API access | No | Yes
Semantic search | No | Warp-Grep
Cost | Bundled | $0.80/1M in, $1.20/1M out

Cursor's original apply model was Llama-3-70B. We use a smaller model trained specifically on code merging with speculative decoding.


Limitations

What FastApply doesn't do:

  • Won't write your code from scratch (use Claude/GPT for that)
  • No reasoning or planning (pure merge task)
  • Requires // ... existing code ... markers in snippets

Warp-Grep limitations:

  • Requires ripgrep installed locally (when using local provider)
  • Best on structured languages (Python, TS, Go)
  • Sub-agent calls add latency (~300ms overhead)

Current max file size: 1,500 lines. We're training on a 2,500-line context now.


What You Get

Install Morph MCP:

  • 10,500 tok/sec throughput
  • 1-3 second apply latency
  • ~35% faster end-to-end vs search & replace
  • Warp-Grep semantic search
  • Higher accuracy means fewer retries

One config file. Works across all Cursor projects.

Get API key · Full MCP docs · Warp-Grep docs

Pricing: $0.80/1M input, $1.20/1M output for FastApply. $0.30/1M tokens for Warp-Grep.

Free tier: 100k tokens/month to try it out.


Why This Matters

The best coding tools separate reasoning from execution.

Cursor's main models (Claude, GPT-4) should think about what to change.

FastApply should merge those changes. Fast.

Warp-Grep should find context. Semantically.

This separation of concerns is how modern coding agents achieve better accuracy and speed.

Now it's available to everyone using Cursor.