Morph Models

Fast general models for agent loops

Run the primary agent loop on fast, OpenAI-compatible coding models served on Morph's custom kernels. One API for chat, code generation, and reasoning.

Frontier coding models, served on custom kernels

Output speed

Codegen-specific optimizations and custom GPU kernels. Up to 200 tok/s on Qwen 3.5 397B.

Qwen 3.5 397B: 200 tok/s
DSV4 Flash: 150 tok/s
Qwen 3.6 27B: 100 tok/s
MiniMax M2.7: 90 tok/s
DSV4 Pro: 150 tok/s (contact us)
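To put the output speeds in perspective, here is a rough sketch that converts a peak tok/s figure into a lower bound on generation time. The helper name `minGenerationSeconds` and the 3,000-token response size are illustrative assumptions, and the result ignores time to first token, network latency, and any gap between peak and sustained speed.

```typescript
// Peak output speeds in tokens per second, taken from the figures on this page.
const peakTokensPerSecond: Record<string, number> = {
  "Qwen 3.5 397B": 200,
  "DSV4 Flash": 150,
  "Qwen 3.6 27B": 100,
  "MiniMax M2.7": 90,
};

// Hypothetical helper: a lower bound on generation time, computed as
// output tokens divided by peak speed. Real latency will be higher.
function minGenerationSeconds(outputTokens: number, tokensPerSecond: number): number {
  if (tokensPerSecond <= 0) {
    throw new RangeError("tokensPerSecond must be positive");
  }
  return outputTokens / tokensPerSecond;
}

// Print the best-case time for an assumed 3,000-token response on each model.
for (const [model, speed] of Object.entries(peakTokensPerSecond)) {
  const seconds = minGenerationSeconds(3000, speed);
  console.log(`${model}: at least ${seconds.toFixed(1)} s for 3,000 tokens`);
}
```

In an agent loop that makes many calls per task, these per-call seconds add up, which is why the peak figure matters more there than in a single chat turn.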

One OpenAI-compatible API

Point your existing client at api.morphllm.com. Switch models by changing one string.

import OpenAI from "openai";

// Point the standard OpenAI client at Morph's endpoint.
const client = new OpenAI({
  baseURL: "https://api.morphllm.com/v1",
  apiKey: process.env.MORPH_API_KEY,
});

// Switch models by changing the `model` string.
const res = await client.chat.completions.create({
  model: "morph-qwen35-397b",
  messages: [{ role: "user", content: "Refactor this function..." }],
});

console.log(res.choices[0].message.content);
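For readers not using the OpenAI SDK, the sketch below shows roughly what an OpenAI-compatible request looks like when built by hand, and that switching models changes only one field. The `/chat/completions` path and `Bearer` header follow the OpenAI convention that the SDK itself uses. `buildChatRequest` and the placeholder key are hypothetical names introduced here, not part of Morph's documentation.

```typescript
const MORPH_BASE_URL = "https://api.morphllm.com/v1";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: builds the URL and fetch options for a chat completion.
// Assumes the endpoint follows the OpenAI chat completions convention.
function buildChatRequest(model: string, messages: ChatMessage[], apiKey: string) {
  return {
    url: `${MORPH_BASE_URL}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// Same conversation, two models: only the model string differs.
const demoMessages: ChatMessage[] = [
  { role: "user", content: "Refactor this function..." },
];
const largeModelRequest = buildChatRequest("morph-qwen35-397b", demoMessages, "YOUR_API_KEY");
const smallModelRequest = buildChatRequest("morph-qwen36-27b", demoMessages, "YOUR_API_KEY");

console.log(largeModelRequest.url);
console.log(smallModelRequest.init.body);
```

To send one of these, pass it to `fetch(request.url, request.init)` and parse the JSON response. Keeping request construction separate from the network call makes the model choice easy to test and to swap at runtime.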

The lineup

Open-weight frontier models with long context, served and billed per token. No per-seat fees.

// Available general models
morph-qwen35-397b      // 397B MoE, 262k context
morph-minimax27-230b   // 230B MoE, agentic workflows
morph-dsv4flash        // 393k context, fast
morph-qwen36-27b       // dense, low latency
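One way to use the lineup in code is a small typed registry that filters models by the context window a prompt needs. This is a sketch under assumptions: "262k" and "393k" are treated as 262,000 and 393,000 tokens, models with no listed context figure are left undefined rather than guessed, and `modelsThatFit` is a hypothetical helper.

```typescript
interface ModelInfo {
  id: string;
  note: string;
  // Context window in tokens; undefined where the lineup gives no figure.
  contextTokens?: number;
}

// Model IDs and notes copied from the lineup; context sizes are rounded
// readings of "262k" and "393k" (an assumption, not an exact limit).
const lineup: ModelInfo[] = [
  { id: "morph-qwen35-397b", note: "397B MoE", contextTokens: 262000 },
  { id: "morph-minimax27-230b", note: "230B MoE, agentic workflows" },
  { id: "morph-dsv4flash", note: "fast", contextTokens: 393000 },
  { id: "morph-qwen36-27b", note: "dense, low latency" },
];

// Hypothetical helper: IDs of models whose listed context window
// is at least as large as the prompt.
function modelsThatFit(promptTokens: number): string[] {
  return lineup
    .filter((m) => m.contextTokens !== undefined && m.contextTokens >= promptTokens)
    .map((m) => m.id);
}

console.log(modelsThatFit(300000));
console.log(modelsThatFit(100000));
```

Models without a listed context size are excluded from the result on purpose, so the helper never claims a fit it cannot verify.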

Built for production agent workloads

Monitor usage across every model
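Usage can also be tallied on the client side as a cross-check. The sketch below accumulates token counts per model from the `usage` object that OpenAI-style chat completion responses carry. It assumes Morph's responses include that object in the standard shape; `UsageTracker` is a hypothetical class and the sample numbers are made up for illustration.

```typescript
// Token counts as found in the `usage` field of an OpenAI-style
// chat completion response.
interface Usage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Hypothetical helper: keeps running token totals keyed by model ID.
class UsageTracker {
  private totals = new Map<string, Usage>();

  // Add one response's usage to the running total for its model.
  record(model: string, usage: Usage): void {
    const prev = this.totals.get(model) ?? {
      prompt_tokens: 0,
      completion_tokens: 0,
      total_tokens: 0,
    };
    this.totals.set(model, {
      prompt_tokens: prev.prompt_tokens + usage.prompt_tokens,
      completion_tokens: prev.completion_tokens + usage.completion_tokens,
      total_tokens: prev.total_tokens + usage.total_tokens,
    });
  }

  // Totals for one model, or undefined if nothing was recorded for it.
  get(model: string): Usage | undefined {
    return this.totals.get(model);
  }
}

// Illustrative numbers only; in real use, pass `res.usage` from each response.
const tracker = new UsageTracker();
tracker.record("morph-qwen35-397b", { prompt_tokens: 120, completion_tokens: 80, total_tokens: 200 });
tracker.record("morph-qwen35-397b", { prompt_tokens: 30, completion_tokens: 70, total_tokens: 100 });
tracker.record("morph-qwen36-27b", { prompt_tokens: 10, completion_tokens: 5, total_tokens: 15 });
console.log(tracker.get("morph-qwen35-397b"));
```

Because billing is per token, per-model totals like these map directly onto cost once multiplied by each model's rate.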


Run your first model call in minutes.

Get API Key

Free tier available. Pay only for what you use.