Most OpenClaw guides assume you pick one model and stick with it. That’s fine if you only run the agent for short sessions, but for an always-on setup it leaves money on the table — and capability on the table too.
A coding agent’s traffic is bimodal. Most requests are easy (read this file, what does this function do, generate a CRUD endpoint). A few are hard (debug this race condition, plan a migration, read across 12 files and propose a refactor). Paying frontier prices for the easy 80% and asking a $0.30/M model to do the hard 20% are both wrong.
The fix is multi-model: configure several providers in OpenClaw, set a cheap default, switch up when the task needs it.
What “multi-model” actually means in OpenClaw
OpenClaw’s config supports three things that work together:
- Multiple providers in models.providers — Anthropic, OpenAI, Google, xAI, Ollama, Haimaker, custom OpenAI-compatible endpoints. Add as many as you want.
- Per-model allowlisting in agents.defaults.models — defining a provider isn't enough; each model also has to be allowlisted under a fully-qualified name like haimaker/minimax/minimax-m2.5.
- Aliases, so you can /model sonnet instead of /model anthropic/claude-sonnet-4-6-20260514.
Once they’re all wired up, switching is one command.
The minimum useful setup: 2 models
Start here. Cheap default + frontier fallback.
```json5
{
  models: {
    mode: "merge",
    providers: {
      haimaker: {
        baseUrl: "https://api.haimaker.ai/v1",
        apiKey: "${HAIMAKER_API_KEY}",
        api: "openai-completions",
        models: [
          { id: "minimax/minimax-m2.5", name: "MiniMax M2.5", reasoning: false, contextWindow: 196608, maxTokens: 196608 }
        ]
      },
      anthropic: {
        apiKey: "${ANTHROPIC_API_KEY}",
        models: [
          { id: "claude-sonnet-4-6-20260514", name: "Claude Sonnet 4.6", reasoning: true, contextWindow: 1000000, maxTokens: 64000 }
        ]
      }
    }
  },
  agents: {
    defaults: {
      model: { primary: "haimaker/minimax/minimax-m2.5" },
      models: {
        "haimaker/minimax/minimax-m2.5": { alias: "minimax" },
        "anthropic/claude-sonnet-4-6-20260514": { alias: "sonnet" }
      }
    }
  }
}
```
Apply with openclaw gateway config.apply. Then in any session:
```
/model minimax   # the cheap default, active after apply
/model sonnet    # when the cheap model isn't getting it
/model           # show current model
/models          # list all available
```
That’s the whole multi-model loop. Everything below is variations on this theme.
The typical setup: four roles, one alias each
A real always-on agent usually wants four roles covered:
- Cheap default — high-volume daily work
- Long-context option — for when the agent has to read a lot
- Local/private option — for sensitive code or zero-cost bulk work
- Frontier fallback — for the hard 20%
```json5
{
  models: {
    mode: "merge",
    providers: {
      haimaker: {
        baseUrl: "https://api.haimaker.ai/v1",
        apiKey: "${HAIMAKER_API_KEY}",
        api: "openai-completions",
        models: [
          { id: "minimax/minimax-m2.5", name: "MiniMax M2.5", contextWindow: 196608 },
          { id: "openai/gpt-oss-120b", name: "GPT-OSS-120b", reasoning: true, contextWindow: 128000 }
        ]
      },
      xai: {
        baseUrl: "https://api.x.ai/v1",
        apiKey: "${XAI_API_KEY}",
        api: "openai-completions",
        models: [
          { id: "grok-4-1-fast", name: "Grok 4.1 Fast", contextWindow: 2000000 }
        ]
      },
      anthropic: {
        apiKey: "${ANTHROPIC_API_KEY}",
        models: [
          { id: "claude-sonnet-4-6-20260514", name: "Claude Sonnet 4.6", reasoning: true, contextWindow: 1000000 }
        ]
      },
      ollama: {
        baseUrl: "http://localhost:11434/v1",
        api: "openai-completions",
        models: [
          { id: "qwen3.6:27b", name: "Qwen3.6 27B Local", contextWindow: 131072 }
        ]
      }
    }
  },
  agents: {
    defaults: {
      model: { primary: "haimaker/minimax/minimax-m2.5" },
      models: {
        "haimaker/minimax/minimax-m2.5": { alias: "minimax" },
        "haimaker/openai/gpt-oss-120b": { alias: "oss" },
        "xai/grok-4-1-fast": { alias: "grok" },
        "anthropic/claude-sonnet-4-6-20260514": { alias: "sonnet" },
        "ollama/qwen3.6:27b": { alias: "local" }
      }
    }
  }
}
```
Now the agent has shortcuts:
```
/model minimax   # cheap default
/model grok      # when you need 2M context
/model local     # private/free
/model sonnet    # when the cheap one is wrong
```
You’ll find yourself reaching for sonnet less often than you’d guess. MiniMax M2.5 handles more than people credit it for. (See cheapest API for AI coding agents for the cost math.)
Configuring a thinking model alongside a primary
OpenClaw’s model block supports a thinking slot for agents that should plan with one model and execute with another:
```json5
{
  agents: {
    defaults: {
      model: {
        primary: "haimaker/minimax/minimax-m2.5",
        thinking: "anthropic/claude-sonnet-4-6-20260514"
      }
    }
  }
}
```
The cheap model does the file reads, tool calls, and output. The thinking model gets called for the planning step before complex tasks. You pay frontier prices only for the steps that need them.
This is the closest OpenClaw gets to built-in routing without an external router.
Switching providers without touching JSON
If you don’t want to manage a multi-provider config yourself, two options are easier:
1. Ask OpenClaw to do it. Paste a prompt into the OpenClaw chat:
```
Add Anthropic as a provider in my OpenClaw config with my API key {KEY}.
Add Claude Sonnet 4.6 with alias "sonnet". Apply when done.
```
OpenClaw edits the config file and runs gateway config.apply for you. (Full walkthrough: OpenClaw custom provider setup.)
2. Point everything at Haimaker. Use haimaker/auto as your primary and the auto-router decides per-request which underlying model to use. You only configure one provider; Haimaker handles the routing. Typical mix on a coding agent: 55% MiniMax, 25% GPT-OSS, 20% Claude. Blended cost stays well under $1/M tokens without you writing rules.
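In the schema used throughout this post, that collapses to a single provider entry. A sketch — the model id "auto" and the display name are assumptions about how Haimaker exposes the router, not confirmed values:

```json5
{
  models: {
    mode: "merge",
    providers: {
      haimaker: {
        baseUrl: "https://api.haimaker.ai/v1",
        apiKey: "${HAIMAKER_API_KEY}",
        api: "openai-completions",
        models: [
          // Assumed id: the auto-router exposed as a model named "auto"
          { id: "auto", name: "Haimaker Auto-Router" }
        ]
      }
    }
  },
  agents: {
    defaults: {
      model: { primary: "haimaker/auto" },
      models: {
        "haimaker/auto": { alias: "auto" }
      }
    }
  }
}
```

One provider, one alias, no per-task switching.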
Common multi-model gotchas
“Model not allowed” after adding a provider. You added the provider but forgot to allowlist the specific model in agents.defaults.models. The key has to be the fully-qualified name (haimaker/minimax/minimax-m2.5), not the bare model id. (Full troubleshooting.)
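The shape of the mistake, side by side — a sketch reusing the MiniMax entry from the configs above:

```json5
agents: {
  defaults: {
    models: {
      // Wrong: bare model id — OpenClaw won't match it against the provider
      // "minimax/minimax-m2.5": { alias: "minimax" },

      // Right: provider-qualified name
      "haimaker/minimax/minimax-m2.5": { alias: "minimax" }
    }
  }
}
```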
/model name-of-thing doesn’t switch. Either the alias isn’t defined, or you’re using the FQN instead of the alias. Run /models first to see exact strings OpenClaw recognizes.
Tool calls work on one model, fail on another. Some models handle OpenClaw’s tool schema better than others. Qwen3.6, Claude, GPT-5, and Grok 4.x all work reliably. Older Llama variants and some smaller open-source models don’t. If a model breaks tools, set "reasoning": false and try again — sometimes the schema mismatch is reasoning-mode specific.
Context window mismatch. Each provider enforces its own context limit. Set contextWindow accurately per model in the provider config. If you set it too high, you get truncation surprises in long sessions.
Use this
For a new always-on agent: configure three models — a cheap default (MiniMax M2.5 or DeepSeek), a long-context option (Grok 4.1 Fast), a frontier fallback (Claude Sonnet 4.6). Set the cheap one as primary. Switch with /model when the cheap one stalls.
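Under the schema from the examples above, that three-model recipe reduces to an allowlist like this (a sketch; ids mirror the earlier configs, and MiniMax is picked over DeepSeek):

```json5
agents: {
  defaults: {
    model: { primary: "haimaker/minimax/minimax-m2.5" },
    models: {
      "haimaker/minimax/minimax-m2.5": { alias: "minimax" },      // cheap default
      "xai/grok-4-1-fast": { alias: "grok" },                      // long-context option
      "anthropic/claude-sonnet-4-6-20260514": { alias: "sonnet" }  // frontier fallback
    }
  }
}
```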
For a privacy-sensitive agent: add Ollama with Qwen3.6 27B as a fourth, switch to it for anything you don’t want sent to a third party.
For a hands-off setup: use haimaker/auto and skip the per-task switching entirely.
Related: OpenClaw custom provider setup, Haimaker auto-router for OpenClaw, Cheapest API for AI coding agents, Best models for OpenClaw.