Everyone asks which model to use. The honest answer: it depends on what you’re doing and how much you want to spend.
Note: Clawdbot has been rebranded to OpenClaw — same powerful AI agent platform, new name. Learn more at openclaw.ai.
OpenClaw supports a dozen providers: Anthropic, OpenAI, Google, and open-source models through haimaker.ai. Each has tradeoffs around cost, capability, and where your data ends up.
Here’s how I think about picking one.
Price, capability, privacy
These three things compete with each other. You can optimize for two, maybe, but rarely all three.
Price
Token pricing varies wildly. Claude Opus 4.5 costs $15/$75 per million tokens (input/output). Grok 4.1 mini charges $0.20/$0.50. That’s a 75x gap on input tokens — and 150x on output — for what are, in many cases, similar results.
For most assistant tasks, a mid-tier model makes sense. Claude Sonnet 4 at $3/$15 gives you most of Opus’s capability at a fraction of the cost.
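To make the price gap concrete, here’s a back-of-the-envelope calculation using the per-million-token figures above. The monthly volumes (20M input, 5M output) are assumptions for illustration, not usage data:

```python
# (input, output) USD per million tokens, from the figures above
PRICES = {
    "claude-opus-4.5": (15.00, 75.00),
    "claude-sonnet-4": (3.00, 15.00),
    "grok-4.1-mini":   (0.20, 0.50),
}

def monthly_cost(model: str, input_m: float, output_m: float) -> float:
    """Cost in USD for input_m / output_m million tokens."""
    inp, out = PRICES[model]
    return input_m * inp + output_m * out

# Hypothetical month: 20M input tokens, 5M output tokens
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 20, 5):,.2f}")
```

Under those assumed volumes, Opus runs about $675/month, Sonnet $135, and Grok 4.1 mini under $7 — which is why defaulting to a mid-tier model adds up.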
Capability
Benchmarks lie. For OpenClaw, what actually matters:
- Tool calling – Can it invoke shell commands and APIs without fumbling the syntax?
- Context tracking – Does it remember what you said 50 messages ago?
- Code quality – When it writes code, does it run?
- Speed – How long before it starts responding?
Privacy
Cloud APIs mean your prompts hit external servers. For personal finance, health data, or proprietary code, that’s a problem. You can self-host open-source models, but that requires hardware and tolerance for latency.
Recommendations by use case
Daily assistant work
Claude Sonnet 4 ($3/$15 per million tokens)
Calendar, email, research, general queries. Sonnet handles all of it without breaking the bank. Fast enough for real-time chat, smart enough for multi-step tasks.
Cheaper option: GPT-4o-mini (~$0.15/$0.60)
Fine for simple stuff. Quality drops on anything complex, but at 20x cheaper, sometimes that’s the right call.
Coding and automation
Claude Opus 4.5 ($15/$75 per million tokens)
When the code needs to actually work, Opus is worth the premium. It handles multi-file edits and complex debugging better than anything else I’ve used.
Alternative: Sonnet 4 with extended thinking enabled. Pay more per reasoning token only when you need the horsepower.
Research and document analysis
Gemini 3 Pro (~$1.25/$10 per million tokens)
The 1M+ token context window lets you throw entire codebases at it. Good at synthesizing information across long documents.
Gemini models in OpenClaw
Google’s Gemini family is worth a closer look if you’re doing document-heavy work or want a solid mid-range option.
Gemini 3 Pro is the workhorse. That million-token context window means you can feed it an entire repo and ask it to find the bug. For long-document analysis, contract review, or codebase Q&A, nothing else comes close on context length.
Gemini 3 Flash (~$0.075/$0.30 per million tokens) is the speed option. It’s cheap, fast, and surprisingly capable for simpler tasks. If you’re routing high-volume queries and don’t need deep reasoning, Flash handles it well.
To use Gemini models through haimaker.ai, add them to your provider config:
{
  models: {
    providers: {
      haimaker: {
        // ... existing config
        models: [
          // ... existing models
          { id: "google/gemini-3-pro", name: "Gemini 3 Pro" },
          { id: "google/gemini-3-flash", name: "Gemini 3 Flash" }
        ]
      }
    }
  }
}
Or use them directly through Google’s API if you prefer. OpenClaw supports Google as a first-party provider.
Privacy-sensitive work
Qwen3 Coder or Llama 3.3 70B through haimaker.ai
Open-source models, routed through compliant infrastructure. Your prompts stay off the big providers’ training pipelines.
For maximum paranoia, self-host with Ollama or vLLM. You’ll need serious hardware (2x A100 or equivalent) and patience for higher latency.
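If you go the Ollama route, you can point OpenClaw at it the same way as any other OpenAI-compatible provider — Ollama serves one at http://localhost:11434/v1 by default. A sketch, using the same provider-config shape shown for haimaker.ai below; the provider key, model ID, and the apiKey placeholder are assumptions (Ollama itself ignores the key):

```json5
{
  models: {
    mode: "merge",
    providers: {
      // Hypothetical entry -- provider name and model ID are illustrative
      ollama: {
        baseUrl: "http://localhost:11434/v1",
        apiKey: "ollama", // placeholder; Ollama doesn't check it
        api: "openai-completions",
        models: [
          { id: "llama3.3:70b", name: "Llama 3.3 70B (local)" }
        ]
      }
    }
  }
}
```

Pull the model first with ollama pull, and remember that a 70B model wants the hardware mentioned above.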
Hybrid approach: Use cloud APIs for general work, switch to open-source for sensitive tasks. OpenClaw makes this easy with model overrides.
Provider comparison
Anthropic (Claude)
Premium pricing ($3–$75 per million tokens across the lineup). Best tool calling and instruction following. No training on API data by default.
Claude has become the default for coding agents. The tool use is just more reliable than the alternatives.
OpenAI (GPT)
Mid-tier pricing ($0.60-$15). Solid general performance, fast responses. GPT-4o is a good all-rounder. The mini variant works well for high-volume, simple tasks.
Google (Gemini)
Competitive pricing ($1.25-$10). That massive context window is the selling point. Great for document-heavy workflows.
Open source through haimaker.ai
5% below market rate ($0.10-$5 per million tokens). Routes requests across GPU providers for cost and latency optimization. Avoids the compliance headaches of sending data to US hyperscalers.
The API is OpenAI-compatible:
curl https://api.haimaker.ai/v1/chat/completions \
  -H "Authorization: Bearer $HAIMAKER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3.3-70b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
Swap your base URL to https://api.haimaker.ai/v1 and you’re done.
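The curl call above translates directly to any OpenAI-compatible client. Here’s a minimal stdlib sketch that builds the same request — the helper function is mine, not part of any SDK, and the urlopen call is left out so the snippet stays offline:

```python
import json
import urllib.request

def build_chat_request(base_url, api_key, model, messages):
    """Build the same POST the curl example sends. Moving providers
    is just a matter of changing base_url."""
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps({"model": model, "messages": messages}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request(
    "https://api.haimaker.ai/v1", "sk-...",
    "llama-3.3-70b", [{"role": "user", "content": "Hello!"}],
)
# urllib.request.urlopen(req) would actually send it
```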
OpenClaw configuration
Setting your default model in ~/.openclaw/openclaw.json:
{
  agents: { defaults: { model: { primary: "anthropic/claude-sonnet-4-20250514" } } }
}
Switch models mid-session with /model opus or /model haimaker/llama-3.3-70b.
Migration note: If you’re coming from Clawdbot, your config files are automatically migrated. The CLI now uses openclaw instead of clawdbot commands.
Adding haimaker.ai as a provider in OpenClaw
{
  env: { HAIMAKER_API_KEY: "sk-..." },
  agents: {
    defaults: { model: { primary: "haimaker/llama-3.3-70b" } }
  },
  models: {
    mode: "merge",
    providers: {
      haimaker: {
        baseUrl: "https://api.haimaker.ai/v1",
        apiKey: "${HAIMAKER_API_KEY}",
        api: "openai-completions",
        models: [
          { id: "llama-3.3-70b", name: "Llama 3.3 70B" },
          { id: "qwen3-coder", name: "Qwen3 Coder" },
          { id: "mistral-large", name: "Mistral Large" }
        ]
      }
    }
  }
}
Which model should I choose?
If you’re still unsure, here’s the quick decision tree:
“I just want something that works” — Claude Sonnet 4. It handles 80% of tasks well and the pricing is reasonable. Start here.
“I’m writing production code” — Claude Opus 4.5. The extra cost pays for itself when you’re debugging a gnarly async issue at 2am and the model actually gets it right the first time.
“I need to process long documents” — Gemini 3 Pro. Nothing else gives you a million tokens of context. Feed it the whole repo, the whole contract, the whole thread.
“I need it free” — Local models through Ollama cost nothing. Gemini Flash has a free tier. See our free models guide.
“I need it cheap” — MiniMax M2.5 through haimaker.ai for simple tasks, GPT-OSS-120b when you need more reasoning. See our cost comparison guide.
“I want my data to stay private” — Qwen3 Coder or Llama 3.3 70B through haimaker.ai. Or self-host with Ollama if you have the hardware.
“I want to use Gemini” — Gemini 3 Pro for quality, Gemini 3 Flash for speed and cost. Both available through haimaker.ai or Google’s API directly.
The honest answer is that most people should run two or three models and route between them. Use a cheap model for simple tasks, a mid-tier model for daily work, and a premium model for the hard stuff. Haimaker’s routing engine makes this easy to set up.
Model guides
Full pricing, setup, and use-case breakdown for every model on Haimaker:
Anthropic: Claude Opus 4.6 · Opus 4.5 · Opus 4.1 · Sonnet 4.6 · Sonnet 4.5 · Haiku 4.5
OpenAI: GPT-5.3 · GPT-5 Codex · GPT-5 Pro · GPT-5 · GPT-5 Mini · GPT-5 Nano · GPT-5.4 Pro · GPT-5.4 · GPT-5.2 · GPT-5.1 · GPT-4.1 · GPT-4.1 Mini · GPT-4.1 Nano · GPT-4o · GPT-4o Mini · GPT-4 Turbo · GPT-4 · o3 · o3 Mini · o4 Mini · o1 · o1 Mini
Google: Gemini 3.1 Pro · Gemini 3 Flash · Gemini 2.5 Pro · Gemini 2.5 Flash · Gemini 2.0 Flash
xAI: Grok 4.20 · Grok 4 · Grok 4 Fast · Grok 4.1 Fast · Grok Code Fast · Grok 3 · Grok 3 Mini · Grok 3 Mini Fast · Grok 2 · Grok 2 Vision
DeepSeek: DeepSeek R1 · DeepSeek V3.2 · DeepSeek V3.1 · DeepSeek V3
MiniMax: M2.5 · M2.5 Lightning · M2.1 · M2.1 Lightning · M2
Zhipu: GLM-5 · GLM-4.7 · GLM-4.7 Flash · GLM-4.6 · GLM-4.6 Exacto
Meta: Llama 4 Maverick · Llama 4 Scout
More: Qwen3.5 397B · Qwen3 Max · Qwen3 Coder · Qwen3 Coder Plus · Qwen 2.5 Coder 32B · Kimi K2 · Kimi K2.5 · MiMo V2 Flash · UI-TARS 1.5
Bottom line
There’s no best model. There’s the right model for what you’re doing.
- Cheap: MiniMax M2.5, GPT-4o-mini, or open-source through haimaker.ai
- Capable: Opus 4.5 or Gemini 3 Pro
- Private: Open-source through haimaker.ai or self-hosted
Most people should start with Claude Sonnet 4. It handles most tasks well and won’t run up a scary bill. Adjust from there based on what you actually need.
Ready to set up your own OpenClaw agent? Visit openclaw.ai to get started or check out the OpenClaw documentation for detailed configuration options.