Current as of March 2026. MiMo V2 Flash from Xiaomi is notable mainly for its price: $0.09/M input and $0.29/M output with a 262K context window. That’s a large context window at a very low price. The tradeoffs are an opaque architecture and some reliability issues with complex tool schemas.
Specs
| Spec | Value |
| --- | --- |
| Provider | Xiaomi |
| Input cost | $0.09 / M tokens |
| Output cost | $0.29 / M tokens |
| Context window | 262K tokens |
| Max output | 16K tokens |
| Parameters | Not disclosed |
| Features | function_calling, reasoning |
What it’s good at
Price-to-Context Ratio
$0.09/M for 262K context is hard to beat. For tasks that are mostly about reading a lot of data cheaply, this model undercuts almost everything else.
Cost for High-Volume Work
At $0.29/M output, running thousands of agent cycles stays affordable. If your workload is repetitive and doesn’t require complex reasoning, the economics work.
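To make the economics concrete, here's a quick back-of-envelope cost calculator using the prices from the spec table. The run counts and token sizes are illustrative, not benchmarks:

```python
# Back-of-envelope cost estimate at MiMo V2 Flash pricing
# ($0.09/M input, $0.29/M output, per the spec table above).
INPUT_COST_PER_M = 0.09
OUTPUT_COST_PER_M = 0.29

def batch_cost(runs: int, input_tokens_per_run: int, output_tokens_per_run: int) -> float:
    """Total USD cost for `runs` agent cycles."""
    total_in = runs * input_tokens_per_run
    total_out = runs * output_tokens_per_run
    return (total_in / 1e6) * INPUT_COST_PER_M + (total_out / 1e6) * OUTPUT_COST_PER_M

# 10,000 cycles, each reading 8K tokens and writing 1K tokens:
print(f"${batch_cost(10_000, 8_000, 1_000):.2f}")  # → $10.10
```

Ten thousand nontrivial agent cycles for about ten dollars is the kind of math that makes high-volume work viable.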
Where it falls short
Opaque Architecture
Xiaomi publishes no architectural details. You can’t predict failure modes analytically — you have to find them empirically. Budget time for that.
Tool Call Fragility
Function calling works, but in testing it hallucinates arguments or drops parameters on schemas with more than five or six properties. Keep your tool schemas flat and small.
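As a sketch of what "simple" means in practice, here is a tool definition in the OpenAI-style function-calling format (which the `openai-completions` API type implies): one flat object, three properties, no nesting. The tool name and fields are hypothetical, for illustration only:

```python
# A deliberately small tool schema (OpenAI-style function-calling format).
# Flat structure, three properties, no nesting — the shape this model
# handles reliably, per the note above. The tool itself is hypothetical.
search_logs_tool = {
    "type": "function",
    "function": {
        "name": "search_logs",
        "description": "Search server logs for lines matching a pattern.",
        "parameters": {
            "type": "object",
            "properties": {
                "pattern": {"type": "string", "description": "Regex to match."},
                "max_results": {"type": "integer", "description": "Cap on hits."},
                "since": {"type": "string", "description": "ISO 8601 start time."},
            },
            "required": ["pattern"],
        },
    },
}
```

If you need more than a handful of parameters, consider splitting one complex tool into several simple ones rather than nesting objects.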
Best use cases with OpenClaw
- Long-form Document Analysis — 262K context at $0.09/M makes reading large technical manuals or codebases very cheap.
- High-Volume Log Processing — Scan thousands of lines of server logs to surface specific errors. The reasoning feature helps, and the cost stays low.
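For log processing at this scale, the main practical step is splitting input so each request fits the 262K-token window. A minimal sketch, assuming a rough 4-characters-per-token heuristic rather than an exact tokenizer:

```python
# Sketch: split a large log file into chunks that fit the 262K-token
# context window. The 4-chars-per-token ratio is a rough heuristic,
# not an exact tokenizer; adjust for your data.
CONTEXT_TOKENS = 262_144
RESERVED_TOKENS = 20_000   # headroom for the prompt plus the 16K output cap
CHARS_PER_TOKEN = 4        # heuristic

def chunk_log(text: str, max_tokens: int = CONTEXT_TOKENS - RESERVED_TOKENS) -> list[str]:
    """Split `text` on line boundaries into chunks under the token budget."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if size + len(line) > max_chars and current:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Each chunk then goes out as one cheap request, which is exactly the read-heavy, simple-output pattern this model is priced for.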
Not ideal for
- Complex Multi-step Logic — The reasoning is optimized for speed over depth. It struggles with deep chains of thought.
- Strict JSON Extraction — It occasionally adds conversational filler or misses closing braces even with explicit schema instructions.
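If you do point it at JSON extraction anyway, wrap the output in a defensive parser. This is a best-effort recovery sketch for the two failure modes noted above (conversational filler and missing closing braces), not a guarantee of correctness:

```python
import json
import re

def extract_json(raw: str):
    """Best-effort recovery of a JSON object from chatty model output.

    Strips conversational filler around the first {...} block and, as a
    last resort, appends missing closing braces. Returns None on failure.
    """
    match = re.search(r"\{.*", raw, re.DOTALL)
    if not match:
        return None
    candidate = match.group(0)
    # Trim trailing filler after the last closing brace, if any.
    end = candidate.rfind("}")
    if end != -1:
        candidate = candidate[: end + 1]
    for _ in range(3):  # retry with appended braces for truncated output
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            candidate += "}"
    return None
```

For anything where a silently wrong parse is costly, a model with stronger structured-output guarantees is the safer choice.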
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
OpenClaw setup
Configure your OpenClaw provider to use the Haimaker API at api.haimaker.ai/v1 and set the model ID to xiaomi/mimo-v2-flash.
```json
{
  "models": {
    "mode": "merge",
    "providers": {
      "xiaomi": {
        "baseUrl": "https://api.haimaker.ai/v1",
        "apiKey": "YOUR-HAIMAKER-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "mimo-v2-flash",
            "name": "MiMo V2 Flash",
            "cost": {
              "input": 0.09,
              "output": 0.29
            },
            "contextWindow": 262144,
            "maxTokens": 16384
          }
        ]
      }
    }
  }
}
```
How it compares
- vs GPT-4o-mini — MiMo is cheaper on input ($0.09 vs $0.15) and has double the context window. GPT-4o-mini is more reliable for structured extraction.
- vs Gemini 1.5 Flash — Gemini’s 1M context window is larger, but MiMo is cheaper per token if you’re staying within 262K.
Bottom line
The best price-per-context-token option currently available. Use it for read-heavy tasks where you need a large window and simple outputs, and be prepared for some tool call unreliability on complex schemas.
For setup instructions, see our API key guide. For all available models, see the complete models guide.