Current as of March 2026. MiniMax M2.1 is cheaper than M2 — $0.27 input, $0.95 output — and the standout number here is the output limit: 197K tokens. That’s the same as the context window, which is unusual. You can throw a lot of data in and get a lot of data back for under a dollar per million tokens.
Specs
| Spec | Value |
| --- | --- |
| Provider | MiniMax |
| Input cost | $0.27 / M tokens |
| Output cost | $0.95 / M tokens |
| Context window | 197K tokens |
| Max output | 197K tokens |
| Parameters | N/A |
| Features | function_calling |
What it’s good at
Output Capacity
197K tokens out in a single response. For the price, that’s hard to match. You could generate an entire large module or dump a deeply transformed document without hitting a wall.
Pricing
$0.27 input, $0.95 output. Claude 3.5 Sonnet runs $3/$15, roughly 11x the input price and 16x the output price. The math favors M2.1 heavily for high-volume text work where you don't need frontier reasoning.
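As a quick sanity check on that math, here is a small sketch. The per-million-token rates come from this page; the batch size is a made-up example.

```python
# Cost comparison sketch. Rates ($/M tokens) are from this page;
# the job size below is a hypothetical example.
RATES = {
    "minimax-m2.1": {"input": 0.27, "output": 0.95},
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
}

def job_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one job at the listed per-million-token rates."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# Hypothetical batch: 10M tokens in, 5M tokens out.
print(round(job_cost("minimax-m2.1", 10_000_000, 5_000_000), 2))       # 7.45
print(round(job_cost("claude-3.5-sonnet", 10_000_000, 5_000_000), 2))  # 105.0
```

Same workload, two orders of magnitude apart in spend.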
Where it falls short
Geographic Latency
If your servers aren’t close to MiniMax’s routing nodes, TTFT gets erratic. I’ve seen swings of several seconds on the same prompt depending on time of day.
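If you want to quantify that jitter before committing, a generic TTFT timer helps. A minimal sketch: `stream` stands in for any streaming response iterator (e.g. an OpenAI-compatible streaming call); the dummy generator below only simulates the delay.

```python
import time
from typing import Iterable, Tuple

def time_to_first_token(stream: Iterable) -> Tuple[float, list]:
    """Return (seconds until the first chunk arrives, all chunks)."""
    start = time.monotonic()
    it = iter(stream)
    first = next(it)                 # blocks until the first chunk
    ttft = time.monotonic() - start
    return ttft, [first, *it]

# Stand-in for a real streaming API response.
def dummy_stream():
    time.sleep(0.05)                 # simulated network/queue delay
    yield "first"
    yield "rest"

ttft, chunks = time_to_first_token(dummy_stream())
print(f"TTFT: {ttft:.3f}s, chunks: {chunks}")
```

Run it against the real endpoint at different times of day and log the spread.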
Instruction Drift at Long Context
Past about 120K tokens of context, the model starts losing track of system-prompt constraints. You'll need to repeat key instructions or restructure your prompts if you're filling the window.
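One workable mitigation is to restate the critical constraints near the end of the prompt once the context gets large. A rough sketch, assuming OpenAI-style chat messages; the 120K threshold mirrors the observation above.

```python
def build_messages(system_prompt: str, context: str, task: str,
                   context_tokens: int, drift_threshold: int = 120_000) -> list:
    """Assemble chat messages, repeating the system constraints
    after the long context when it exceeds the drift threshold."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": context},
    ]
    if context_tokens > drift_threshold:
        # Restate the key constraints right before the task so they
        # sit in the most recent part of the window.
        messages.append({"role": "user",
                         "content": f"Reminder of constraints:\n{system_prompt}"})
    messages.append({"role": "user", "content": task})
    return messages
```

Cheap insurance: the reminder costs a few hundred input tokens at $0.27/M.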
Best use cases with OpenClaw
- Large Document Synthesis — Feed multiple PDFs or long transcripts into a single prompt and get a comprehensive output without chunking.
- High-Volume Data Extraction — Function calling works reliably here, and the pricing makes it viable for processing millions of rows.
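For the extraction case, the flow is a standard OpenAI-compatible tool call. A sketch of the request payload only, no network call; the tool name and fields here are hypothetical.

```python
import json

# Hypothetical extraction tool in OpenAI-compatible "tools" format.
extract_tool = {
    "type": "function",
    "function": {
        "name": "record_row",
        "description": "Record one extracted row of structured data.",
        "parameters": {
            "type": "object",
            "properties": {
                "name":  {"type": "string"},
                "value": {"type": "number"},
            },
            "required": ["name", "value"],
        },
    },
}

payload = {
    "model": "minimax/minimax-m2.1",
    "messages": [{"role": "user", "content": "Extract name/value pairs from the rows below."}],
    "tools": [extract_tool],
    "tool_choice": "auto",
}
print(json.dumps(payload)[:60])
```

POST this to the chat completions endpoint and parse the `tool_calls` in the response.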
Not ideal for
- Complex Logical Reasoning — It makes small but real errors in multi-step logic. Don’t use it for math-heavy pipelines.
- Real-time Chat — Network overhead is too inconsistent for anything user-facing that needs to feel responsive.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
OpenClaw setup
Configure your OpenClaw provider to use the OpenAI-compatible endpoint at api.haimaker.ai/v1. Set the model ID to minimax/minimax-m2.1, and set a generous request timeout: a full 197K-token output can stream for several minutes.
{
  "models": {
    "mode": "merge",
    "providers": {
      "minimax": {
        "baseUrl": "https://api.haimaker.ai/v1",
        "apiKey": "YOUR-HAIMAKER-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "minimax-m2.1",
            "name": "Minimax M2.1",
            "cost": {
              "input": 0.27,
              "output": 0.95
            },
            "contextWindow": 196608,
            "maxTokens": 196608
          }
        ]
      }
    }
  }
}
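A quick way to catch typos before pointing OpenClaw at this config is to load it and check the fields you care about. A minimal sketch; the structure mirrors the block above, with the API key placeholder elided.

```python
import json

config = json.loads("""
{ "models": { "mode": "merge", "providers": { "minimax": {
    "baseUrl": "https://api.haimaker.ai/v1",
    "apiKey": "...",
    "api": "openai-completions",
    "models": [{ "id": "minimax-m2.1", "name": "Minimax M2.1",
                 "cost": {"input": 0.27, "output": 0.95},
                 "contextWindow": 196608, "maxTokens": 196608 }] } } } }
""")

m = config["models"]["providers"]["minimax"]["models"][0]
assert m["maxTokens"] == m["contextWindow"]   # the unusual 197K-out spec
assert config["models"]["providers"]["minimax"]["baseUrl"].startswith("https://")
print("config looks sane:", m["id"])
```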
How it compares
- vs GPT-4o-mini — 4o-mini is cheaper on input ($0.15/M) but the output cap (16K) is nowhere near M2.1’s 197K. If you need long outputs, the comparison isn’t even close.
- vs DeepSeek-V3 — DeepSeek is better at reasoning. M2.1 wins if you’re optimizing for raw throughput at low cost.
Bottom line
Use M2.1 for high-volume text processing where the output is long and the logic is straightforward. Don’t expect it to match frontier reasoning models on hard problems.
For setup instructions, see our API key guide. For all available models, see the complete models guide.