Current as of March 2026. MiMo V2 Flash from Xiaomi is notable mainly for its price: $0.09/M input and $0.29/M output with a 262K context window. That’s a large context window at a very low price. The tradeoffs are an opaque architecture and some reliability issues with complex tool schemas.

Specs

Provider: Xiaomi
Input cost: $0.09 / M tokens
Output cost: $0.29 / M tokens
Context window: 262K tokens
Max output: 16K tokens
Parameters: N/A
Features: function_calling, reasoning

What it’s good at

Price-to-Context Ratio

$0.09/M for 262K context is hard to beat. For tasks that are mostly about reading a lot of data cheaply, this model undercuts almost everything else.
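To make that concrete, here is a back-of-envelope calculation at the listed rates. The token counts are the model's published maximums, not a measured workload:

```python
# Cost of a single maxed-out call at MiMo V2 Flash's listed rates.
INPUT_RATE = 0.09 / 1_000_000   # $ per input token
OUTPUT_RATE = 0.29 / 1_000_000  # $ per output token

def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# Filling the full 262,144-token window and generating the 16,384-token max:
print(f"${call_cost(262_144, 16_384):.4f}")  # → $0.0283
```

Under three cents to read a quarter-million tokens is the whole pitch.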

Cost for High-Volume Work

At $0.29/M output, running thousands of agent cycles stays affordable. If your workload is repetitive and doesn’t require complex reasoning, the economics work.
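A quick sanity check on the volume claim. The per-cycle token counts below are illustrative assumptions, not benchmarks:

```python
# Estimated spend for a batch of agent cycles at the listed rates.
INPUT_RATE = 0.09 / 1_000_000   # $ per input token
OUTPUT_RATE = 0.29 / 1_000_000  # $ per output token

def batch_cost(cycles: int, in_tok: int, out_tok: int) -> float:
    """Total dollar cost for `cycles` calls of the given size."""
    return cycles * (in_tok * INPUT_RATE + out_tok * OUTPUT_RATE)

# 10,000 cycles at an assumed 8K input / 1K output tokens each:
print(f"${batch_cost(10_000, 8_000, 1_000):.2f}")  # → $10.10
```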

Where it falls short

Opaque Architecture

Xiaomi publishes no architectural details. You can’t predict failure modes analytically — you have to find them empirically. Budget time for that.

Tool Call Fragility

The function calling support exists, but I’ve seen it hallucinate arguments or drop parameters on schemas with more than five or six properties. Keep your tool schemas simple.
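"Simple" in practice means few top-level properties and no deep nesting. A schema in this shape (OpenAI function-calling format; the `search_logs` tool itself is a hypothetical example) has been safe in my testing:

```python
# A deliberately small tool schema in OpenAI function-calling format.
# The "search_logs" tool and its parameters are hypothetical examples.
search_logs_tool = {
    "type": "function",
    "function": {
        "name": "search_logs",
        "description": "Search server logs for lines matching a pattern.",
        "parameters": {
            "type": "object",
            "properties": {
                "pattern": {"type": "string", "description": "Regex to match."},
                "max_results": {"type": "integer", "description": "Cap on hits."},
            },
            "required": ["pattern"],
        },
    },
}

# Two properties keeps this well under the five-or-six mark where
# argument hallucination starts to show up.
assert len(search_logs_tool["function"]["parameters"]["properties"]) == 2
```

If a tool genuinely needs many inputs, splitting it into two or three narrower tools is usually more reliable than one wide schema.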

Best use cases with OpenClaw

  • Long-form Document Analysis — 262K context at $0.09/M makes reading large technical manuals or codebases very cheap.
  • High-Volume Log Processing — Scan thousands of lines of server logs to surface specific errors. The reasoning feature helps, and the cost stays low.
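For the log-processing case, a rough chunker that packs lines into window-sized batches might look like this. The 4-characters-per-token heuristic is a generic approximation, not a model-specific number:

```python
# Pack log lines into batches that fit a token budget, using the common
# ~4 characters-per-token heuristic as a rough estimate.
CHARS_PER_TOKEN = 4

def chunk_lines(lines, token_budget=250_000):
    """Yield lists of lines whose estimated token count fits the budget."""
    batch, batch_tokens = [], 0
    for line in lines:
        line_tokens = max(1, len(line) // CHARS_PER_TOKEN)
        if batch and batch_tokens + line_tokens > token_budget:
            yield batch
            batch, batch_tokens = [], 0
        batch.append(line)
        batch_tokens += line_tokens
    if batch:
        yield batch

# The 250K default leaves headroom under the 262K window for the
# prompt and the model's response.
```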

Not ideal for

  • Complex Multi-step Logic — The reasoning is optimized for speed over depth. It struggles with deep chains of thought.
  • Strict JSON Extraction — It occasionally adds conversational filler or misses closing braces even with explicit schema instructions.
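If you point it at extraction work anyway, a defensive parse that salvages the outermost braces before giving up is cheap insurance. This is a generic sketch, not OpenClaw functionality:

```python
import json

def extract_json(text):
    """Parse model output as JSON, salvaging the outermost {...} span
    if the reply is wrapped in conversational filler."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(text[start:end + 1])
        except json.JSONDecodeError:
            pass
    return None  # caller should retry with a stricter prompt

print(extract_json('Sure! Here you go: {"error_count": 3}'))  # → {'error_count': 3}
```

A `None` result (including the missing-closing-brace case) is your signal to retry rather than trust the output.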

Run it through Haimaker

Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:

Add Haimaker as a custom provider to my OpenClaw config. Use these details:

- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions

Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)

Create an alias "auto" for easy switching. Apply the config when done.

Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.

OpenClaw setup

Configure your OpenClaw provider to use the Haimaker API at api.haimaker.ai/v1 and set the model ID to xiaomi/mimo-v2-flash.

{
  "models": {
    "mode": "merge",
    "providers": {
      "xiaomi": {
        "baseUrl": "https://api.haimaker.ai/v1",
        "apiKey": "YOUR-HAIMAKER-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "mimo-v2-flash",
            "name": "MiMo V2 Flash",
            "cost": {
              "input": 0.09,
              "output": 0.29
            },
            "contextWindow": 262144,
            "maxTokens": 16384
          }
        ]
      }
    }
  }
}

How it compares

  • vs GPT-4o-mini — MiMo is cheaper on input ($0.09 vs $0.15) and has double the context window. GPT-4o-mini is more reliable for structured extraction.
  • vs Gemini 1.5 Flash — Gemini’s 1M context window is larger, but MiMo is cheaper per token if you’re staying within 262K.

Bottom line

The best price-per-context-token option currently available. Use it for read-heavy tasks where you need a large window and simple outputs, and be prepared for some tool call unreliability on complex schemas.

TRY MIMO V2 FLASH ON HAIMAKER


For setup instructions, see our API key guide. For all available models, see the complete models guide.