Current as of March 2026. Grok 3 Mini Fast is the speed-optimized version of Grok 3 Mini — $0.60/M input, $4/M output, 131K context. The trade-off versus the standard Mini is that output costs 8x more per token ($4 vs $0.50), which can catch you off guard if your agents are chatty.

Specs

Provider: xAI
Input cost: $0.60 / M tokens
Output cost: $4.00 / M tokens
Context window: 131K tokens
Max output: 131K tokens
Parameters: N/A
Features: function_calling, reasoning, web_search

What it’s good at

Speed

The “Fast” designation is accurate. Reasoning-heavy responses come back noticeably faster than standard Grok 3 or GPT-4o. For interactive agent loops where latency is the limiting factor, this matters.

Output window

131K max output on a mini model is unusual. Comparable small models often cap at 4K, which forces you to break large generation tasks into sequential calls.
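To make the chunking point concrete, here is a back-of-envelope sketch. The 4,096-token cap is a stand-in for a typical small model, and this counts only raw calls; real chunked generation also needs continuation prompts and some overlap between chunks.

```python
import math

def calls_needed(total_output_tokens: int, max_output_per_call: int) -> int:
    """Minimum number of sequential generation calls to emit a given amount
    of output, assuming each call can be filled to its output cap."""
    return math.ceil(total_output_tokens / max_output_per_call)

# Generating a 120K-token artifact (e.g. a large report or codebase dump):
small_model_calls = calls_needed(120_000, 4_096)    # typical 4K-capped mini
fast_calls = calls_needed(120_000, 131_072)         # Grok 3 Mini Fast
```

With a 4K cap the same job takes 30 sequential round trips; with a 131K output window it fits in one.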

Input price

$0.60/M input is competitive for a reasoning-capable model. The output side is where the cost escalates — keep that in mind before running verbose agents.

Where it falls short

Output cost

$4/M output is nearly seven times the input rate. If your agent generates long code blocks or detailed reports regularly, costs accumulate faster than the input price suggests. This model rewards concise outputs.
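A rough per-run cost sketch at the listed rates. The standard Mini's $0.50/M output rate is quoted above; its input rate is assumed here to match Fast's $0.60/M for the comparison.

```python
def run_cost(input_tokens: int, output_tokens: int,
             in_per_m: float = 0.60, out_per_m: float = 4.00) -> float:
    """Estimated dollar cost of one run at per-million-token rates.
    Defaults are Grok 3 Mini Fast's listed prices."""
    return input_tokens / 1e6 * in_per_m + output_tokens / 1e6 * out_per_m

# A chatty agent run: 20K tokens in, 100K tokens out.
fast_cost = run_cost(20_000, 100_000)                   # dominated by output
mini_cost = run_cost(20_000, 100_000, out_per_m=0.50)   # standard Mini output rate (input rate assumed equal)
```

At that shape, Fast costs about $0.41 per run against roughly $0.06 for the standard Mini, which is why output-heavy workloads flip the economics.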

Proprietary constraints

No self-hosting, no fine-tuning. If you need model-level customization for specific domains, you’re stuck.

Reasoning depth

It prioritizes speed over thoroughness. The standard Grok 3 or OpenAI’s o1 will catch edge cases this model glosses over. Don’t use it for anything where the reasoning chain needs to be airtight.

Best use cases with OpenClaw

  • High-frequency agent loops — Low latency and native function calling work well for OpenClaw agents that cycle through tools rapidly. The fast responses keep the loop tight.
  • Large context summarization — 131K context handles large codebases or log files that would exceed the limits of smaller models. Just watch the output length to keep costs down.
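One iteration of the high-frequency loop case can be sketched as a request body for the OpenAI-compatible chat-completions endpoint. No network call is made here; the `get_logs` tool is a hypothetical example, and the tight `max_tokens` reflects the pricing asymmetry above.

```python
# Build a tool-calling request body for https://api.x.ai/v1/chat/completions.
# `get_logs` is a made-up illustrative tool, not part of any real API.
payload = {
    "model": "grok-3-mini-fast",
    "messages": [
        {"role": "system", "content": "You are a fast triage agent."},
        {"role": "user", "content": "Summarize today's error logs."},
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_logs",
                "description": "Fetch recent error log lines.",
                "parameters": {
                    "type": "object",
                    "properties": {"since_hours": {"type": "integer"}},
                    "required": ["since_hours"],
                },
            },
        }
    ],
    # Cap the output: output tokens cost several times what input tokens do.
    "max_tokens": 2048,
}
```

Keeping `max_tokens` low in loops like this is the single easiest lever on cost with this model.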

Not ideal for

  • Output-heavy tasks on a tight budget — The $4/M output price is a real number. If your agent generates 100K output tokens per run, that’s $0.40 per run. It adds up.
  • Mission-critical logic — The model cuts corners to maintain speed. Mathematical or formal logic verification needs a slower, more careful model.

Run it through Haimaker

Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:

Add Haimaker as a custom provider to my OpenClaw config. Use these details:

- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions

Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)

Create an alias "auto" for easy switching. Apply the config when done.

Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
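If you'd rather paste config directly than prompt OpenClaw, the entry might look something like the following, mirroring the details above. This is a sketch: the `name` and `reasoning` fields are assumptions about the schema, modeled on the xAI config shown later on this page.

```json
{
  "models": {
    "mode": "merge",
    "providers": {
      "haimaker": {
        "baseUrl": "https://api.haimaker.ai/v1",
        "apiKey": "YOUR-HAIMAKER-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "auto",
            "name": "Haimaker Auto Router",
            "reasoning": false,
            "contextWindow": 128000,
            "maxTokens": 32000
          }
        ]
      }
    }
  }
}
```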

OpenClaw setup

Configure OpenClaw to use an OpenAI-compatible provider with the base URL overridden to https://api.x.ai/v1. You will need a valid xAI API key, and the model is referenced as xai/grok-3-mini-fast.

{
  "models": {
    "mode": "merge",
    "providers": {
      "xai": {
        "baseUrl": "https://api.x.ai/v1",
        "apiKey": "YOUR-XAI-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "grok-3-mini-fast",
            "name": "Grok 3 Mini Fast",
            "cost": {
              "input": 0.6,
              "output": 4
            },
            "contextWindow": 131072,
            "maxTokens": 131072
          }
        ]
      }
    }
  }
}
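Before applying the config, it can be worth a quick sanity check that the JSON parses and the fields OpenClaw needs are present and consistent. A minimal sketch:

```python
import json

# Parse the config block above and verify the key fields.
config = json.loads("""
{
  "models": {
    "mode": "merge",
    "providers": {
      "xai": {
        "baseUrl": "https://api.x.ai/v1",
        "apiKey": "YOUR-XAI-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "grok-3-mini-fast",
            "name": "Grok 3 Mini Fast",
            "cost": {"input": 0.6, "output": 4},
            "contextWindow": 131072,
            "maxTokens": 131072
          }
        ]
      }
    }
  }
}
""")

xai = config["models"]["providers"]["xai"]
assert xai["baseUrl"] == "https://api.x.ai/v1"
assert xai["models"][0]["id"] == "grok-3-mini-fast"
assert xai["models"][0]["contextWindow"] == 131072
```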

How it compares

  • vs GPT-4o-mini — GPT-4o-mini is cheaper at $0.15/M input, but Grok 3 Mini Fast has superior reasoning capabilities and a much larger output buffer.
  • vs Claude 3 Haiku — Haiku is more stable for strict JSON formatting, but Grok’s 131K output limit makes it better for generating long-form content.
  • vs Gemini 1.5 Flash — Gemini offers a larger 1M context window, but Grok 3 Mini Fast typically feels more responsive in interactive agentic loops.

Bottom line

Pick this over the standard Grok 3 Mini when latency is your bottleneck and your agents don’t generate large outputs. If they do, the standard Mini’s $0.50/M output cost is much more forgiving.

TRY GROK 3 MINI FAST ON HAIMAKER


For setup instructions, see our API key guide. For all available models, see the complete models guide.