Current as of March 2026. Grok 4.1 Fast has a 2 million token context window at $0.20/M input and $0.50/M output. Those numbers are hard to grasp in the abstract: 2M tokens is enough to feed it the entire Linux kernel source and still have room left over. The trade-off is that reasoning quality drops at that scale and the prose it produces is unremarkable.
## Specs

| Spec | Value |
| --- | --- |
| Provider | xAI |
| Input cost | $0.20 / M tokens |
| Output cost | $0.50 / M tokens |
| Context window | 2M tokens |
| Max output | 2M tokens |
| Parameters | N/A |
| Features | function_calling, vision, reasoning, web_search |
## What it’s good at

### Context window
2M tokens is in a different category from every other model listed here. For use cases where you genuinely need to ingest entire repositories or large document collections without RAG, this is currently one of very few options that can do it cheaply.
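A rough way to sanity-check whether a corpus fits before sending it: estimate tokens with the common ~4-characters-per-token heuristic (an approximation, not xAI's actual tokenizer) and compare against the 2M window, leaving headroom for the prompt and the response. The function names and the headroom figure here are illustrative, not part of any API:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text and code."""
    return len(text) // 4

def fits_in_context(docs: list[str], window: int = 2_000_000,
                    headroom: int = 50_000) -> bool:
    """True if the combined docs plus prompt/output headroom fit in the window."""
    return sum(estimate_tokens(d) for d in docs) + headroom <= window

docs = ["x" * 4_000_000, "y" * 2_000_000]  # ~1M + ~0.5M estimated tokens
print(fits_in_context(docs))  # True: roughly 1.55M of the 2M budget used
```

If a corpus fails this check, you are back in chunking territory, which defeats the point of picking this model, so it is worth running before committing to a single-prompt design.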
### Price
$0.20/M input and $0.50/M output is extremely cheap for the capability level. High-frequency automation tasks that would be expensive on GPT-4o or Claude become practical at this price.
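To make that concrete, a back-of-the-envelope cost helper using the rates above (the rates come from this page; the example workload is hypothetical):

```python
def job_cost_usd(input_tokens: int, output_tokens: int,
                 input_rate: float = 0.20, output_rate: float = 0.50) -> float:
    """Estimate job cost in USD from per-million-token rates."""
    return (input_tokens / 1_000_000) * input_rate + (output_tokens / 1_000_000) * output_rate

# Summarizing a 1.5M-token corpus into a 10k-token summary:
print(f"${job_cost_usd(1_500_000, 10_000):.3f}")  # $0.305
```

A near-full-context run costing about thirty cents is what makes the bulk workloads described below economical.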
## Where it falls short

### Reasoning consistency
At 2M tokens, the model loses track of things. It hallucinates details from deep in the context and can fail multi-step logical chains that a smaller, more focused model handles cleanly. If you’re using the full context window, expect to verify outputs.
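One cheap verification pattern (a sketch of a common grounding technique, not an xAI feature): prompt the model to attach a verbatim quote from the source to each extracted claim, then check programmatically that each quote really appears in the source. Claims whose quotes don't match are likely hallucinations. The claim/quote dict shape is something you would request in your own prompt:

```python
def unsupported_claims(claims: list[dict], source: str) -> list[dict]:
    """Return claims whose 'quote' field does not appear verbatim in the source.

    Each claim is expected as {"claim": ..., "quote": ...}, a shape you ask
    the model to produce so its answers can be spot-checked mechanically.
    """
    return [c for c in claims if c.get("quote", "") not in source]

source = "The deploy failed at 03:12 UTC after the cache filled up."
claims = [
    {"claim": "Deploy failed overnight", "quote": "The deploy failed at 03:12 UTC"},
    {"claim": "Disk was full", "quote": "the disk reached 100% capacity"},  # not in source
]
print([c["claim"] for c in unsupported_claims(claims, source)])  # ['Disk was full']
```

This catches fabricated quotes, not fabricated interpretations, but it is a cheap first filter when you can't review every output by hand.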
### Prose quality
The writing is functional and flat. It lacks the stylistic range of the Claude family. For anything that needs to sound good — documentation, emails, explanations to end users — this is the wrong choice.
## Best use cases with OpenClaw
- Large-scale document summarization — Cheap enough to summarize thousands of pages in bulk. The 2M window means you rarely need to split documents.
- High-frequency agent tasks — $0.50/M output makes repetitive agent tasks like data cleaning, log monitoring, or structured extraction economical at volume.
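A minimal bulk-summarization loop along these lines might use the OpenAI Python SDK pointed at xAI's OpenAI-compatible endpoint (the base URL and model ID match the setup section on this page; the greedy packing heuristic and function names are illustrative):

```python
def pack(docs: list[str], budget_tokens: int = 1_800_000) -> list[list[str]]:
    """Greedily group docs into batches that fit a token budget (~4 chars/token)."""
    batches, current, used = [], [], 0
    for doc in docs:
        cost = len(doc) // 4
        if current and used + cost > budget_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        batches.append(current)
    return batches

def summarize_all(docs: list[str], api_key: str) -> list[str]:
    """Summarize each batch in one request; needs the `openai` package."""
    from openai import OpenAI  # imported lazily so pack() works standalone
    client = OpenAI(base_url="https://api.x.ai/v1", api_key=api_key)
    summaries = []
    for batch in pack(docs):
        resp = client.chat.completions.create(
            model="grok-4-1-fast",
            messages=[{"role": "user",
                       "content": "Summarize each document:\n\n" + "\n---\n".join(batch)}],
        )
        summaries.append(resp.choices[0].message.content)
    return summaries
```

Because the window is so large, `pack` usually returns a single batch; it only matters when a corpus exceeds the budget, at which point the loop degrades gracefully into a handful of requests instead of hundreds.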
## Not ideal for
- Complex reasoning tasks — Multi-step logic, architectural decisions, or anything requiring careful chain-of-thought. The model shortcuts too much at this context scale.
- Creative or voice-specific writing — Instruction following for specific tones or styles is weaker than Anthropic’s offerings. Don’t use it to write anything that needs personality.
## Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
```
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
```
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
## OpenClaw setup

Use the OpenAI-compatible provider setting and point the base URL to `api.x.ai/v1`. Set the model ID explicitly to `xai/grok-4-1-fast`, and make sure your API key has sufficient credits; xAI uses prepaid billing.
```json
{
  "models": {
    "mode": "merge",
    "providers": {
      "xai": {
        "baseUrl": "https://api.x.ai/v1",
        "apiKey": "YOUR-XAI-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "grok-4-1-fast",
            "name": "Grok 4.1 Fast",
            "cost": {
              "input": 0.2,
              "output": 0.5
            },
            "contextWindow": 2000000,
            "maxTokens": 2000000
          }
        ]
      }
    }
  }
}
```
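Once the key is configured, a quick smoke test outside OpenClaw confirms the key and model ID work. This sketch assumes the standard OpenAI-compatible `/chat/completions` route on `api.x.ai/v1`; only the Python standard library is used:

```python
import json
import urllib.request

def build_payload(prompt: str, model: str = "grok-4-1-fast") -> dict:
    """Build a minimal OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def chat_once(api_key: str, prompt: str) -> str:
    """Send one chat completion to xAI's OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        "https://api.x.ai/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

If `chat_once(key, "Say hello")` returns text, the same credentials will work from OpenClaw; a 401 or 403 here usually means missing credits or a bad key.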
## How it compares
- vs GPT-4o-mini — Grok 4.1 Fast provides a 2M context window compared to GPT-4o-mini’s 128k, though the latter is often more reliable for short, instruction-heavy tasks.
- vs Claude 3.5 Haiku — Haiku has better coding logic, but Grok 4.1 Fast is cheaper for input and offers vastly more context for processing large files.
## Bottom line
If you need to process massive amounts of text cheaply and speed matters more than reasoning depth, this is the model for it. Use it for bulk extraction and summarization where you verify the outputs — not for anything where correctness is assumed.
For setup instructions, see our API key guide. For all available models, see the complete models guide.