Current as of March 2026. Grok 4 doubles Grok 3’s context to 256K — both input and output. Same $3/$15 price point. It’s essentially xAI’s answer to the question of what happens when you can fit an entire legacy codebase into one prompt and get back a complete rewrite.
Specs
| Spec | Value |
| --- | --- |
| Provider | xAI |
| Input cost | $3.00 / M tokens |
| Output cost | $15.00 / M tokens |
| Context window | 256K tokens |
| Max output | 256K tokens |
| Parameters | N/A |
| Features | function_calling, web_search |
What it’s good at
Token limits
256K on both sides is the story here. Most frontier models either have a large input window or a generous output limit — not both. GPT-4o caps output at 4K. Being able to send and receive 256K tokens in a single call changes what’s possible for code generation and refactoring tasks.
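At the listed rates, even a maxed-out call stays cheap. A quick back-of-envelope, assuming "256K" means 256,000 tokens (the figure the config below uses):

```python
# Cost of one call that fills both the 256K input and the 256K output
# window, at the listed $3/M input and $15/M output rates.
INPUT_RATE = 3.00 / 1_000_000    # dollars per input token
OUTPUT_RATE = 15.00 / 1_000_000  # dollars per output token

tokens = 256_000
cost = tokens * INPUT_RATE + tokens * OUTPUT_RATE
print(f"${cost:.2f}")  # prints "$4.61"
```

So a full rewrite of a 256K-token module, returned in one shot, runs under five dollars before any retries.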
Input pricing
$3/M input is cheaper than GPT-4o’s $5/M. For RAG pipelines or any workflow that repeatedly ingests large documents, that difference accumulates quickly.
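The per-document savings compound at scale. A sketch of the math for a hypothetical pipeline (document count and size are illustrative, not from the source):

```python
# Input-cost comparison for a hypothetical RAG pipeline ingesting
# 10,000 documents of ~20K tokens each, at the rates quoted above:
# Grok 4 at $3/M input vs GPT-4o at $5/M input.
docs = 10_000
tokens_per_doc = 20_000
total_tokens = docs * tokens_per_doc  # 200M input tokens

grok_cost = total_tokens / 1_000_000 * 3.00
gpt4o_cost = total_tokens / 1_000_000 * 5.00
print(grok_cost, gpt4o_cost, gpt4o_cost - grok_cost)  # 600.0 1000.0 400.0
```

A $400 difference on a single 200M-token ingestion run is the kind of gap that shows up on a monthly invoice.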
Where it falls short
Reasoning drift
Long context retrieval is not Grok 4’s strength. If you need the model to precisely locate and reason about something buried in a 200K token document, Claude 3.5 Sonnet is more reliable. Grok 4 can miss things that are semantically distant from the end of the prompt.
Instruction adherence
It occasionally ignores negative constraints in system prompts — “do not do X” type instructions. You often need to rephrase constraints positively or repeat them to make them stick, which is more work than it should be at this price point.
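One workaround is to state the constraint as an allowed behavior and repeat it near the end of the system prompt. A hypothetical before/after (the wording is illustrative, not a tested prompt):

```python
# Hypothetical system prompts illustrating the workaround: replace the
# negative constraint ("do not") with a positive statement of the allowed
# behavior, then repeat it at the end of the prompt.
negative = "You are a support bot. Do not mention competitor products."

positive = (
    "You are a support bot. Discuss only our own products. "
    "If asked about a competitor, redirect to the closest equivalent "
    "in our catalog.\n"
    # Repeating the constraint at the end helps it stick.
    "Remember: discuss only our own products."
)
```

The repetition costs a few extra tokens per request, which is worth weighing against the failure mode it prevents.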
Best use cases with OpenClaw
- Large-scale code refactoring — The 256K output buffer means you can pipe in an entire legacy module and get a fully rewritten version back in one shot. This is genuinely useful for the right migration tasks.
- High-volume data summarization — $3/M input makes processing thousands of customer support logs or documents economical at scale.
Not ideal for
- Zero-latency chatbots — Time-to-first-token can lag even with a light context, compared to models tuned for fast interactive responses.
- Formal logic verification — The chain-of-thought reasoning isn’t as stable as models specifically tuned for mathematical work. Don’t use it for proofs or constraint solving.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
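For reference, here is the shape of the request that flows through Haimaker's OpenAI-compatible endpoint, built from the details above. The `/chat/completions` path is assumed from the `openai-completions` API type; it is a sketch, not Haimaker's documented wire format:

```python
import json

# Request payload as it would be sent through Haimaker's
# OpenAI-compatible endpoint, using the provider details above.
BASE_URL = "https://api.haimaker.ai/v1"
url = f"{BASE_URL}/chat/completions"  # assumed path for openai-completions

payload = {
    "model": "haimaker/auto",  # auto-router picks the model per task
    "max_tokens": 32_000,      # matches the config entry above
    "messages": [
        {"role": "user", "content": "Summarize this changelog."},
    ],
}
body = json.dumps(payload)
```

Authentication is a standard `Authorization: Bearer <key>` header with your Haimaker key.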
OpenClaw setup
Configure OpenClaw to use the OpenAI provider but override the base URL to https://api.x.ai/v1. Set the max_tokens parameter to 256000 to take full advantage of the output window.
{
"models": {
"mode": "merge",
"providers": {
"xai": {
"baseUrl": "https://api.x.ai/v1",
"apiKey": "YOUR-XAI-API-KEY",
"api": "openai-completions",
"models": [
{
"id": "grok-4",
"name": "Grok 4",
"cost": {
"input": 3,
"output": 15
},
"contextWindow": 256000,
"maxTokens": 256000
}
]
}
}
}
}
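Outside of OpenClaw, the same endpoint can be hit directly with nothing but the standard library. A minimal sketch, assuming the usual `/chat/completions` path for an OpenAI-compatible API; the request is built but not sent, since sending needs a real key:

```python
import json
import urllib.request

# Build (but do not send) a direct request against the xAI endpoint
# configured above. Uncomment the urlopen call with a real key to run it.
payload = {
    "model": "grok-4",
    "max_tokens": 256_000,  # full output window, per the config above
    "messages": [
        {"role": "user", "content": "Refactor this legacy module: ..."},
    ],
}
req = urllib.request.Request(
    "https://api.x.ai/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR-XAI-API-KEY",
        "Content-Type": "application/json",
    },
)
# resp = urllib.request.urlopen(req)  # requires a valid API key
```

In production you would stream the response rather than block on a potentially 256K-token body.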
How it compares
- vs GPT-4o — Grok 4 is cheaper for inputs ($3 vs $5 per 1M) and offers a 256K output limit compared to GPT-4o’s 4K limit.
- vs Claude 3.5 Sonnet — Sonnet has superior coding logic, but Grok 4 provides a much larger context window (256K vs 200K) and integrated web search.
Bottom line
Grok 4 is the right choice when token volume is your primary constraint — specifically when you need both a large input and a large output in the same call. For precision work or strict instruction following, something else will serve you better.
For setup instructions, see our API key guide. For all available models, see the complete models guide.