Current as of March 2026. Grok 3 is xAI's flagship text model: $3/M input, $15/M output, and a 131K token limit on both context and output. That 131K output cap is the headline feature; GPT-4o caps at 4K output, which becomes genuinely painful when you're generating substantial amounts of code or documentation.

Specs

Provider: xAI
Input cost: $3.00 / M tokens
Output cost: $15.00 / M tokens
Context window: 131K tokens
Max output: 131K tokens
Parameters: N/A
Features: function_calling, web_search

What it’s good at

Output capacity

131K max output tokens is unusual at this tier. If you've ever watched a model stop mid-function because GPT-4o's 4K output limit ran out, Grok 3 removes that ceiling. Entire module rewrites, long reports, multi-file diffs: it can actually finish them.
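
With OpenAI-compatible chat completions (the same API shape the setup section below uses), long outputs are opt-in: you have to raise max_tokens explicitly. A minimal sketch of such a request body, with an illustrative prompt:

```python
# Sketch of a chat-completions request body asking Grok 3 for a long output.
# Assumes the OpenAI-compatible API shape; the prompt text is illustrative.
payload = {
    "model": "grok-3",
    "max_tokens": 100_000,  # well past a 4K-output ceiling
    "messages": [
        {"role": "user", "content": "Rewrite every module in this repo: ..."},
    ],
}
```

If you leave max_tokens unset, many OpenAI-compatible servers apply a much lower default, so the large output window does nothing for you.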

Speed

Grok 3 generates tokens faster than Claude 3.5 Sonnet. For interactive agent workflows where latency matters, that difference is noticeable.

Where it falls short

Persona bleed

Grok has a baked-in personality that’s harder to suppress than you’d like. System prompts telling it to be neutral and professional help, but informal or sarcastic phrasing still bleeds through in ways that would be unacceptable in customer-facing contexts.
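
In practice "prompting around it" means pinning the tone hard in the system message. A sketch of one such setup; the wording is illustrative and, as noted above, not a guaranteed fix:

```python
# One mitigation for persona bleed: pin tone in the system message.
# This reduces, but does not eliminate, informal phrasing.
SYSTEM_PROMPT = (
    "You are a neutral, professional assistant. Do not use humor, sarcasm, "
    "slang, or first-person personality. Answer plainly and concisely."
)

messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Summarize the outage report."},
]
```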

API reliability

The xAI infrastructure is not as stable as Azure or AWS. Rate limit errors and occasional downtime during high-traffic periods are real issues. If you run this in production, build retry logic.
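
That retry logic can be as simple as exponential backoff around the API call. A minimal sketch, demonstrated here with a stub that fails twice before succeeding (in real use you would catch only rate-limit and 5xx errors, not every exception):

```python
import time

def with_retries(fn, max_attempts=4, base_delay=1.0):
    """Retry fn with exponential backoff on transient errors."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:  # narrow this to rate-limit / server errors in production
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

# Stand-in for an API call that hits rate limits twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated 429")
    return "ok"

result = with_retries(flaky, base_delay=0.01)  # returns "ok" on the third attempt
```

Capping attempts and backing off exponentially keeps you from hammering an already-struggling endpoint during the high-traffic periods mentioned above.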

Best use cases with OpenClaw

  • Large-scale code generation — The 131K output window means you can prompt for a substantial multi-file structure without the model cutting off mid-function. That alone justifies trying it for the right workloads.
  • Input-heavy agent loops — At $3/M input tokens, it’s cheaper than GPT-4o for agents that repeatedly ingest large context payloads.
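
To see what "input-heavy" means in dollars, here is a back-of-envelope cost calculation at Grok 3's listed rates. The loop sizes are illustrative assumptions, not benchmarks:

```python
# Grok 3's listed prices, in dollars per million tokens.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00

def run_cost(input_tokens, output_tokens):
    """Total dollar cost for a batch of input and output tokens."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# Hypothetical agent: re-ingests a 90K-token context 50 times,
# emitting 2K tokens per turn.
total = run_cost(input_tokens=90_000 * 50, output_tokens=2_000 * 50)
# 4.5M input tokens at $3/M ($13.50) + 100K output tokens at $15/M ($1.50) = $15.00
```

Note that in loops like this the input side dominates the bill, which is why the per-million input price matters more than the output price for agent workloads.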

Not ideal for

  • Strict corporate chatbots — The personality issue is a liability for customer-facing roles. You can prompt around it but you can’t fully eliminate it.
  • Zero-failure production systems — Until xAI’s API stability catches up to the established providers, it’s a secondary choice for anything mission-critical.

Run it through Haimaker

Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:

Add Haimaker as a custom provider to my OpenClaw config. Use these details:

- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions

Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)

Create an alias "auto" for easy switching. Apply the config when done.

Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.

OpenClaw setup

Use the OpenAI-compatible provider in OpenClaw and point the base URL to https://api.x.ai/v1. You will need to manually set the model ID to xai/grok-3 in your configuration file.

{
  "models": {
    "mode": "merge",
    "providers": {
      "xai": {
        "baseUrl": "https://api.x.ai/v1",
        "apiKey": "YOUR-XAI-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "grok-3",
            "name": "Grok 3",
            "cost": {
              "input": 3,
              "output": 15
            },
            "contextWindow": 131072,
            "maxTokens": 131072
          }
        ]
      }
    }
  }
}
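
Before restarting OpenClaw, it's worth sanity-checking that the block above is valid JSON and that the model entry matches what you intended. A quick sketch using the standard library:

```python
import json

# The provider block from the configuration above, verbatim.
config_text = """
{
  "models": {
    "mode": "merge",
    "providers": {
      "xai": {
        "baseUrl": "https://api.x.ai/v1",
        "apiKey": "YOUR-XAI-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "grok-3",
            "name": "Grok 3",
            "cost": {
              "input": 3,
              "output": 15
            },
            "contextWindow": 131072,
            "maxTokens": 131072
          }
        ]
      }
    }
  }
}
"""

config = json.loads(config_text)  # raises ValueError if the JSON is malformed
model = config["models"]["providers"]["xai"]["models"][0]
```

A stray comma or missing brace is the most common reason an OpenClaw custom provider silently fails to load, and json.loads catches both.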

How it compares

  • vs Claude 3.5 Sonnet — Sonnet has superior instruction following and a more professional tone, but Grok 3 wins on output length and raw speed.
  • vs GPT-4o — GPT-4o offers better tool-calling reliability and a more stable API, while Grok 3 is cheaper for input-heavy workloads at $3 per million tokens.

Bottom line

If you need to generate long outputs and can stomach some API instability, Grok 3 is worth it. The 131K output limit is a real differentiator. Just don’t use it anywhere the persona bleed would be a problem.

TRY GROK 3 ON HAIMAKER


For setup instructions, see our API key guide. For all available models, see the complete models guide.