Current as of March 2026. Grok 2 is xAI’s previous flagship — $2/M input, $10/M output, 131K context. The OpenAI-compatible API means dropping it into OpenClaw takes about two minutes. The question is whether the lower price justifies the rougher edges compared to GPT-4o or Sonnet.
Specs
| Spec | Value |
| --- | --- |
| Provider | xAI |
| Input cost | $2.00 / M tokens |
| Output cost | $10.00 / M tokens |
| Context window | 131K tokens |
| Max output | 131K tokens |
| Parameters | N/A |
| Features | function_calling, web_search |
What it’s good at
Pricing
$2/M input is 60% cheaper than GPT-4o. For workloads where you’re feeding large amounts of context repeatedly — RAG pipelines, document analysis, summarization loops — that gap adds up fast.
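A quick back-of-envelope calculation shows how that gap compounds. The numbers below are illustrative assumptions (10K requests/day at ~20K input tokens each), using the $2/M figure from the table and GPT-4o input priced at $5/M:

```python
# Illustrative input-cost comparison for a context-heavy workload.
# Volumes are hypothetical; prices are $ per million input tokens.
GROK2_INPUT = 2.00
GPT4O_INPUT = 5.00

requests_per_day = 10_000
tokens_per_request = 20_000
daily_tokens = requests_per_day * tokens_per_request  # 200M tokens/day

grok_cost = daily_tokens / 1_000_000 * GROK2_INPUT    # $400/day
gpt4o_cost = daily_tokens / 1_000_000 * GPT4O_INPUT   # $1,000/day
print(f"Grok 2: ${grok_cost:,.0f}/day  GPT-4o: ${gpt4o_cost:,.0f}/day")
```

At that volume the difference is $600/day — roughly $18K/month — before output tokens are counted.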
Context window
131K tokens on both input and output is genuinely useful. You can load a substantial codebase or a long document thread without hitting truncation in the middle of something important.
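If you want a cheap pre-flight check before stuffing a document into the window, the common ~4-characters-per-token heuristic is good enough for budgeting. This is a rough sketch, not Grok’s actual tokenizer, and `fits_in_context` is a hypothetical helper:

```python
# Rough check that a document fits Grok 2's 131,072-token window,
# reserving headroom for the system prompt and the response.
# Uses the ~4 chars/token heuristic -- an approximation only.
CONTEXT_WINDOW = 131_072

def fits_in_context(text: str, reserved_tokens: int = 8_000) -> bool:
    estimated_tokens = len(text) // 4
    return estimated_tokens + reserved_tokens <= CONTEXT_WINDOW

print(fits_in_context("x" * 400_000))  # ~100K estimated tokens -> True
```

For anything close to the limit, count tokens properly via the API rather than trusting the heuristic.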
Real-time data access
The native web search integration is one area where Grok consistently has an edge. If your agents need current information — news, prices, recent releases — this is already wired in rather than bolted on.
Where it falls short
API reliability
The xAI infrastructure is not in the same league as AWS or Azure for uptime and latency consistency. Expect occasional connection resets and rate-limit errors during high-traffic periods, and build retry logic with backoff into anything that matters.
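A minimal retry wrapper with exponential backoff and jitter looks like this. It is a generic sketch — in practice you would also retry on HTTP 429/5xx responses from the xAI endpoint, not just raw connection errors:

```python
import random
import time

def with_retries(call, max_attempts=4, base_delay=0.5,
                 retryable=(ConnectionError,)):
    """Retry `call` with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            # 0.5s, 1s, 2s, ... plus up to 100ms of jitter
            delay = base_delay * 2 ** attempt + random.uniform(0, 0.1)
            time.sleep(delay)
```

Wrap your completion call in it, e.g. `with_retries(lambda: client.chat.completions.create(...))`.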
Instruction following
It tends toward verbosity and can drift on strict JSON output formatting. GPT-4o is noticeably more precise on structured outputs, which matters when your tool-calling schema has tight constraints.
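Given that drift, it pays to parse model output defensively rather than trusting it to be clean JSON. The helper below is a hypothetical sketch, not part of OpenClaw: it strips a markdown fence if the model wrapped its answer in one, then validates the keys your pipeline expects:

```python
import json

def parse_model_json(raw: str, required_keys: set) -> dict:
    """Defensively parse JSON from a model response.

    Handles the common failure mode where the model wraps its JSON
    in a ```json fence, then checks that expected keys are present.
    """
    text = raw.strip()
    if text.startswith("```"):
        text = text.split("\n", 1)[1]      # drop opening fence line
        text = text.rsplit("```", 1)[0]    # drop closing fence
    data = json.loads(text)
    missing = required_keys - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data
```

On `ValueError` or `json.JSONDecodeError`, re-prompt with the error message rather than failing the whole pipeline.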
Best use cases with OpenClaw
- Bulk content analysis — The $2/M input cost makes it practical to feed large datasets through OpenClaw agents for summarization or classification without budget anxiety.
- Current events research — Agents tracking news, market data, or recent documentation benefit from the built-in web search rather than a separate retrieval step.
Not ideal for
- Critical production systems — xAI is newer than the established cloud providers. If your uptime SLA is tight, this probably isn’t your primary model.
- Complex tool pipelines — Function calling works, but it breaks down on deeply nested parameters more often than GPT-4o does.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
OpenClaw setup
Configure the provider with the OpenAI-compatible base URL https://api.x.ai/v1. The example below inlines the API key for clarity; in practice, load it from an environment variable rather than committing it to config.
```json
{
  "models": {
    "mode": "merge",
    "providers": {
      "xai": {
        "baseUrl": "https://api.x.ai/v1",
        "apiKey": "YOUR-XAI-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "grok-2",
            "name": "Grok 2",
            "cost": {
              "input": 2,
              "output": 10
            },
            "contextWindow": 131072,
            "maxTokens": 131072
          }
        ]
      }
    }
  }
}
```
How it compares
- vs GPT-4o — Grok 2 is cheaper ($2 vs $5 per million input tokens) but GPT-4o has better native tool-calling stability.
- vs Claude 3.5 Sonnet — Sonnet is superior for coding tasks, but Grok 2 offers a larger output limit of 131K tokens versus Sonnet’s 8K cap.
Bottom line
Grok 2 makes sense for input-heavy, high-volume workloads where the cost savings outweigh occasional API instability. It’s not a replacement for GPT-4o in precise tool-calling scenarios, but for bulk processing it’s hard to argue with the price.
For setup instructions, see our API key guide. For all available models, see the complete models guide.