Current as of March 2026. Grok 4.20 Beta is xAI’s play for the high-context market, offering a massive 2M token window at a fraction of the cost of flagship models. It is built for developers who need to ingest entire repositories without breaking the bank.
Specs

| Spec | Value |
| --- | --- |
| Provider | xAI |
| Input cost | $2.00 / M tokens |
| Output cost | $6.00 / M tokens |
| Context window | 2M tokens |
| Max output | N/A |
| Parameters | N/A |
| Features | function_calling, vision, reasoning, web_search |
What it’s good at
Massive Context Window
The 2M token capacity allows for processing massive datasets or entire codebases in a single prompt.
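Before dumping a repository into one prompt, it helps to sanity-check that it actually fits. A minimal sketch, assuming the common rough heuristic of ~4 characters per token (real tokenizers vary, so treat the result as an estimate, not a guarantee):

```python
CONTEXT_WINDOW = 2_000_000  # Grok 4.20 Beta's advertised window
CHARS_PER_TOKEN = 4         # rough heuristic; actual tokenization varies

def estimate_tokens(text: str) -> int:
    """Rough token estimate using the ~4 chars/token rule of thumb."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_window(texts: list[str], budget: int = CONTEXT_WINDOW) -> bool:
    """Check whether a set of files is likely to fit in one prompt."""
    return sum(estimate_tokens(t) for t in texts) <= budget

# Example: three "files" totalling 12,000 characters ≈ 3,000 tokens
files = ["x" * 4000, "y" * 4000, "z" * 4000]
print(fits_in_window(files))  # True: well under 2M tokens
```

Leaving 10–20% headroom below the budget for instructions and the model's output is a sensible default.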
Aggressive Pricing
At $2 per million input tokens, it competes with mini models while offering significantly higher limits.
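The arithmetic at these rates is worth making concrete. A quick cost calculator using the listed prices ($2 / M input, $6 / M output):

```python
INPUT_RATE = 2.00 / 1_000_000   # dollars per input token
OUTPUT_RATE = 6.00 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at Grok 4.20 Beta's listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A near-full-window audit: 1.5M tokens in, 20k tokens out
print(round(request_cost(1_500_000, 20_000), 2))  # → 3.12
```

So a whole-codebase pass that would blow past the limits of most models lands at a few dollars per request here.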
Where it falls short
Beta Instability
Expect occasional inconsistencies in reasoning and output formatting during this beta phase.
Proprietary Black Box
Zero transparency regarding architecture or training data makes it difficult to predict edge-case failures.
Best use cases with OpenClaw
- Large-scale codebase auditing — You can dump 1.5M tokens of source code into a single request for global analysis.
- Bulk data extraction — The $6 per million output cost makes high-volume transformation tasks economically viable.
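For bulk extraction, the usual pattern is to pack many records into each request to amortize overhead. A minimal sketch of greedy batching under a token budget; the function name and the ~4 chars/token heuristic are illustrative assumptions, not OpenClaw APIs:

```python
def batch_records(records: list[str], budget_tokens: int = 1_500_000,
                  chars_per_token: int = 4) -> list[list[str]]:
    """Greedily pack records into batches that stay under a token budget,
    leaving headroom below the 2M window for instructions and output."""
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for rec in records:
        tokens = len(rec) // chars_per_token or 1
        if current and used + tokens > budget_tokens:
            batches.append(current)
            current, used = [], 0
        current.append(rec)
        used += tokens
    if current:
        batches.append(current)
    return batches

# 10 records of ~100 tokens each, with a tiny 250-token budget for demo
batches = batch_records(["r" * 400] * 10, budget_tokens=250)
print(len(batches))  # → 5 batches of 2 records each
```

With the default 1.5M-token budget, each batch becomes one request whose cost you can estimate from the per-token rates above.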
Not ideal for
- Production-critical logic — The Beta tag implies reliability can fluctuate during rapid update cycles.
- Latency-sensitive UI features — Reasoning overhead on large contexts can lead to unpredictable time-to-first-token.
OpenClaw setup
Use the OpenAI-compatible provider setting with the base URL api.x.ai/v1 and your xAI API key. Set a high timeout value to accommodate the processing time that 2M token requests can require.
```json
{
  "models": {
    "mode": "merge",
    "providers": {
      "xai": {
        "baseUrl": "https://api.x.ai/v1",
        "apiKey": "YOUR-XAI-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "grok-4.20-beta",
            "name": "Grok 4.20 Beta",
            "cost": {
              "input": 2,
              "output": 6
            },
            "contextWindow": 2000000,
            "maxTokens": null
          }
        ]
      }
    }
  }
}
```
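If you want to verify the endpoint outside OpenClaw, a minimal sketch of the request it sends, using only the Python standard library. The payload shape follows the OpenAI chat-completions convention that api.x.ai/v1 exposes; the function name is illustrative, and actually sending the request is left to the caller:

```python
import json
import urllib.request

def build_chat_request(api_key: str, prompt: str,
                       base_url: str = "https://api.x.ai/v1",
                       model: str = "grok-4.20-beta") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat-completions request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("YOUR-XAI-API-KEY", "Summarize this repo.")
print(req.full_url)  # https://api.x.ai/v1/chat/completions
# To send: urllib.request.urlopen(req, timeout=600) — a long timeout
# matters for large-context requests, as the setup notes above suggest.
```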
How it compares
- vs GPT-4o-mini — Grok 4.20 Beta offers roughly 15 times the context window (2M tokens versus 128k) at a comparable price point.
- vs Claude 3.5 Sonnet — Sonnet has superior reasoning but costs $3 per million input and is limited to a 200k context window.
Bottom line
It is a context-first model that is hard to beat for bulk processing if you can tolerate the occasional beta-related quirk.
For setup instructions, see our API key guide. For all available models, see the complete models guide.