Current as of March 2026. GPT-4o is OpenAI’s multimodal workhorse — not the smartest model they make, but probably the most consistent one. The 128K context with 16K output is a real constraint to plan around, but function calling reliability makes it a solid default for OpenClaw agents.

Specs

Provider         OpenAI
Input cost       $2.50 / M tokens
Output cost      $10 / M tokens
Context window   128K tokens
Max output       16K tokens
Parameters       N/A
Features         function_calling, vision

What it’s good at

Reliable Function Calling

GPT-4o follows JSON schemas more consistently than almost anything else at this price point. In OpenClaw’s tool-based loops, that consistency matters — broken tool calls cascade into broken agents.
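To make this concrete, here is a sketch of a tool definition in the Chat Completions `tools` format, the schema GPT-4o is asked to follow. The tool name and parameters are hypothetical examples, not part of any real API:

```python
import json

# Hypothetical tool definition in the Chat Completions "tools" format.
tool = {
    "type": "function",
    "function": {
        "name": "get_ticket_status",
        "description": "Look up the status of a support ticket by ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "ticket_id": {
                    "type": "string",
                    "description": "Ticket identifier",
                },
            },
            "required": ["ticket_id"],
        },
    },
}

# The model returns tool arguments as a JSON string; consistent
# schema-following means this parse step rarely fails.
args = json.loads('{"ticket_id": "T-1234"}')  # example model output
print(args["ticket_id"])
```

The parse step is where inconsistency hurts: a model that occasionally emits unquoted keys or trailing commas turns `json.loads` into a failure point deep inside an agent loop.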

Vision Without a Separate Model

Vision is baked into the base model, not bolted on. For reading UI screenshots or parsing diagrams mid-workflow, this removes a lot of orchestration complexity.
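In practice that means text and images share one message. A sketch of a multimodal user message in the Chat Completions content-parts format (the screenshot URL is a placeholder):

```python
# One message carrying both text and an image, sent to the same gpt-4o
# model -- no separate vision model or extra orchestration step needed.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What error is shown in this screenshot?"},
        {
            "type": "image_url",
            "image_url": {"url": "https://example.com/screenshot.png"},
        },
    ],
}
print(message["content"][1]["type"])
```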

Where it falls short

Output Is Expensive

$10 per million output tokens is four times the input rate. If your agents generate long responses, costs add up fast. It’s worth profiling your actual output token usage before committing to this model at scale.
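A quick sketch of that math, using the list prices from the spec table above:

```python
INPUT_RATE = 2.50 / 1_000_000    # USD per input token
OUTPUT_RATE = 10.00 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one GPT-4o request at list prices."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 2,000-token prompt with a 2,000-token answer: the output half
# accounts for 80% of the cost.
print(round(request_cost(2_000, 2_000), 4))  # 0.025
```

At equal token counts, output is always 80% of the bill, which is why profiling output length matters more than prompt length here.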

Safety Refusals

GPT-4o trips on benign system prompts more often than I’d like, especially compared to open-weight alternatives. If your agent needs aggressive role-setting, expect to iterate on prompts.

Best use cases with OpenClaw

  • Agentic Tool Use — Rarely misformats a tool call, which is what you care about in multi-step agent workflows.
  • Visual Reasoning — Useful when you need to analyze documents with embedded charts or read UI state without a separate vision step.
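The dispatch step of such a loop can be sketched as follows. The `run_shell` tool is a hypothetical stand-in, and the tool-call dict mirrors the shape of a Chat Completions `tool_calls` entry:

```python
import json

# Hypothetical local tool an agent might expose.
def run_shell(command: str) -> str:
    return f"(would run: {command})"

TOOLS = {"run_shell": run_shell}

def dispatch(tool_call: dict) -> str:
    """Execute one tool call from a Chat Completions-style response.

    A misformatted arguments string raises here, which is why
    consistent JSON output matters in multi-step agent loops.
    """
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return TOOLS[name](**args)

result = dispatch(
    {"function": {"name": "run_shell", "arguments": '{"command": "ls"}'}}
)
print(result)  # (would run: ls)
```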

Not ideal for

  • Simple Data Extraction — GPT-4o-mini costs $0.15/M input vs $2.50/M here and handles basic extraction just as well. Hard to justify the price difference.
  • Creative Writing — The prose is serviceable but formulaic. Claude 3.5 Sonnet produces noticeably better output for anything user-facing.

Run it through Haimaker

Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:

Add Haimaker as a custom provider to my OpenClaw config. Use these details:

- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions

Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)

Create an alias "auto" for easy switching. Apply the config when done.

Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
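For reference, the prompt above might produce a config fragment shaped roughly like this. This is a sketch only: the exact key names in OpenClaw's config format are assumptions, based on the fields listed in the prompt.

```json
{
  "providers": {
    "haimaker": {
      "baseUrl": "https://api.haimaker.ai/v1",
      "apiKey": "[PASTE YOUR HAIMAKER API KEY HERE]",
      "apiType": "openai-completions",
      "models": [
        {
          "id": "haimaker/auto",
          "reasoning": false,
          "context": 128000,
          "maxTokens": 32000
        }
      ]
    }
  },
  "aliases": { "auto": "haimaker/auto" }
}
```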

OpenClaw setup

Export your OPENAI_API_KEY and you’re done. No extra config needed.

export OPENAI_API_KEY="your-key-here"

That’s it. OpenClaw picks up OpenAI models automatically.

How it compares

  • vs Claude 3.5 Sonnet — Sonnet is better for complex coding and creative tasks; GPT-4o wins on function calling speed and reliability.
  • vs Llama 3.1 405B — Llama is worth considering for self-hosted setups, but GPT-4o’s native vision and lower latency make more sense for interactive agents.

Bottom line

GPT-4o is the right default when you need a production agent that won’t surprise you. Not the cheapest, not the smartest — but dependably correct.



For setup instructions, see our API key guide. For all available models, see the complete models guide.