Current as of March 2026. GPT-4o Mini killed GPT-3.5 Turbo. For most agentic tasks that don’t need serious reasoning, $0.15/M input is hard to argue with. It’s where I’d start any new OpenClaw project before deciding I need something heavier.
Specs
| Spec | Value |
| --- | --- |
| Provider | OpenAI |
| Input cost | $0.15 / M tokens |
| Output cost | $0.60 / M tokens |
| Context window | 128K tokens |
| Max output | 16K tokens |
| Parameters | N/A |
| Features | function_calling, vision |
What it’s good at
Price
At $0.15/M input and $0.60/M output, it’s the cheapest way to get reliable OpenAI function calling. You can run a lot of agent turns before it becomes a line item worth caring about.
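To put those rates in dollars, here's a back-of-envelope sketch; the per-turn token counts are illustrative assumptions, not measurements from any particular agent.

```python
# Back-of-envelope cost per agent turn at GPT-4o Mini rates.
# The 2,000-input / 200-output token counts below are assumptions
# for illustration, not numbers from a real workload.

INPUT_COST_PER_M = 0.15   # USD per million input tokens
OUTPUT_COST_PER_M = 0.60  # USD per million output tokens

def turn_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one model call."""
    return (input_tokens * INPUT_COST_PER_M
            + output_tokens * OUTPUT_COST_PER_M) / 1_000_000

cost = turn_cost(2_000, 200)
print(f"${cost:.6f} per turn")                  # $0.000420 per turn
print(f"${cost * 100_000:.2f} per 100k turns")  # $42.00 per 100k turns
```

A hundred thousand routing turns for about the price of a lunch is the whole argument.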
Solid Output Ceiling
16K max output is generous for a small model. Competitors in the same tier often cap out at 4K, which creates awkward chunking logic you don’t need here.
Function Calling
Follows tool schemas with enough consistency for production use. Not quite GPT-4o level, but close enough for most workflows.
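For reference, a minimal sketch of the OpenAI tool-schema shape and a local dispatcher for the tool calls the model hands back. The `get_weather` tool and its arguments are invented for illustration; in a real call you'd pass `tools=TOOLS` to `client.chat.completions.create(model="gpt-4o-mini", ...)` and read `message.tool_calls` from the response.

```python
import json

# One tool definition in OpenAI's function-calling schema shape.
# The tool itself (get_weather) is a made-up example.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub implementation

HANDLERS = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Route one tool call from the model's response to local code."""
    fn = tool_call["function"]
    args = json.loads(fn["arguments"])  # the API returns arguments as a JSON string
    return HANDLERS[fn["name"]](**args)

# Simulated tool call, shaped like one entry of message.tool_calls:
fake_call = {"function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'}}
print(dispatch(fake_call))  # Sunny in Oslo
```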
Where it falls short
Reasoning Depth
Multi-step logical deduction falls apart — complex math, deep stack trace analysis, intricate architecture decisions. This is a pattern-matching model, not a thinking one.
Vision Detail
The vision support is there, but it misses fine-grained detail in complex images. If you need to read small text in a screenshot, step up to GPT-4o.
Best use cases with OpenClaw
- High-Volume Classification — Thousands of categorization tasks per hour without significant cost. Good fit for the filtering layer of a larger agent pipeline.
- Agentic Routing — Works well as the router node in an OpenClaw graph — quick, cheap decisions about which specialized agent handles a query.
- Simple Data Extraction — Structured JSON from unstructured text is the sweet spot, as long as the schema isn’t deeply nested.
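The extraction case above can be sketched as a flat schema plus a local validation step. The contact-card fields are an invented example, and the model reply here is simulated; a real call would send the prompt and source text to `chat.completions.create` with `response_format={"type": "json_object"}`.

```python
import json

# Flat extraction schema (invented example). GPT-4o Mini handles
# shallow structures like this well; deeply nested schemas are riskier.
EXPECTED_KEYS = {"name", "email", "company"}

PROMPT = (
    "Extract name, email, and company from the text below. "
    "Respond with a single JSON object and nothing else.\n\n"
)

def parse_extraction(raw: str) -> dict:
    """Parse and sanity-check the model's JSON reply."""
    data = json.loads(raw)
    missing = EXPECTED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model omitted fields: {sorted(missing)}")
    return data

# Simulated model reply in place of a live API response:
reply = '{"name": "Ada Lovelace", "email": "ada@example.com", "company": "Analytical Engines"}'
print(parse_extraction(reply)["name"])  # Ada Lovelace
```

Validating keys on the way in is cheap insurance: when a small model drops a field, you want a loud error, not a silent null downstream.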
Not ideal for
- Complex Software Engineering — Stack trace debugging and large codebase refactoring both require more reasoning depth than this model has.
- Creative Writing — The output is repetitive and flat. Claude 3.5 Sonnet is a better pick for anything the user will actually read.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
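For orientation, here's roughly how the resulting provider entry might look in your config. The key names (`baseUrl`, `apiKey`, `api`, `contextWindow`, `maxTokens`) are assumptions based on the details above, not a verified OpenClaw schema; check your generated config rather than copying this verbatim.

```json
{
  "models": {
    "providers": {
      "haimaker": {
        "baseUrl": "https://api.haimaker.ai/v1",
        "apiKey": "[PASTE YOUR HAIMAKER API KEY HERE]",
        "api": "openai-completions",
        "models": [
          {
            "id": "auto",
            "name": "haimaker/auto",
            "reasoning": false,
            "contextWindow": 128000,
            "maxTokens": 32000
          }
        ]
      }
    }
  }
}
```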
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
OpenClaw setup
OpenAI is the default provider in OpenClaw. Export the key, done.
```shell
export OPENAI_API_KEY="your-key-here"
```
That’s it. OpenClaw picks up OpenAI models automatically.
How it compares
- vs Claude 3 Haiku — 4o Mini benchmarks higher and has a 16K output limit; Haiku caps at 4K and is faster on short prompts.
- vs Gemini 1.5 Flash — Flash wins on context window size (1M vs 128K); 4o Mini is more consistent for structured tool use within OpenClaw.
Bottom line
Start here. If your agent works on 4o Mini, ship it. Only upgrade if you hit a real wall with reasoning or context — the cost savings are too good to skip.
For setup instructions, see our API key guide. For all available models, see the complete models guide.