Current as of March 2026. GPT-5 Mini sits in an interesting spot: 272K context window and 128K output at $0.25/M input. That context-to-price ratio is hard to beat for tasks where you’d otherwise need to build a RAG pipeline.
Specs
| Spec | Value |
| --- | --- |
| Provider | OpenAI |
| Input cost | $0.25 / M tokens |
| Output cost | $2.00 / M tokens |
| Context window | 272K tokens |
| Max output | 128K tokens |
| Parameters | N/A |
| Features | function_calling, vision, reasoning |
What it’s good at
Context-to-Price Ratio
$0.25/M input for a 272K window means you can often skip chunking entirely and just dump full documents in. That’s a real engineering simplification.
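As a rough sanity check, here is what a single full-context call costs at the rates in the spec table above (the 2K-token answer size is a hypothetical example):

```python
# Per-million-token rates from the spec table.
INPUT_RATE_PER_M = 0.25   # $ per 1M input tokens
OUTPUT_RATE_PER_M = 2.00  # $ per 1M output tokens

def call_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the listed per-million-token rates."""
    return (input_tokens / 1_000_000) * INPUT_RATE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_RATE_PER_M

# Filling the entire 272K window, with a modest 2K-token answer:
print(round(call_cost_usd(272_000, 2_000), 3))  # 0.072
```

About seven cents to read 272K tokens in one shot, which is why skipping the retrieval pipeline can pencil out.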
Output Ceiling
128K max output on a “mini” model is unusual. Long-form code generation or document synthesis that would require chunking on smaller models often just works here.
Tool Use
OpenAI’s function calling is reliable at this tier, with fewer schema violations than comparable models from other providers in the same price bracket.
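A minimal sketch of the pattern: a tool definition in OpenAI's function-calling schema shape, plus a local dispatcher for the tool calls the model emits. The tool name, parameters, and handler are hypothetical, not part of any real API:

```python
import json

# Hypothetical tool definition in the function-calling schema shape.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Fetch an order record by ID.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

def dispatch(tool_call: dict) -> str:
    """Route a model-emitted tool call to a local handler (stubbed here)."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    if name == "lookup_order":
        return f"order {args['order_id']}: shipped"
    raise ValueError(f"unknown tool: {name}")

# Shape of a tool call as it arrives in a chat completion response:
fake_call = {"function": {"name": "lookup_order",
                          "arguments": '{"order_id": "A-17"}'}}
print(dispatch(fake_call))  # order A-17: shipped
```

"Reliable" here means the `arguments` string parses as JSON matching the declared schema; that parse step is where cheaper models tend to fail.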
Where it falls short
Output Multiplier
Output costs $2.00/M, eight times the input rate. Verbose agent loops can generate surprisingly large bills, so monitor output token usage closely.
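A back-of-envelope estimate of how the 8x multiplier compounds in a chatty agent loop (the call volumes and token counts are hypothetical):

```python
# Rates from the spec table, expressed per token.
INPUT_RATE = 0.25 / 1_000_000   # $ per input token
OUTPUT_RATE = 2.00 / 1_000_000  # $ per output token

# Hypothetical agent loop: many short prompts, verbose replies.
calls_per_day = 10_000
input_tokens_per_call = 1_500
output_tokens_per_call = 1_200

daily_input_cost = calls_per_day * input_tokens_per_call * INPUT_RATE
daily_output_cost = calls_per_day * output_tokens_per_call * OUTPUT_RATE

print(f"input:  ${daily_input_cost:.2f}/day")   # $3.75/day
print(f"output: ${daily_output_cost:.2f}/day")  # $24.00/day
```

Even with fewer output tokens than input tokens per call, output dominates the bill; capping `max_tokens` and trimming verbosity in system prompts pays off directly.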
Long-Context Degradation
In my experience, instruction following starts slipping past the 200K token mark. The 272K window is real, but don't rely on the far end of it for precise tasks.
Best use cases with OpenClaw
- Multi-Document Analysis — Dump multiple PDFs or a full repo into a single prompt for cross-referencing. Much cheaper than building retrieval infrastructure.
- High-Frequency Orchestration — Good for the coordinator node in a multi-agent OpenClaw graph where many small decisions happen per minute.
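A minimal sketch of the "dump a full repo into one prompt" approach, using the common ~4-characters-per-token approximation. The budget, file filter, and function name are assumptions for illustration:

```python
from pathlib import Path

def pack_repo(root: str, token_budget: int = 250_000,
              exts: tuple = (".py", ".md")) -> str:
    """Concatenate source files into one prompt, stopping at a token budget.

    Uses the rough heuristic of ~4 characters per token; swap in a real
    tokenizer for precise budgeting.
    """
    parts, used = [], 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in exts:
            continue
        text = path.read_text(errors="ignore")
        est_tokens = len(text) // 4
        if used + est_tokens > token_budget:
            break  # stop before overflowing the window
        parts.append(f"=== {path} ===\n{text}")
        used += est_tokens
    return "\n\n".join(parts)
```

The budget is deliberately set below 272K to leave headroom for instructions and the response, since output shares the context window.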
Not ideal for
- Deep Architectural Reasoning — The reasoning here is lighter than full GPT-5. It misses edge cases that require real conceptual depth.
- User-Facing Copy — The prose is noticeably more wooden than Anthropic's models. It's fine for internal tooling but needs editing before going public.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
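If you'd rather edit the config by hand, the same details map to an OpenAI-compatible provider entry. The exact field names and file shape below are an assumption about OpenClaw's config format; the values are the ones from the prompt above:

```json
{
  "providers": {
    "haimaker": {
      "baseUrl": "https://api.haimaker.ai/v1",
      "apiKey": "[PASTE YOUR HAIMAKER API KEY HERE]",
      "api": "openai-completions",
      "models": [
        {
          "id": "haimaker/auto",
          "reasoning": false,
          "contextWindow": 128000,
          "maxTokens": 32000
        }
      ]
    }
  },
  "aliases": { "auto": "haimaker/auto" }
}
```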
OpenClaw setup
Set the API key and you’re done.
export OPENAI_API_KEY="your-key-here"
That’s it. OpenClaw picks up OpenAI models automatically.
How it compares
- vs Claude 3 Haiku — Haiku is faster for short prompts; GPT-5 Mini has a larger context window (272K vs 200K) and better reasoning depth.
- vs Gemini 1.5 Flash — Flash wins on raw context size (1M), but GPT-5 Mini’s function calling is more consistent within OpenClaw.
Bottom line
GPT-5 Mini is a strong default for production agents that need large context without a large bill — as long as you watch your output token usage.
For setup instructions, see our API key guide. For all available models, see the complete models guide.