Current as of March 2026. GPT-4.1 Mini takes the 1M context window from its full-size sibling and drops the input price to $0.40/M, a fifth of what the full model charges. The trade-offs are real but manageable for the right use cases.
Specs
| Spec | Value |
| --- | --- |
| Provider | OpenAI |
| Input cost | $0.40 / M tokens |
| Output cost | $1.60 / M tokens |
| Context window | 1.0M tokens |
| Max output | 33K tokens |
| Parameters | N/A |
| Features | function_calling, vision |
What it’s good at
Context Window
Same 1M token window as GPT-4.1 at a fifth of the input cost. For use cases where you’re mostly feeding in large documents and want simple answers back, this is the economical path.
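To get a feel for the economics, here is a rough cost sketch at the listed rates. The ~4-characters-per-token heuristic is an approximation for English prose, not a real tokenizer:

```python
# Rough cost estimate for stuffing large documents into the 1M-token
# window at GPT-4.1 Mini's listed rates ($0.40/M input, $1.60/M output).
INPUT_COST_PER_M = 0.40   # USD per million input tokens
OUTPUT_COST_PER_M = 1.60  # USD per million output tokens

def estimate_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the per-million rates above."""
    return (input_tokens * INPUT_COST_PER_M
            + output_tokens * OUTPUT_COST_PER_M) / 1_000_000

# A full 1M-token context with a short 1K-token answer:
print(round(estimate_cost(1_000_000, 1_000), 4))  # 0.4016
```

Even a maxed-out context with a short answer back stays around forty cents, which is the "large documents in, simple answers out" profile the window is best suited to.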
Tool Use
Function calling is solid. OpenAI’s schema consistency transfers down to the mini tier reasonably well.
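A minimal sketch of an OpenAI-style function-calling request, for reference. The `get_order_status` tool is a made-up example for illustration, not a real API:

```python
# OpenAI-style function-calling request for GPT-4.1 Mini.
# The tool schema follows the standard Chat Completions "tools" format.
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",  # hypothetical tool, for illustration only
        "description": "Look up the shipping status of an order by ID.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {"type": "string", "description": "Order identifier"},
            },
            "required": ["order_id"],
        },
    },
}]

payload = {
    "model": "gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Where is order A-1042?"}],
    "tools": tools,
}

# With the official SDK this payload maps onto:
#   client.chat.completions.create(**payload)
print(json.dumps(payload)[:40])
```

The schema consistency mentioned above shows up in how reliably the model emits arguments matching the `parameters` JSON Schema.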
Vision at a Reasonable Price
Vision support at $1.60/M output is much easier to justify than $8/M. Good for UI automation agents or document processing pipelines with image content.
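For reference, images ride along in the standard chat format as content parts next to text. A minimal sketch; the URL below is a placeholder, not a real image:

```python
# Vision request sketch: OpenAI's chat format accepts image_url parts
# alongside text parts in a single user message.
message = {
    "role": "user",
    "content": [
        {"type": "text", "text": "Extract the invoice total from this scan."},
        {"type": "image_url",
         "image_url": {"url": "https://example.com/invoice-page-1.png"}},  # placeholder
    ],
}

payload = {"model": "gpt-4.1-mini", "messages": [message]}
```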
Where it falls short
Reasoning
Multi-step logic is the weak point. Complex, layered prompts that require the model to track multiple constraints can fail in non-obvious ways.
Latency at Max Output
When you push against the 33K output limit, latency gets inconsistent. Not a dealbreaker for async workloads, but it’s noticeable.
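One mitigation worth sketching: stream tokens as they arrive instead of waiting for a huge completion to finish, and cap output length when the task allows. These are standard Chat Completions request fields; the prompt is illustrative:

```python
# Softening long-output latency: stream incremental chunks rather than
# blocking on the full completion, and cap max_tokens below the ceiling.
payload = {
    "model": "gpt-4.1-mini",
    "messages": [{"role": "user", "content": "Summarize the attached filings."}],
    "max_tokens": 8_000,  # stay well under the 33K limit when you can
    "stream": True,       # receive chunks as they are generated
}
```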
Best use cases with OpenClaw
- Large Document Analysis — Multiple PDFs or legal documents in a single context is where the 1M window earns its keep.
- Vision-based Automation — Screen-reading and UI interaction agents where you need vision support but can’t afford GPT-4.1’s output pricing.
Not ideal for
- Complex Refactoring — Deep cross-file logic analysis needs more reasoning capacity than this model has.
- Sub-second Chat — If you need instant responses for a conversational UI, something smaller and faster fits better.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
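Because the API type is openai-completions, any OpenAI-compatible client can hit the Haimaker endpoint directly. A stdlib-only sketch, assuming your key lives in a `HAIMAKER_API_KEY` environment variable (that variable name is this example's convention, not Haimaker's):

```python
# Minimal OpenAI-compatible call against Haimaker's auto-router model.
# Nothing here runs without a real key; the function is only defined.
import json
import os
import urllib.request

def ask_haimaker(prompt: str) -> str:
    """POST a chat completion to haimaker/auto and return the reply text."""
    body = json.dumps({
        "model": "haimaker/auto",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    req = urllib.request.Request(
        "https://api.haimaker.ai/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['HAIMAKER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```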
OpenClaw setup
Set OPENAI_API_KEY and the framework handles the rest.
export OPENAI_API_KEY="your-key-here"
That’s it. OpenClaw picks up OpenAI models automatically.
How it compares
- vs Claude 3 Haiku — Haiku is cheaper and faster for short prompts; 4.1 Mini’s 1M context window dwarfs Haiku’s 200K limit for document-heavy work.
- vs Gemini 1.5 Flash — Flash matches the context window, but OpenAI’s function calling is more consistent and easier to debug within OpenClaw.
Bottom line
GPT-4.1 Mini is the practical choice for high-context agents that need vision and reliable tool calling without the full GPT-4.1 price tag.
For setup instructions, see our API key guide. For all available models, see the complete models guide.