Current as of March 2026. GPT-4.1’s headline feature is its 1M-token context window, the largest in OpenAI’s non-reasoning lineup. The trade-off is a 33K-token output ceiling. For ingestion-heavy agents that need to read a lot and write a little, that balance makes sense.
Specs
| Spec | Value |
| --- | --- |
| Provider | OpenAI |
| Input cost | $2.00 / M tokens |
| Output cost | $8.00 / M tokens |
| Context window | 1.0M tokens |
| Max output | 33K tokens |
| Parameters | N/A |
| Features | function_calling, vision |
What it’s good at
Context Window
One million input tokens means you can feed in entire technical documentation sets or large codebases without chunking. This is the main reason to pick this model over anything else.
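A minimal sketch of what "no chunking" looks like in practice: concatenate whole files into one prompt, tracking a rough token budget. The ~4 characters-per-token ratio is a heuristic, not an exact tokenizer, and `pack_files` is an illustrative helper, not part of any SDK.

```python
# Sketch: pack whole files into a single large-context prompt.
# The chars-per-token ratio is a rough heuristic, not a real tokenizer.
from pathlib import Path
from typing import Iterable

CHARS_PER_TOKEN = 4          # rough average for English text and code
CONTEXT_BUDGET = 1_000_000   # GPT-4.1 input window, in tokens

def pack_files(paths: Iterable[str], budget_tokens: int = CONTEXT_BUDGET) -> str:
    """Concatenate files into one prompt, stopping before the budget."""
    parts, used = [], 0
    for path in paths:
        text = Path(path).read_text(encoding="utf-8", errors="replace")
        cost = len(text) // CHARS_PER_TOKEN + 1
        if used + cost > budget_tokens:
            break  # with a 1M window this rarely triggers for one repo
        parts.append(f"### FILE: {path}\n{text}")
        used += cost
    return "\n\n".join(parts)
```

With a window this large, the interesting failure mode is not "doesn't fit" but "fits and you pay for it," which is why the budget parameter is worth keeping even when it rarely triggers.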
Tool Use
OpenAI’s function calling is consistent here. Schema violations are rare, which is important when agents are making sequential tool calls and a bad output breaks the chain.
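For illustration, here is what a tool definition looks like in the OpenAI chat `tools` format, with a cheap guard against the kind of bad call that breaks a chain. The `search_docs` tool itself is hypothetical; only the envelope shape follows the documented API.

```python
# Sketch: a tool definition in the OpenAI chat "tools" format.
# search_docs is a hypothetical tool; the envelope shape is the real one.
search_docs_tool = {
    "type": "function",
    "function": {
        "name": "search_docs",
        "description": "Search the ingested documentation set.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search terms."},
                "limit": {"type": "integer", "minimum": 1, "default": 5},
            },
            "required": ["query"],
        },
    },
}

def validate_call(arguments: dict) -> bool:
    """Reject tool calls missing required fields before executing them."""
    required = search_docs_tool["function"]["parameters"]["required"]
    return all(key in arguments for key in required)
```

A guard like `validate_call` is cheap insurance in sequential agents: a schema violation caught before execution fails one step instead of corrupting the whole chain.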
Where it falls short
Output Cost
$8/M output tokens is expensive. Long-form generation gets costly quickly — this is a model for reading and reasoning, not bulk writing.
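The asymmetry is easy to quantify with the listed rates. A small estimator, using only the prices from the spec table:

```python
# Sketch: estimate per-request cost at GPT-4.1's listed rates.
INPUT_RATE = 2.00 / 1_000_000   # dollars per input token
OUTPUT_RATE = 8.00 / 1_000_000  # dollars per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the listed input/output rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A 500K-token read with a 2K-token answer is input-dominated:
# 500_000 tokens in ($1.00) + 2_000 tokens out ($0.016) = $1.016.
```

Flip the shape of the workload (short reads, long writes) and the $8/M output rate dominates, which is exactly the "reading, not bulk writing" point above.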
Output Ceiling
33K max output against a 1M input window feels like an intentional mismatch. You can take in an enormous amount of context but can’t generate proportionally large responses. Plan around it.
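"Plan around it" in code terms usually means splitting a long generation into sequential calls. A sketch of that planning step, assuming the 33K cap from the spec table and a hypothetical safety margin:

```python
# Sketch: plan how many calls a long generation needs under the 33K cap.
import math

MAX_OUTPUT_TOKENS = 33_000  # GPT-4.1 output ceiling

def calls_needed(target_tokens: int, margin: float = 0.1) -> int:
    """Sequential calls required, reserving a margin so each response
    stays comfortably under the output ceiling."""
    per_call = int(MAX_OUTPUT_TOKENS * (1 - margin))
    return math.ceil(target_tokens / per_call)
```

Each continuation call re-sends context, so with this model the cost of a multi-call generation grows on the input side too; that is worth factoring in before committing to long outputs.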
Best use cases with OpenClaw
- Repository-wide Analysis — Feed dozens of source files to understand cross-file dependencies before making changes. The 1M window removes the need for complex chunking logic.
- Visual Document Analysis — Vision plus large context is useful for multi-page PDFs where charts and text need to be read together.
Not ideal for
- Simple Tasks — $2/M input is expensive for basic Q&A or FAQs. GPT-4o-mini at $0.15/M does the same job for a fraction of the cost.
- Streaming Chat — Large context overhead can slow responses, which feels wrong in an interactive UI even if throughput is fine.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
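The prompt above maps to a provider entry along these lines. The exact OpenClaw config schema is not documented here, so treat the field names as an illustration of the shape, not a reference:

```json
{
  "providers": {
    "haimaker": {
      "baseUrl": "https://api.haimaker.ai/v1",
      "apiKey": "[PASTE YOUR HAIMAKER API KEY HERE]",
      "api": "openai-completions",
      "models": [
        {
          "id": "haimaker/auto",
          "reasoning": false,
          "contextWindow": 128000,
          "maxTokens": 32000
        }
      ]
    }
  },
  "aliases": { "auto": "haimaker/auto" }
}
```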
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
OpenClaw setup
Set OPENAI_API_KEY and the framework routes to GPT-4.1 automatically.
export OPENAI_API_KEY="your-key-here"
That’s it. OpenClaw picks up OpenAI models automatically.
How it compares
- vs Claude 3.5 Sonnet — Sonnet is better for creative and writing tasks; GPT-4.1 wins on context size and tool-calling consistency.
- vs Gemini 1.5 Pro — Gemini goes up to 2M context, but GPT-4.1’s pricing is more predictable and its tool use is easier to debug in OpenClaw.
Bottom line
GPT-4.1 is the right pick when the agent needs to understand everything about a project before acting, and you don’t need it to write novels in response.
For setup instructions, see our API key guide. For all available models, see the complete models guide.