Current as of March 2026. Gemini 2.5 Pro’s main story is the 1M context window. For $1.25/M input, you can feed it an entire codebase, a stack of PDFs, or a long video file and get coherent analysis back — without building a RAG pipeline to manage it.

Specs

Provider: Google
Input cost: $1.25 / M tokens
Output cost: $10 / M tokens
Context window: 1.0M tokens
Max output: 8K tokens
Parameters: N/A
Features: function_calling, vision

What it’s good at

1M context window

This is the reason to use it. Holding an entire repository or hundreds of documents in a single prompt changes what’s architecturally possible — you skip a lot of retrieval complexity and the coherence doesn’t degrade the way it does when you’re stitching together chunked results.
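To make "skip the retrieval complexity" concrete, here's a minimal sketch of packing an entire repository into one prompt string instead of chunking and retrieving. The helper name and the 4-characters-per-token heuristic are assumptions for illustration, not anything Gemini-specific; a real pipeline would count tokens with the provider's tokenizer.

```python
import os

def pack_repo(root, budget_tokens=1_000_000, chars_per_token=4):
    """Concatenate a repo's text files into one prompt string,
    stopping before a rough character budget derived from the
    model's context window."""
    budget_chars = budget_tokens * chars_per_token
    parts, used = [], 0
    for dirpath, _, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8") as f:
                    text = f.read()
            except (UnicodeDecodeError, OSError):
                continue  # skip binaries and unreadable files
            chunk = f"--- {os.path.relpath(path, root)} ---\n{text}\n"
            if used + len(chunk) > budget_chars:
                return "".join(parts)  # budget reached; stop packing
            parts.append(chunk)
            used += len(chunk)
    return "".join(parts)
```

The packed string then goes into a single user message, with your question appended at the end. No embedding store, no chunk ranking, no re-stitching of retrieved fragments.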

Vision and OCR

Strong at extracting structured data from dense diagrams, charts, and screenshots. Fast and accurate enough for production-grade document processing pipelines.
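If you're calling the model through an OpenAI-compatible endpoint (as in the config below), images are typically sent inline as base64 data URLs inside a chat message. This sketch only builds that payload; the exact fields the endpoint accepts may vary, so treat it as an assumption to verify against the provider's docs.

```python
import base64

def image_message(image_bytes, prompt, mime="image/png"):
    """Build an OpenAI-style chat message carrying an inline image
    plus a text instruction (e.g. an OCR/extraction prompt)."""
    data_url = (
        f"data:{mime};base64,"
        + base64.b64encode(image_bytes).decode("ascii")
    )
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
```

Pair the prompt with a strict output instruction ("return only JSON") when you're feeding the result into a document-processing pipeline.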

Where it falls short

8K output cap

The mismatch between 1M input and 8K output is genuinely frustrating. You can analyze an entire codebase in one prompt, but if you want the model to produce more than a few files of output, you’ll be splitting across multiple calls anyway.
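The usual workaround for the cap is a continuation loop: call the model, detect that the output was truncated, and re-prompt with everything produced so far. A sketch under stated assumptions; the `generate` callable is a placeholder for whatever wraps your chat-completion call, and real code would key the truncation flag off the API's reported finish reason (e.g. `finish_reason == "length"`).

```python
def generate_long(generate, prompt, max_calls=8):
    """Stitch a long completion together across multiple calls.

    `generate(prompt) -> (text, truncated)` is any function wrapping
    one model call; `truncated` should be True when the provider
    reports the output hit its token cap.
    """
    pieces = []
    for _ in range(max_calls):
        text, truncated = generate(prompt)
        pieces.append(text)
        if not truncated:
            break
        # Feed the partial output back and ask the model to resume.
        prompt = prompt + text + "\n\nContinue exactly where you left off."
    return "".join(pieces)
```

It works, but each continuation re-sends the growing prompt at input rates, so the 8K cap costs you both latency and money on generation-heavy jobs.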

Latency at large context

Push the context toward the upper limit and response times slow noticeably. Not a problem for batch analysis, but it rules out real-time applications.

Best use cases with OpenClaw

  • Full repository analysis — The whole project fits in context. Ask architectural questions, trace dependencies, find bugs across files — the kind of analysis that falls apart with a 128K window.
  • Video and image auditing — Native multimodal support for searching through long videos or large image sets at a fraction of what you’d pay for a Vision API endpoint.

Not ideal for

  • Interactive chat — Time-to-first-token at scale is too slow for anything conversational.
  • Code generation — The 8K output limit is the wrong constraint for generating meaningful amounts of code. Look at models with larger output windows for that.

Run it through Haimaker

Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:

Add Haimaker as a custom provider to my OpenClaw config. Use these details:

- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions

Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)

Create an alias "auto" for easy switching. Apply the config when done.

Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.

OpenClaw setup

Configure the Google provider in your OpenClaw settings with a valid Gemini API key. Set maxTokens explicitly to 8192 — leaving it at the default can cause truncated responses.

{
  "models": {
    "mode": "merge",
    "providers": {
      "google": {
        "baseUrl": "https://generativelanguage.googleapis.com/v1beta",
        "apiKey": "YOUR-GOOGLE-API-KEY",
        "api": "openai-completions",
        "models": [
          {
            "id": "gemini-2.5-pro",
            "name": "Gemini 2.5 Pro",
            "cost": {
              "input": 1.25,
              "output": 10
            },
            "contextWindow": 1048576,
            "maxTokens": 8192
          }
        ]
      }
    }
  }
}

How it compares

  • vs GPT-4o — GPT-4o is faster and more consistent on short-form reasoning. Its 128K context window is about 8x smaller than Gemini’s 1M, which matters for document-heavy workloads.
  • vs Claude 3.5 Sonnet — Claude follows complex instructions more reliably and is better at code generation, but it costs more per input token and tops out at 200K context.

Bottom line

When your input exceeds 200K tokens, this is where you go. The $10/M output cost means you don’t want to use it for generation-heavy work — it’s an analysis and comprehension model.



For setup instructions, see our API key guide. For all available models, see the complete models guide.