Current as of March 2026. GLM-4.7 is Zhipu’s upgrade over the 4.6 line: 64K output tokens (down from 131K in 4.6, but still large), vision support added, and similar pricing at $0.40/$1.50 per million. The multimodal support is the key differentiator if you’re building OpenClaw agents that need to process images alongside text.
Specs
| Spec | Value |
| --- | --- |
| Provider | Zhipu AI |
| Input cost | $0.40 / M tokens |
| Output cost | $1.50 / M tokens |
| Context window | 203K tokens |
| Max output | 64K tokens |
| Parameters | N/A |
| Features | function_calling, vision, reasoning |
What it’s good at
64K Output with Vision
A 64K output limit is more than most models in this price range offer, and the vision capability makes it versatile. An OpenClaw agent can process screenshots, diagrams, or image attachments without switching models.
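Assuming Haimaker's OpenAI-compatible chat endpoint (the base URL and model identifier come from the setup section below; the image URL is a placeholder), a mixed text-and-image request can be sketched as:

```python
import json

# Sketch of an OpenAI-style chat payload mixing text and an image part.
# The model id matches the config below; the image URL is a placeholder.
payload = {
    "model": "z-ai/glm-4.7",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the error shown in this screenshot."},
                {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
            ],
        }
    ],
    "max_tokens": 1024,
}

body = json.dumps(payload)  # POST this to {base_url}/chat/completions
```

The same message shape works for plain text by passing a string as `content`, so one agent loop can handle both cases.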
Competitive Pricing for the Feature Set
$0.40/$1.50 for a model with vision, function calling, and 203K context is good value. Comparable Western models charge more.
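At these rates, per-request cost is easy to estimate. A minimal sketch, with prices hard-coded from the specs table above:

```python
INPUT_PER_M = 0.40   # dollars per million input tokens (specs table)
OUTPUT_PER_M = 1.50  # dollars per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost of one request at GLM-4.7 list prices."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# Example: a 100K-token document scan with an 8K-token answer.
cost = request_cost(100_000, 8_000)  # 0.04 + 0.012 = $0.052
```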
Large Context Window
203K tokens covers most real-world document and codebase inputs without chunking.
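A quick way to decide whether a document needs chunking is a rough token estimate before sending it. The four-characters-per-token heuristic below is an approximation, not a real tokenizer:

```python
CONTEXT_WINDOW = 202_752  # value from the config block below
CHARS_PER_TOKEN = 4       # rough heuristic; actual tokenizers vary by language

def fits_in_context(text: str, reserved_output: int = 64_000) -> bool:
    """Roughly estimate whether `text` plus the output budget fits the window."""
    est_tokens = len(text) / CHARS_PER_TOKEN
    return est_tokens + reserved_output <= CONTEXT_WINDOW
```

Reserving the full 64K output budget is conservative; shrink `reserved_output` if you cap `max_tokens` lower.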
Where it falls short
Latency Through Haimaker
Routing adds overhead. During peak hours, time-to-first-token (TTFT) spikes are noticeable compared to native US providers.
Instruction Following at Long Context
It occasionally drifts from complex, multi-step system prompts. Not as reliable as Claude 3.5 Sonnet on elaborate behavioral constraints.
Best use cases with OpenClaw
- Automated Technical Writing — 64K output lets it write entire manuals or technical guides without truncating mid-section.
- Budget-Conscious RAG — 203K context at $0.40/M input is a good deal for scanning large document sets with vision attached.
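Even with 64K of headroom, manual-length outputs can still hit the cap, so it is worth guarding against truncation. A hedged sketch of a continuation loop: `complete` is a placeholder for your client call and is assumed to return OpenAI-style `(text, finish_reason)` results:

```python
def generate_long(prompt: str, complete, max_rounds: int = 4) -> str:
    """Keep requesting continuations while the model stops at the output limit.

    `complete(messages)` stands in for your client call; it is assumed to
    return (text, finish_reason) in OpenAI-compatible terms.
    """
    messages = [{"role": "user", "content": prompt}]
    parts = []
    for _ in range(max_rounds):
        text, finish_reason = complete(messages)
        parts.append(text)
        if finish_reason != "length":  # "length" means the output cap was hit
            break
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Continue exactly where you left off."})
    return "".join(parts)
```

The `max_rounds` bound keeps a misbehaving model from looping forever at full output cost.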
Not ideal for
- Real-time Voice Agents — Latency variance from the Haimaker endpoint makes it unsuitable for sub-500ms response requirements.
- Mission-Critical Logic — For safety-sensitive or legal reasoning, GPT-4o provides a more consistent baseline.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
OpenClaw setup
Configure your OpenClaw provider with the base URL https://api.haimaker.ai/v1 and your Haimaker API key. Use the model identifier z-ai/glm-4.7, and increase your timeout settings to accommodate responses that can run to the full 64K output limit.
{
"models": {
"mode": "merge",
"providers": {
"z-ai": {
"baseUrl": "https://api.haimaker.ai/v1",
"apiKey": "YOUR-Z-AI-API-KEY",
"api": "openai-completions",
"models": [
{
"id": "glm-4.7",
"name": "GLM-4.7",
"cost": {
"input": 0.4,
"output": 1.5
},
"contextWindow": 202752,
"maxTokens": 64000
}
]
}
}
}
}
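Once configured, a quick smoke test confirms the endpoint and model identifier. This sketch uses only the standard library; the API key is a placeholder, and the request itself is left commented out so you can run it deliberately with a generous timeout:

```python
import json
import urllib.request

BASE_URL = "https://api.haimaker.ai/v1"
API_KEY = "PASTE-YOUR-HAIMAKER-API-KEY"  # placeholder

payload = {
    "model": "z-ai/glm-4.7",
    "messages": [{"role": "user", "content": "Reply with OK."}],
    "max_tokens": 16,
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# Long generations can take minutes at 64K output, so use a large timeout:
# resp = urllib.request.urlopen(req, timeout=600)
# print(json.load(resp)["choices"][0]["message"]["content"])
```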
How it compares
- vs GLM-4.6 — 4.6 has a larger 131K output limit and costs slightly more on output ($1.75/M). Choose 4.7 if you need vision; choose 4.6 if you need the extra output headroom.
- vs Claude 3 Haiku — Haiku is faster for short interactions. GLM-4.7 has a slightly larger context window (203K vs 200K) and wins on output length by a wide margin.
Bottom line
A solid choice when you need vision plus a large output window at a non-frontier price. The latency is the main thing to test in your environment before committing.
For setup instructions, see our API key guide. For all available models, see the complete models guide.