Current as of March 2026. GLM-4.7 is Zhipu’s upgrade over the 4.6 line: 64K output tokens (down from 131K in 4.6, but still large), vision support added, and similar pricing at $0.40/$1.50 per million. The multimodal support is the key differentiator if you’re building OpenClaw agents that need to process images alongside text.
Specs
| Spec | Value |
| --- | --- |
| Provider | Zhipu AI |
| Input cost | $0.40 / M tokens |
| Output cost | $1.50 / M tokens |
| Context window | 203K tokens |
| Max output | 64K tokens |
| Parameters | N/A |
| Features | function_calling, vision, reasoning |
What it’s good at
64K Output with Vision
A 64K output limit is more than most models in this price range offer, and the vision capability makes it versatile. An OpenClaw agent can process screenshots, diagrams, or image attachments without switching models.
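Assuming Haimaker's OpenAI-compatible chat endpoint (the base URL and model identifier come from the setup section below; the image URL is a placeholder), a mixed text-and-image request can be sketched as:

```python
import json

# Sketch of an OpenAI-style chat payload mixing text and an image part.
# The model id matches the config below; the image URL is a placeholder.
payload = {
    "model": "z-ai/glm-4.7",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the error shown in this screenshot."},
                {"type": "image_url", "image_url": {"url": "https://example.com/screenshot.png"}},
            ],
        }
    ],
    "max_tokens": 1024,
}

body = json.dumps(payload)  # POST this to {base_url}/chat/completions
```

The same message shape works for plain text by passing a string as `content`, so one agent loop can handle both cases.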
Competitive Pricing for the Feature Set
$0.40/$1.50 for a model with vision, function calling, and 203K context is good value. Comparable Western models charge more.
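At these rates, per-request cost is easy to estimate. A minimal sketch, with prices hard-coded from the specs table above:

```python
INPUT_PER_M = 0.40   # dollars per million input tokens (specs table)
OUTPUT_PER_M = 1.50  # dollars per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost of one request at GLM-4.7 list prices."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# Example: a 100K-token document scan with an 8K-token answer.
cost = request_cost(100_000, 8_000)  # 0.04 + 0.012 = $0.052
```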
Large Context Window
203K tokens covers most real-world document and codebase inputs without chunking.
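A quick way to decide whether a document needs chunking is a rough token estimate before sending it. The four-characters-per-token heuristic below is an approximation, not a real tokenizer:

```python
CONTEXT_WINDOW = 202_752  # value from the config block below
CHARS_PER_TOKEN = 4       # rough heuristic; actual tokenizers vary by language

def fits_in_context(text: str, reserved_output: int = 64_000) -> bool:
    """Roughly estimate whether `text` plus the output budget fits the window."""
    est_tokens = len(text) / CHARS_PER_TOKEN
    return est_tokens + reserved_output <= CONTEXT_WINDOW
```

Reserving the full 64K output budget is conservative; shrink `reserved_output` if you cap `max_tokens` lower.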
Where it falls short
Latency Through Haimaker
Routing adds overhead. During peak hours, time-to-first-token (TTFT) spikes are noticeable compared to native US providers.
Instruction Following at Long Context
It occasionally drifts from complex, multi-step system prompts. Not as reliable as Claude 3.5 Sonnet on elaborate behavioral constraints.
Best use cases with OpenClaw
- Automated Technical Writing — 64K output lets it write entire manuals or technical guides without truncating mid-section.
- Budget-Conscious RAG — 203K context at $0.40/M input is a good deal for scanning large document sets with vision attached.
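Even with 64K of headroom, manual-length outputs can still hit the cap, so it is worth guarding against truncation. A hedged sketch of a continuation loop: `complete` is a placeholder for your client call and is assumed to return OpenAI-style `(text, finish_reason)` results:

```python
def generate_long(prompt: str, complete, max_rounds: int = 4) -> str:
    """Keep requesting continuations while the model stops at the output limit.

    `complete(messages)` stands in for your client call; it is assumed to
    return (text, finish_reason) in OpenAI-compatible terms.
    """
    messages = [{"role": "user", "content": prompt}]
    parts = []
    for _ in range(max_rounds):
        text, finish_reason = complete(messages)
        parts.append(text)
        if finish_reason != "length":  # "length" means the output cap was hit
            break
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Continue exactly where you left off."})
    return "".join(parts)
```

The `max_rounds` bound keeps a misbehaving model from looping forever at full output cost.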
Not ideal for
- Real-time Voice Agents — Latency variance from the Haimaker endpoint makes it unsuitable for sub-500ms response requirements.
- Mission-Critical Logic — For safety-sensitive or legal reasoning, GPT-4o provides a more consistent baseline.
Run it through Haimaker
Skip juggling API keys. One Haimaker key gives you access to every model on the platform. Tell OpenClaw:
Add Haimaker as a custom provider to my OpenClaw config. Use these details:
- Provider name: haimaker
- Base URL: https://api.haimaker.ai/v1
- API key: [PASTE YOUR HAIMAKER API KEY HERE]
- API type: openai-completions
Add the auto-router model:
- haimaker/auto (reasoning: false, context: 128000, max tokens: 32000)
Create an alias "auto" for easy switching. Apply the config when done.
Or skip model selection entirely — Haimaker’s auto-router picks the best model for each task so you don’t have to.
OpenClaw setup
Configure your OpenClaw provider with the base URL https://api.haimaker.ai/v1 and your Haimaker API key. Use the model identifier z-ai/glm-4.7, and increase your timeout settings to accommodate responses that can run to the full 64K output limit.
{
"models": {
"mode": "merge",
"providers": {
"z-ai": {
"baseUrl": "https://api.haimaker.ai/v1",
"apiKey": "YOUR-Z-AI-API-KEY",
"api": "openai-completions",
"models": [
{
"id": "glm-4.7",
"name": "GLM-4.7",
"cost": {
"input": 0.4,
"output": 1.5
},
"contextWindow": 202752,
"maxTokens": 64000
}
]
}
}
}
}
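Once configured, a quick smoke test confirms the endpoint and model identifier. This sketch uses only the standard library; the API key is a placeholder, and the request itself is left commented out so you can run it deliberately with a generous timeout:

```python
import json
import urllib.request

BASE_URL = "https://api.haimaker.ai/v1"
API_KEY = "PASTE-YOUR-HAIMAKER-API-KEY"  # placeholder

payload = {
    "model": "z-ai/glm-4.7",
    "messages": [{"role": "user", "content": "Reply with OK."}],
    "max_tokens": 16,
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# Long generations can take minutes at 64K output, so use a large timeout:
# resp = urllib.request.urlopen(req, timeout=600)
# print(json.load(resp)["choices"][0]["message"]["content"])
```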
How it compares
- vs GLM-4.6 — 4.6 has a larger 131K output limit and costs slightly more on output ($1.75/M). Choose 4.7 if you need vision; choose 4.6 if you need the extra output headroom.
- vs Claude 3 Haiku — Haiku is faster for short interactions. GLM-4.7 has a slightly larger context window (203K vs 200K) and wins on output length by a wide margin.
Bottom line
A solid choice when you need vision plus a large output window at a non-frontier price. The latency is the main thing to test in your environment before committing.
For setup instructions, see our API key guide. For all available models, see the complete models guide.