Hermes Agent ships with a short list of built-in providers, but its real flexibility is the Custom endpoint option: any service that speaks the OpenAI chat-completions format plugs in with a base URL, a key, and a model name. That covers MiniMax, DeepSeek, Google Gemini, xAI Grok, OpenAI, OpenRouter, a private vLLM box, a local Ollama instance, and haimaker.ai for all of them at once.
Here’s the general setup, then the exact values for the providers people ask about most.
The general method
Every custom provider in Hermes goes through the same three-question flow:
hermes model

- Pick Custom endpoint from the menu.
- Enter the base URL — the OpenAI-compatible root, ending in `/v1` (Hermes appends `/chat/completions` itself).
- Enter your API key for that provider.
- Enter the model identifier the provider expects.
Hermes saves the selection and uses it for every subsequent run. To change providers later, run `hermes model` again. That’s the whole mechanism — the rest of this post is just filling in the right values.
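Under the hood, those three values map onto a standard OpenAI-style chat-completions request. A minimal sketch of what any compatible client assembles from them (the function name and placeholder key are illustrative, not Hermes internals):

```python
import json

def build_request(base_url: str, api_key: str, model: str, prompt: str):
    """Sketch of the OpenAI-style request built from the three values
    you enter at the prompt. Illustrative, not Hermes source code."""
    # The client appends the chat-completions path, which is why the
    # base URL you enter must stop at /v1.
    url = base_url.rstrip("/") + "/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, headers, json.dumps(body)

url, headers, body = build_request(
    "https://api.minimax.io/v1", "sk-...", "MiniMax-M2.5", "hello")
print(url)  # https://api.minimax.io/v1/chat/completions
```

Every provider below plugs into this same shape; only the three input strings change.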
A model only works well in Hermes if it follows tool schemas and carries enough context for the agent’s loops — aim for at least ~64K usable context. See Best Models for Hermes Agent for which model to actually pick.
haimaker.ai — one key for every model
The least painful option: instead of a separate account, key, and base URL per provider, point Hermes at haimaker.ai once and switch models by changing one string.
- Base URL: `https://api.haimaker.ai/v1`
- API key: your haimaker.ai key, from app.haimaker.ai
- Model: any supported model, e.g. `anthropic/claude-sonnet-4-6`, `openai/gpt-5-4-codex`, `google/gemini-3-1-pro`, `minimax/minimax-m2-5`, `deepseek/deepseek-v3-2`, `xai/grok-4-1-fast`, `zai/glm-4-7`, `moonshot/kimi-k2-5`
Models run about 5% below market rate, and you only manage one billing account. Full walkthrough at the end of this post.
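The one-string switch is easy to see in request terms. A sketch, assuming the model IDs listed above (the helper function is hypothetical):

```python
# With one aggregator base URL and one key, switching providers is just a
# different "model" string in an otherwise identical request body.
BASE_URL = "https://api.haimaker.ai/v1"

def chat_body(model: str, prompt: str) -> dict:
    # Same payload shape for every provider behind the endpoint.
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

cheap = chat_body("minimax/minimax-m2-5", "summarize this diff")
strong = chat_body("anthropic/claude-sonnet-4-6", "summarize this diff")
# Endpoint, key, and structure are unchanged; only the model field differs.
assert cheap.keys() == strong.keys()
```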
MiniMax in Hermes
MiniMax M2.5 is the popular budget choice for Hermes — cheap enough to leave running, capable enough to handle tool loops.
- Base URL: `https://api.minimax.io/v1`
- API key: from your MiniMax platform console
- Model: `MiniMax-M2.5` (use the exact ID shown in the MiniMax console — casing matters)
Or get MiniMax through haimaker.ai with model `minimax/minimax-m2-5` and skip the separate account.
Google Gemini in Hermes
Gemini exposes an OpenAI-compatible layer, so it works as a Hermes custom endpoint.
- Base URL: `https://generativelanguage.googleapis.com/v1beta/openai`
- API key: your Google AI Studio key
- Model: `gemini-3-flash` for cheap, fast work; `gemini-3.1-pro` for long-context research
Yes, Hermes Agent supports Google Gemini this way — there’s no separate Gemini integration to enable; it’s just a custom endpoint. Or use `google/gemini-3-1-pro` through haimaker.ai.
DeepSeek in Hermes
- Base URL: `https://api.deepseek.com/v1`
- API key: from the DeepSeek platform
- Model: `deepseek-chat` (points at the current V3.2 line) or `deepseek-reasoner`
Through haimaker.ai: `deepseek/deepseek-v3-2`.
xAI Grok in Hermes
- Base URL: `https://api.x.ai/v1`
- API key: from the xAI console
- Model: `grok-4.1-fast` for cheap large-context work, `grok-code-fast` for code-specific tasks
Through haimaker.ai: `xai/grok-4-1-fast`.
OpenAI / Codex models in Hermes
If you want OpenAI’s own models (including the Codex line) behind Hermes:
- Base URL: `https://api.openai.com/v1`
- API key: an OpenAI API key (a ChatGPT Plus subscription does not include API access — you need a separate API key with billing enabled)
- Model: `gpt-5.4-codex` for coding, `gpt-5.4` for general work
Through haimaker.ai: `openai/gpt-5-4-codex`.
Local models via Ollama
- Base URL: `http://localhost:11434/v1`
- API key: any non-empty string (Ollama ignores it, but Hermes wants a value)
- Model: `gemma4:latest`, `qwen3.5:latest`, or whatever you’ve pulled
ollama pull gemma4
hermes model # Custom endpoint → base URL above → model gemma4:latest
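Before pointing Hermes at a local server, it can help to confirm the endpoint answers. A sketch that builds (without sending) the standard `GET /v1/models` probe — the helper name is hypothetical, and it assumes Ollama’s default port:

```python
from urllib.request import Request

def list_models_request(base_url: str, api_key: str = "ollama") -> Request:
    """Build a GET /v1/models request for an OpenAI-compatible server.
    Ollama ignores the key, but sending one matches real providers."""
    return Request(
        base_url.rstrip("/") + "/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = list_models_request("http://localhost:11434/v1")
print(req.full_url)  # http://localhost:11434/v1/models
```

With Ollama running, passing `req` to `urllib.request.urlopen` returns a JSON list of the models you’ve pulled — the exact IDs to enter in Hermes.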
Common problems
- Timeouts on long steps. Slower providers can exceed Hermes’ default stream read timeout during big agentic turns. Set `HERMES_STREAM_READ_TIMEOUT` (seconds) to a larger value before launching Hermes.
- 401 / auth errors. The key didn’t register, or you pasted a console/session token instead of an API key. Re-run `hermes model` and re-enter it.
- 404 on requests. Your base URL probably includes `/chat/completions` already, or has a trailing slash. Hermes appends the path itself — end the URL at `/v1`.
- Tool calls failing repeatedly. That’s usually the model, not the config. Small models drift on tool schemas; move up a tier.
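The two 404 causes are mechanical enough to check in code. A sketch of a sanity check you could run on a base URL before entering it (the function is hypothetical, not part of Hermes):

```python
def normalize_base_url(url: str) -> str:
    """Fix the common 404 causes: a trailing slash, or a base URL that
    already includes /chat/completions. Hypothetical helper."""
    url = url.strip().rstrip("/")
    suffix = "/chat/completions"
    if url.endswith(suffix):
        # The client appends this path itself, so strip it from the base.
        url = url[: -len(suffix)]
    # Accept the usual /v1 roots plus Gemini's /v1beta/openai layer.
    if not url.endswith(("/v1", "/openai")):
        raise ValueError(f"not an OpenAI-compatible root: {url}")
    return url

print(normalize_base_url("https://api.deepseek.com/v1/"))
# https://api.deepseek.com/v1
print(normalize_base_url("https://api.x.ai/v1/chat/completions"))
# https://api.x.ai/v1
```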
Set up haimaker.ai with Hermes Agent
Step by step, for the one-key path:
1. Create an account and copy an API key at app.haimaker.ai. New accounts get free credits to test with.
2. Run `hermes model`.
3. Choose Custom endpoint.
4. Enter:
   - Base URL: `https://api.haimaker.ai/v1`
   - API key: your haimaker.ai key
   - Model: whichever you want — e.g. `minimax/minimax-m2-5` to start cheap, or `anthropic/claude-sonnet-4-6` for the reliable default
5. Run `hermes`. To switch models later, run `hermes model` again and change the model string — the base URL and key don’t change.
Browse every supported model with live pricing and benchmarks at haimaker.ai.
Related: Best Models for Hermes Agent · Hermes Agent Pricing · Hermes Agent vs Codex CLI