Hermes Agent ships with a short list of built-in providers, but its real flexibility is the Custom endpoint option: any service that speaks the OpenAI chat-completions format plugs in with a base URL, a key, and a model name. That covers MiniMax, DeepSeek, Google Gemini, xAI Grok, OpenAI, OpenRouter, a private vLLM box, and a local Ollama instance, plus haimaker.ai, which reaches all of them through one account.

Here’s the general setup, then the exact values for the providers people ask about most.

The general method

Every custom provider in Hermes goes through the same four-step flow:

hermes model
  1. Pick Custom endpoint from the menu.
  2. Enter the base URL — the OpenAI-compatible root, ending in /v1 (Hermes appends /chat/completions itself).
  3. Enter your API key for that provider.
  4. Enter the model identifier the provider expects.

Hermes saves the selection and uses it for every subsequent run. To change providers later, run hermes model again. That’s the whole mechanism — the rest of this post is just filling in the right values.
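Under the hood this is plain OpenAI chat-completions traffic. Here is a minimal sketch of the request Hermes presumably sends, with placeholder values standing in for what you enter at the prompts (this is illustrative, not Hermes source; the header and body shapes follow the OpenAI spec):

```python
import json
import urllib.request

# Placeholders for the three values you enter at the hermes model prompts.
BASE_URL = "https://api.example.com/v1"   # hypothetical provider root, ending in /v1
API_KEY = "sk-..."                        # your provider key
MODEL = "example-model"                   # provider's model identifier

def chat_completions_request(base_url, api_key, model, prompt):
    """Build an OpenAI-style chat-completions request (a sketch, not Hermes code)."""
    url = base_url.rstrip("/") + "/chat/completions"  # Hermes appends this path itself
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = chat_completions_request(BASE_URL, API_KEY, MODEL, "ping")
print(req.full_url)  # https://api.example.com/v1/chat/completions
```

Every provider below slots into those three placeholders; nothing else about the request changes.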

A model only works well in Hermes if it follows tool schemas and carries enough context for the agent’s loops — aim for at least ~64K usable context. See Best Models for Hermes Agent for which model to actually pick.

haimaker.ai — one key for every model

The least painful option: instead of a separate account, key, and base URL per provider, point Hermes at haimaker.ai once and switch models by changing one string.

  • Base URL: https://api.haimaker.ai/v1
  • API key: your haimaker.ai key, from app.haimaker.ai
  • Model: any supported model, e.g. anthropic/claude-sonnet-4-6, openai/gpt-5-4-codex, google/gemini-3-1-pro, minimax/minimax-m2-5, deepseek/deepseek-v3-2, xai/grok-4-1-fast, zai/glm-4-7, moonshot/kimi-k2-5

Models run about 5% below market rate, and you only manage one billing account. Full walkthrough at the end of this post.
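Model strings must match exactly, so it can be worth listing them before typing one into Hermes. The `/models` route is part of the standard OpenAI-compatible surface, so this sketch should work (assuming haimaker.ai exposes it, as most OpenAI-compatible gateways do; the `HAIMAKER_API_KEY` variable name is my own, not an official one):

```python
import json
import os
import urllib.request

def models_url(base_url):
    # /models sits next to /chat/completions under the same /v1 root
    return base_url.rstrip("/") + "/models"

def list_models(base_url, api_key):
    """Fetch the model IDs the gateway will accept in the Model field."""
    req = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return [m["id"] for m in json.load(resp)["data"]]

# Only hits the network if you've exported a key first.
key = os.environ.get("HAIMAKER_API_KEY")
if key:
    print(list_models("https://api.haimaker.ai/v1", key))
```

The same snippet works against any provider in this post; swap the base URL and key.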

MiniMax in Hermes

MiniMax M2.5 is the popular budget choice for Hermes — cheap enough to leave running, capable enough to handle tool loops.

  • Base URL: https://api.minimax.io/v1
  • API key: from your MiniMax platform console
  • Model: MiniMax-M2.5 (use the exact ID shown in the MiniMax console — casing matters)

Or get MiniMax through haimaker.ai with model minimax/minimax-m2-5 and skip the separate account.

Google Gemini in Hermes

Gemini exposes an OpenAI-compatible layer, so it works as a Hermes custom endpoint.

  • Base URL: https://generativelanguage.googleapis.com/v1beta/openai
  • API key: your Google AI Studio key
  • Model: gemini-3-flash for cheap, fast work; gemini-3.1-pro for long-context research

Yes, Hermes Agent supports Google Gemini this way — there’s no separate Gemini integration to enable, it’s just a custom endpoint. Or use google/gemini-3-1-pro through haimaker.ai.

DeepSeek in Hermes

  • Base URL: https://api.deepseek.com/v1
  • API key: from the DeepSeek platform
  • Model: deepseek-chat (points at the current V3.2 line) or deepseek-reasoner

Through haimaker.ai: deepseek/deepseek-v3-2.

xAI Grok in Hermes

  • Base URL: https://api.x.ai/v1
  • API key: from the xAI console
  • Model: grok-4.1-fast for cheap large-context work, grok-code-fast for code-specific tasks

Through haimaker.ai: xai/grok-4-1-fast.

OpenAI / Codex models in Hermes

If you want OpenAI’s own models (including the Codex line) behind Hermes:

  • Base URL: https://api.openai.com/v1
  • API key: an OpenAI API key (a ChatGPT Plus subscription does not include API access — you need a separate API key with billing enabled)
  • Model: gpt-5.4-codex for coding, gpt-5.4 for general work

Through haimaker.ai: openai/gpt-5-4-codex.

Local models via Ollama

  • Base URL: http://localhost:11434/v1
  • API key: any non-empty string (Ollama ignores it, but Hermes wants a value)
  • Model: gemma4:latest, qwen3.5:latest, or whatever you’ve pulled

ollama pull gemma4
hermes model   # Custom endpoint → base URL above → model gemma4:latest

Common problems

  • Timeouts on long steps. Slower providers can exceed Hermes’ default stream read timeout during big agentic turns. Set HERMES_STREAM_READ_TIMEOUT (seconds) to a larger value before launching Hermes.
  • 401 / auth errors. The key didn’t register, or you pasted a console/session token instead of an API key. Re-run hermes model and re-enter it.
  • 404 on requests. Your base URL probably includes /chat/completions already, or has a trailing slash. Hermes appends the path itself — end the URL at /v1.
  • Tool calls failing repeatedly. That’s usually the model, not the config. Small models drift on tool schemas; move up a tier.
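The 404 case is the one people hit most, and the rule fits in a few lines. This is a sketch of the normalization you should apply to your own input, not Hermes code: strip any trailing slash and any /chat/completions pasted by accident, so the URL ends at the /v1 root.

```python
def normalize_base_url(url):
    """End the base URL at the /v1 root; Hermes appends /chat/completions itself."""
    url = url.strip().rstrip("/")
    suffix = "/chat/completions"
    if url.endswith(suffix):
        url = url[: -len(suffix)]
    return url

print(normalize_base_url("https://api.deepseek.com/v1/chat/completions/"))
# https://api.deepseek.com/v1
```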

Set up haimaker.ai with Hermes Agent

Step by step, for the one-key path:

  1. Create an account and copy an API key at app.haimaker.ai. New accounts get free credits to test with.

  2. Run:

    hermes model
    
  3. Choose Custom endpoint.

  4. Enter:

    • Base URL: https://api.haimaker.ai/v1
    • API key: your haimaker.ai key
    • Model: whichever you want — e.g. minimax/minimax-m2-5 to start cheap, or anthropic/claude-sonnet-4-6 for the reliable default
  5. Run hermes. To switch models later, run hermes model again and change the model string — the base URL and key don’t change.

Browse every supported model with live pricing and benchmarks at haimaker.ai.

GET $10 FREE CREDITS ON HAIMAKER


Related: Best Models for Hermes Agent · Hermes Agent Pricing · Hermes Agent vs Codex CLI