Current as of April 2026. Hermes Agent requires a model that can maintain state across 15+ messaging platforms and execute 47+ built-in tools without hallucinating arguments. While many developers default to Claude for code, its real strength in Hermes is the strict adherence to tool schemas and the ability to manage persistent cross-session memory without drifting from the system prompt.

The quick answer

ModelInput / OutputContextBest For
Claude 3 Haiku$0.25 / $1.25200KThe high-volume message router
Claude 3.5 Haiku$0.80 / $4.00200KThe speed-first tool executor
Claude Haiku 4.5$1.00 / $5.00200KReasoning for budget-conscious agents
Claude 3.7 Sonnet$3.00 / $15200KThe gold standard for autonomous workflows
Claude Sonnet 4$3.00 / $15200KThe redundant middle child
Claude Sonnet 4.5$3.00 / $151MThe redundant middle child
Claude Sonnet 4.6$3.00 / $151MThe redundant middle child
Claude Opus 4.5$5.00 / $25200KThe zero-failure autonomous brain

Start with Claude 3.7 Sonnet unless you have a specific reason to pick another. At $3 per million input tokens, it offers the most reliable reasoning-to-cost ratio. The 64K output limit is more than enough for complex agentic loops, and its native reasoning capabilities ensure that tool-calling chains in Hermes don’t break during long-running workflows.

Claude 3 Haiku — The high-volume message router

This is the cheapest option at $0.25 per million input tokens. It is best used for simple Hermes tasks like basic message classification or routing across Telegram and Discord. It lacks the reasoning depth for complex multi-tool chains, so keep its tasks limited to single-step operations.

Claude 3.5 Haiku — The speed-first tool executor

For $0.80 per million input tokens, you get significantly better tool-calling reliability than the base 3 Haiku. It is the best choice if your Hermes instance needs to respond instantly to user commands across Slack or WhatsApp without the latency of larger models.

Claude Haiku 4.5 — Reasoning for budget-conscious agents

At $1 per million input tokens, this model introduces dedicated reasoning and a 64K output cap to the Haiku tier. It is the entry point for Hermes agents that need to think through tool selection before execution without jumping to the $3 price point of Sonnet.

Claude 3.7 Sonnet — The gold standard for autonomous workflows

This is the most balanced model for Hermes. It handles the 47+ built-in tools with high precision. The reasoning engine prevents the ‘looping’ behavior often seen in smaller models when an agent gets stuck on a specific task.

Claude Sonnet 4 — The redundant middle child

This model is nearly identical to 3.7 Sonnet in pricing and context. Unless you have a specific legacy requirement, prefer 3.7 Sonnet for its more refined reasoning or move to 4.5 for the expanded context window.

Claude Sonnet 4.5 — The redundant middle child

This model is nearly identical to 3.7 Sonnet in pricing and context. Unless you have a specific legacy requirement, prefer 3.7 Sonnet for its more refined reasoning or move to 4.5 for the expanded context window.

Claude Sonnet 4.6 — The redundant middle child

This model is nearly identical to 3.7 Sonnet in pricing and context. Unless you have a specific legacy requirement, prefer 3.7 Sonnet for its more refined reasoning or move to 4.5 for the expanded context window.

Claude Opus 4.5 — The zero-failure autonomous brain

At $5 per million input and $25 per million output, this is for mission-critical agents. Use this when Hermes is managing high-stakes deployments via SSH or Docker where a single tool-calling error could be catastrophic.

Setup in Hermes Agent

To integrate Claude with Hermes, run ‘hermes model’ in your terminal and select ‘Custom endpoint’. Use your Anthropic API key or a provider like OpenRouter. Ensure the base URL points to the /v1/chat/completions endpoint to maintain compatibility with Hermes’ tool-calling logic.

Running through haimaker.ai

Rather than standing up a per-provider account, you can point Hermes at haimaker.ai and get access to Claude alongside every other frontier model through one API key:

  • Base URL: https://api.haimaker.ai/v1
  • Model: anthropic/claude-3-haiku

Direct provider setup

Hermes makes custom endpoints easy. Run:

hermes model

Choose Custom endpoint from the menu. Enter the base URL and model identifier when prompted:

  • Base URL: https://api.haimaker.ai/v1
  • Model: anthropic/claude-3-haiku

Hermes stores the selection and uses it for all subsequent agent runs. You can also set HERMES_STREAM_READ_TIMEOUT and related env vars if you’re hitting slow providers.

Bottom line

For a standard Hermes Agent deployment, start with Claude 3.7 Sonnet for its reliability and reasoning. If you are just building a simple notification bot, Claude 3.5 Haiku will save you money without sacrificing much speed.

RUN CLAUDE IN HERMES WITH HAIMAKER


See our Hermes local-LLM setup guide.