Current as of April 2026. Hermes Agent requires a model that can maintain state across 15+ messaging platforms and execute 47+ built-in tools without hallucinating arguments. While many developers default to Claude for code, its real strength in Hermes is the strict adherence to tool schemas and the ability to manage persistent cross-session memory without drifting from the system prompt.
The quick answer
| Model | Input / Output | Context | Best For |
|---|---|---|---|
| Claude 3 Haiku | $0.25 / $1.25 | 200K | The high-volume message router |
| Claude 3.5 Haiku | $0.80 / $4.00 | 200K | The speed-first tool executor |
| Claude Haiku 4.5 | $1.00 / $5.00 | 200K | Reasoning for budget-conscious agents |
| Claude 3.7 Sonnet | $3.00 / $15 | 200K | The gold standard for autonomous workflows |
| Claude Sonnet 4 | $3.00 / $15 | 200K | The redundant middle child |
| Claude Sonnet 4.5 | $3.00 / $15 | 1M | The redundant middle child |
| Claude Sonnet 4.6 | $3.00 / $15 | 1M | The redundant middle child |
| Claude Opus 4.5 | $5.00 / $25 | 200K | The zero-failure autonomous brain |
Start with Claude 3.7 Sonnet unless you have a specific reason to pick another. At $3 per million input tokens, it offers the most reliable reasoning-to-cost ratio. The 64K output limit is more than enough for complex agentic loops, and its native reasoning capabilities ensure that tool-calling chains in Hermes don’t break during long-running workflows.
Claude 3 Haiku — The high-volume message router
This is the cheapest option at $0.25 per million input tokens. It is best used for simple Hermes tasks like basic message classification or routing across Telegram and Discord. It lacks the reasoning depth for complex multi-tool chains, so keep its tasks limited to single-step operations.
Claude 3.5 Haiku — The speed-first tool executor
For $0.80 per million input tokens, you get significantly better tool-calling reliability than the base 3 Haiku. It is the best choice if your Hermes instance needs to respond instantly to user commands across Slack or WhatsApp without the latency of larger models.
Claude Haiku 4.5 — Reasoning for budget-conscious agents
At $1 per million input tokens, this model introduces dedicated reasoning and a 64K output cap to the Haiku tier. It is the entry point for Hermes agents that need to think through tool selection before execution without jumping to the $3 price point of Sonnet.
Claude 3.7 Sonnet — The gold standard for autonomous workflows
This is the most balanced model for Hermes. It handles the 47+ built-in tools with high precision. The reasoning engine prevents the ‘looping’ behavior often seen in smaller models when an agent gets stuck on a specific task.
Claude Sonnet 4 — The redundant middle child
This model is nearly identical to 3.7 Sonnet in pricing and context. Unless you have a specific legacy requirement, prefer 3.7 Sonnet for its more refined reasoning or move to 4.5 for the expanded context window.
Claude Sonnet 4.5 — The redundant middle child
This model is nearly identical to 3.7 Sonnet in pricing and context. Unless you have a specific legacy requirement, prefer 3.7 Sonnet for its more refined reasoning or move to 4.5 for the expanded context window.
Claude Sonnet 4.6 — The redundant middle child
This model is nearly identical to 3.7 Sonnet in pricing and context. Unless you have a specific legacy requirement, prefer 3.7 Sonnet for its more refined reasoning or move to 4.5 for the expanded context window.
Claude Opus 4.5 — The zero-failure autonomous brain
At $5 per million input and $25 per million output, this is for mission-critical agents. Use this when Hermes is managing high-stakes deployments via SSH or Docker where a single tool-calling error could be catastrophic.
Setup in Hermes Agent
To integrate Claude with Hermes, run ‘hermes model’ in your terminal and select ‘Custom endpoint’. Use your Anthropic API key or a provider like OpenRouter. Ensure the base URL points to the /v1/chat/completions endpoint to maintain compatibility with Hermes’ tool-calling logic.
Running through haimaker.ai
Rather than standing up a per-provider account, you can point Hermes at haimaker.ai and get access to Claude alongside every other frontier model through one API key:
- Base URL:
https://api.haimaker.ai/v1 - Model:
anthropic/claude-3-haiku
Direct provider setup
Hermes makes custom endpoints easy. Run:
hermes model
Choose Custom endpoint from the menu. Enter the base URL and model identifier when prompted:
- Base URL:
https://api.haimaker.ai/v1 - Model:
anthropic/claude-3-haiku
Hermes stores the selection and uses it for all subsequent agent runs. You can also set HERMES_STREAM_READ_TIMEOUT and related env vars if you’re hitting slow providers.
Bottom line
For a standard Hermes Agent deployment, start with Claude 3.7 Sonnet for its reliability and reasoning. If you are just building a simple notification bot, Claude 3.5 Haiku will save you money without sacrificing much speed.
RUN CLAUDE IN HERMES WITH HAIMAKER
See our Hermes local-LLM setup guide.