Does Hermes support Claude's native tool-calling?

Yes, Hermes uses OpenAI-compatible headers, but when using Claude through a proxy or direct endpoint, it maps the tool definitions to Claude's function calling schema automatically.

Which model is best for local Docker management via Hermes?

Claude 3.7 Sonnet is the best choice here. It provides the necessary reasoning to understand file systems and container states without the extreme cost of Opus.

Can I use the 1M context window on mobile messaging platforms?

Yes, but be careful with costs. While Sonnet 4.5 and 4.6 support 1M context, sending that much data back and forth on every Telegram message will get expensive quickly.

Best Claude Models for Hermes Agent (2026): How to Pick

Current as of April 2026. Hermes Agent requires a model that can maintain state across 15+ messaging platforms and execute 47+ built-in tools without hallucinating arguments. While many developers default to Claude for code, its real strength in Hermes is the strict adherence to tool schemas and the ability to manage persistent cross-session memory without drifting from the system prompt.

The quick answer

Model	Input / Output	Context	Best For
Claude 3 Haiku	$0.25 / $1.25	200K	The high-volume message router
Claude 3.5 Haiku	$0.80 / $4.00	200K	The speed-first tool executor
Claude Haiku 4.5	$1.00 / $5.00	200K	Reasoning for budget-conscious agents
Claude 3.7 Sonnet	$3.00 / $15	200K	The gold standard for autonomous workflows
Claude Sonnet 4	$3.00 / $15	200K	The redundant middle child
Claude Sonnet 4.5	$3.00 / $15	1M	The redundant middle child
Claude Sonnet 4.6	$3.00 / $15	1M	The redundant middle child
Claude Opus 4.5	$5.00 / $25	200K	The zero-failure autonomous brain

Start with Claude 3.7 Sonnet unless you have a specific reason to pick another. At $3 per million input tokens, it offers the most reliable reasoning-to-cost ratio. The 64K output limit is more than enough for complex agentic loops, and its native reasoning capabilities ensure that tool-calling chains in Hermes don’t break during long-running workflows.

Claude 3 Haiku — The high-volume message router

This is the cheapest option at $0.25 per million input tokens. It is best used for simple Hermes tasks like basic message classification or routing across Telegram and Discord. It lacks the reasoning depth for complex multi-tool chains, so keep its tasks limited to single-step operations.

Claude 3.5 Haiku — The speed-first tool executor

For $0.80 per million input tokens, you get significantly better tool-calling reliability than the base 3 Haiku. It is the best choice if your Hermes instance needs to respond instantly to user commands across Slack or WhatsApp without the latency of larger models.

Claude Haiku 4.5 — Reasoning for budget-conscious agents

At $1 per million input tokens, this model introduces dedicated reasoning and a 64K output cap to the Haiku tier. It is the entry point for Hermes agents that need to think through tool selection before execution without jumping to the $3 price point of Sonnet.

Claude 3.7 Sonnet — The gold standard for autonomous workflows

This is the most balanced model for Hermes. It handles the 47+ built-in tools with high precision. The reasoning engine prevents the ‘looping’ behavior often seen in smaller models when an agent gets stuck on a specific task.

Claude Sonnet 4 — The redundant middle child

This model is nearly identical to 3.7 Sonnet in pricing and context. Unless you have a specific legacy requirement, prefer 3.7 Sonnet for its more refined reasoning or move to 4.5 for the expanded context window.

Claude Sonnet 4.5 — The redundant middle child

Claude Sonnet 4.6 — The redundant middle child

Claude Opus 4.5 — The zero-failure autonomous brain

At $5 per million input and $25 per million output, this is for mission-critical agents. Use this when Hermes is managing high-stakes deployments via SSH or Docker where a single tool-calling error could be catastrophic.

Setup in Hermes Agent

To integrate Claude with Hermes, run ‘hermes model’ in your terminal and select ‘Custom endpoint’. Use your Anthropic API key or a provider like OpenRouter. Ensure the base URL points to the /v1/chat/completions endpoint to maintain compatibility with Hermes’ tool-calling logic.

Running through haimaker.ai

Rather than standing up a per-provider account, you can point Hermes at haimaker.ai and get access to Claude alongside every other frontier model through one API key:

Base URL: https://api.haimaker.ai/v1
Model: anthropic/claude-3-haiku

Direct provider setup

Hermes makes custom endpoints easy. Run:

hermes model

Choose Custom endpoint from the menu. Enter the base URL and model identifier when prompted:

Base URL: https://api.haimaker.ai/v1
Model: anthropic/claude-3-haiku

Hermes stores the selection and uses it for all subsequent agent runs. You can also set HERMES_STREAM_READ_TIMEOUT and related env vars if you’re hitting slow providers.

Bottom line

For a standard Hermes Agent deployment, start with Claude 3.7 Sonnet for its reliability and reasoning. If you are just building a simple notification bot, Claude 3.5 Haiku will save you money without sacrificing much speed.

RUN CLAUDE IN HERMES WITH HAIMAKER

See our Hermes local-LLM setup guide.