Current as of April 2026. DeepSeek has become the pragmatic choice for running Hermes Agent instances at scale. These models provide the tool-calling reliability needed for 47+ built-in tools while keeping operational costs low enough to run persistent, multi-platform agents on Discord or Slack 24/7.

The quick answer

Model          Input / Output ($/M tokens)   Context   Best For
DeepSeek V3.1  $0.15 / $0.75                 33K       The budget entry-point for stateless bots
DeepSeek V3.2  $0.26 / $0.38                 164K      The primary choice for long-running autonomous workflows
DeepSeek V3    $0.32 / $0.89                 164K      The previous generation, superseded by V3.2
DeepSeek R1    $0.70 / $2.50                 64K       The logic engine for multi-tool orchestration

Start with DeepSeek V3.2 unless you have a specific reason to pick another. It is the most balanced model for autonomous workflows. It provides a massive 164K context window for persistent memory and costs only $0.26/M input and $0.38/M output, making it cheaper to operate than the older V3 while handling longer agent sessions.
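
As a quick sanity check on those numbers, here is a back-of-the-envelope cost comparison between V3.2 and the older V3. The prices come from the table above; the monthly token volumes are hypothetical placeholders for an always-on agent.

```python
# Back-of-the-envelope monthly cost comparison.
# Prices are $ per 1M tokens, taken from the table above; the token
# volumes in the example are hypothetical, not measurements.
PRICES = {
    "deepseek-v3.2": {"input": 0.26, "output": 0.38},
    "deepseek-v3":   {"input": 0.32, "output": 0.89},
}

def monthly_cost(model: str, input_tokens_m: float, output_tokens_m: float) -> float:
    """Dollar cost for the given millions of input/output tokens."""
    p = PRICES[model]
    return input_tokens_m * p["input"] + output_tokens_m * p["output"]

# Example: 500M input tokens and 100M output tokens per month.
print(round(monthly_cost("deepseek-v3.2", 500, 100), 2))  # 168.0
print(round(monthly_cost("deepseek-v3", 500, 100), 2))    # 249.0
```

At identical volume, V3.2 comes out roughly a third cheaper than V3 despite sharing the same 164K context window.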

DeepSeek V3.1 — The budget entry-point for stateless bots

At $0.15/M input, this is the cheapest way to connect Hermes to a messaging platform. The 33K context limit is a significant bottleneck for agents using persistent cross-session memory, so reserve this for simple, ephemeral tasks where the agent doesn’t need to recall long conversation histories.
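
If you do run V3.1, the bot has to trim old history before each call to stay under the 33K limit. A minimal sketch of that pattern follows; the ~4-characters-per-token heuristic and the helper names are illustrative assumptions, not part of Hermes.

```python
# Rough history trimmer for a 33K-token context window.
# The 4-chars-per-token estimate is a common approximation only.
CONTEXT_LIMIT = 33_000
RESERVED_FOR_REPLY = 4_000  # leave headroom for the model's output

def approx_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def trim_history(messages: list[str]) -> list[str]:
    """Drop oldest messages until the rest fit in the context budget."""
    budget = CONTEXT_LIMIT - RESERVED_FOR_REPLY
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):  # walk newest-first
        cost = approx_tokens(msg)
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))

history = ["x" * 200_000, "recent question"]  # the old message alone blows the budget
print(trim_history(history))  # ['recent question']
```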

DeepSeek V3.2 — The primary choice for long-running autonomous workflows

This model is the sweet spot for Hermes. The 164K context window allows the agent to maintain deep memory across multiple messaging sessions. Its output pricing of $0.38/M is nearly half the cost of V3.1, making it more economical for agents that generate long, tool-heavy responses.

DeepSeek V3 — The previous generation, superseded by V3.2

At $0.32/M input and $0.89/M output, V3 costs more than V3.2 for the same 164K context window. There is little reason to pick it for new deployments; keep it only if an existing configuration still depends on it.

DeepSeek R1 — The logic engine for multi-tool orchestration

When Hermes needs to coordinate between multiple MCP servers or handle complex reasoning before acting, R1 is the model to reach for. It is the most expensive of the four at $0.70/M input and $2.50/M output, but its explicit chain-of-thought reasoning avoids the logic loops that cheaper models fall into during long-running, autonomous tasks. Note its 64K context window, which is smaller than V3.2's.
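
A common pattern is to default to V3.2 and escalate to R1 only when a task looks reasoning-heavy. The sketch below illustrates that routing idea; the thresholds and task fields are illustrative assumptions, not a Hermes feature.

```python
# Sketch of a cost-aware model router: cheap default, expensive
# escalation. Thresholds and Task fields are illustrative only.
from dataclasses import dataclass

@dataclass
class Task:
    tool_count: int        # how many tools the planned run touches
    needs_reasoning: bool  # e.g. multi-step planning across MCP servers

def pick_model(task: Task) -> str:
    if task.needs_reasoning or task.tool_count >= 3:
        return "deepseek-r1"   # pay for chain-of-thought only when needed
    return "deepseek-v3.2"     # cheap default for routine work

print(pick_model(Task(tool_count=1, needs_reasoning=False)))  # deepseek-v3.2
print(pick_model(Task(tool_count=5, needs_reasoning=True)))   # deepseek-r1
```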

Setup in Hermes Agent

To integrate DeepSeek, run hermes model and select Custom endpoint. Use your provider’s base URL (e.g., https://api.deepseek.com/v1) and enter the specific model identifier. Ensure your API key has sufficient credits, as DeepSeek’s low pricing often leads to high-volume usage in autonomous loops.

Running through haimaker.ai

Rather than standing up a per-provider account, you can point Hermes at haimaker.ai and get access to DeepSeek alongside every other frontier model through one API key:

  • Base URL: https://api.haimaker.ai/v1
  • Model: deepseek/deepseek-chat-v3.1
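
Before pointing Hermes at the endpoint, you can sanity-check your key with a one-off request. The sketch below uses only the standard library and assumes haimaker.ai speaks the standard OpenAI-compatible /chat/completions protocol; the HAIMAKER_API_KEY variable name is an assumption for illustration.

```python
# Smoke-test helper for an OpenAI-compatible endpoint, stdlib only.
# Assumes the provider follows the /chat/completions protocol.
import json
import os
import urllib.request

BASE_URL = "https://api.haimaker.ai/v1"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('HAIMAKER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )

req = build_request("deepseek/deepseek-chat-v3.1", "ping")
# urllib.request.urlopen(req)  # uncomment to actually send the request
print(req.full_url)  # https://api.haimaker.ai/v1/chat/completions
```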

Direct provider setup

Hermes makes custom endpoints easy. Run:

hermes model

Choose Custom endpoint from the menu. Enter the base URL and model identifier when prompted:

  • Base URL: https://api.deepseek.com/v1
  • Model: deepseek/deepseek-chat-v3.1

Hermes stores the selection and uses it for all subsequent agent runs. You can also set HERMES_STREAM_READ_TIMEOUT and related env vars if you’re hitting slow providers.
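
The usual pattern for such a variable is a numeric read with a safe fallback, as sketched below. The 120-second default here is an illustrative assumption, not Hermes’s actual default.

```python
# Pattern for consuming a timeout env var with a fallback.
# The 120-second default is an illustrative assumption.
import os

def stream_read_timeout(default: float = 120.0) -> float:
    raw = os.environ.get("HERMES_STREAM_READ_TIMEOUT")
    try:
        return float(raw) if raw else default
    except ValueError:
        return default  # ignore malformed values rather than crash

os.environ["HERMES_STREAM_READ_TIMEOUT"] = "300"
print(stream_read_timeout())  # 300.0
```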

Bottom line

For a production-ready Hermes Agent, use DeepSeek V3.2 for daily operations and swap to R1 only when the agent encounters complex reasoning tasks that require deep chain-of-thought processing.

See our Hermes local-LLM setup guide.