People searching “hermes codex” or “hermes agent codex” land in a slightly confusing spot. There are two products named in that query — Hermes Agent (Nous Research’s open-source CLI agent) and Codex (OpenAI’s CLI coding agent, formerly known as the GPT-5 Codex tool). They occupy similar terminal real estate but solve different problems.

This is a developer-honest comparison. Both have strengths. The right pick depends on what you’re doing.

The 30-second version

|  | Hermes Agent | Codex CLI |
| --- | --- | --- |
| Made by | Nous Research (open source) | OpenAI |
| Models | 300+ across any OpenAI-compatible provider | OpenAI only (GPT-5, GPT-5 Codex variants) |
| Standout feature | Self-evolving skills, persistent memory | First-party OpenAI integration, low latency |
| Multi-platform | CLI + Telegram, Slack, Discord, WhatsApp | CLI only |
| Price (tool) | Free (open source) | Free (open source, paid model) |
| Price (model) | Whatever you wire up: $0/day local to $100+/day frontier | OpenAI rates only, typically $5-50/day |
| Best for | Long-running personal agents, model flexibility | OpenAI-native shops, low-latency interactive use |

Pick Hermes if you want model flexibility, persistent memory across sessions, or messaging platform integration. Pick Codex CLI if you’re OpenAI-native and value the first-party polish over portability.

What Hermes Agent actually does

Hermes is a CLI agent that learns the longer you run it. Two design choices make this concrete:

1. Persistent memory. Hermes maintains state across sessions in a structured store. Restart the terminal and it remembers your project layout, your conventions, what you were working on. Most other CLI agents start from zero each time.
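What that looks like mechanically is state that outlives the process. Hermes's actual store format and location aren't documented here, so the sketch below is only a generic illustration of the pattern; the path and keys are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical location and schema -- Hermes's real store is its own.
# The point is only that state is written to disk, not kept in RAM.
STORE = Path.home() / ".hermes" / "memory.json"

def load_memory() -> dict:
    """Read persisted agent state, or start empty on first run."""
    return json.loads(STORE.read_text()) if STORE.exists() else {}

def save_memory(memory: dict) -> None:
    """Write state back so the next session resumes where this one ended."""
    STORE.parent.mkdir(parents=True, exist_ok=True)
    STORE.write_text(json.dumps(memory, indent=2))

memory = load_memory()
memory["project_layout"] = "src/ plus tests/, pytest, ruff"  # learned this session
memory["current_task"] = "migrating the auth module"
save_memory(memory)  # survives a terminal restart
```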

2. Self-evolving skills (GEPA loop). Every 15 tool calls Hermes pauses, evaluates what it just did, and writes a “Skill Document” capturing what worked. The next time it sees a similar task, it pulls the skill instead of re-deriving the approach. The Nous team published a peer-reviewed result showing a ~40% speedup on repeat tasks after sufficient training.
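In pseudocode terms the loop is simple, even if the real thing isn't. Everything below is an illustrative toy, not Hermes's code: the reflection interval comes from the description above, while the task matching and skill format are stand-ins.

```python
REFLECT_EVERY = 15                 # tool calls between reflection passes (per above)
skill_store: dict[str, str] = {}   # task fingerprint -> skill document

def fingerprint(task: str) -> str:
    """Crude similarity key; the real matcher is presumably much smarter."""
    return " ".join(sorted(set(task.lower().split())))

def run_task(task: str, tool_calls: list[str]) -> None:
    key = fingerprint(task)
    if key in skill_store:                           # seen something similar:
        print("reusing skill:", skill_store[key])    # skip the re-derivation
        return
    for i, call in enumerate(tool_calls, start=1):
        print(f"tool call {i}: {call}")
        if i % REFLECT_EVERY == 0:
            # Pause, evaluate the recent trajectory, distill what worked
            skill_store[key] = f"approach that worked for {task!r}"
```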

Plus the practical features:

  • 300+ supported models through OpenAI-compatible endpoints: Anthropic, OpenAI, Google, xAI, DeepSeek, Haimaker, OpenRouter, local Ollama, vLLM. Swap models mid-session with hermes model (see the endpoint sketch after this list).
  • Messaging gateways. A single Hermes instance can be reached from Telegram, Slack, Discord, WhatsApp, and the terminal. The same agent state and memory persist across all of them.
  • Sandboxed code execution. Tool calls run inside a Unix-socket RPC sandbox, not directly on your shell. Less risk of an rm -rf going wrong.
  • 47+ built-in tools: file ops, shell, web search, scheduled tasks, cross-platform messaging.
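"OpenAI-compatible" is doing the heavy lifting in that first bullet: one client, one request shape, any provider that speaks the protocol. A minimal sketch using the OpenAI Python SDK; the Haimaker URL and Claude model name are the ones used later in this article, and the key placeholder is yours to fill.

```python
from openai import OpenAI

# Same client, same request shape -- only base_url and model change.
client = OpenAI(
    base_url="https://api.haimaker.ai/v1",  # any OpenAI-compatible provider
    api_key="YOUR_HAIMAKER_KEY",
)

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Summarize this repo layout."}],
)
print(resp.choices[0].message.content)

# Point the identical code at a local Ollama server instead:
local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
```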

The trade-off: Hermes is fast-moving. Four major releases in three weeks during the March/April 2026 cycle mean the surface keeps shifting. If you want a stable target, pin a release at least one version behind the bleeding edge.

For Hermes-specific model picks, see best Claude models for Hermes, best DeepSeek models for Hermes, and best Qwen models for Hermes.

What Codex CLI actually does

Codex is OpenAI’s terminal coding agent. It’s built around their model lineup — GPT-5, GPT-5 Codex, the o-series — and the integration is tight in a way third-party tools can’t match.

The pitch:

  • First-party polish. Latency is lower than going through OpenRouter or a custom endpoint. Tool schemas are tuned for OpenAI’s models specifically. Streaming behavior is smooth.
  • Local-first execution. Code reads and edits happen on your machine; Codex calls home for inference but doesn’t ship the whole codebase to OpenAI.
  • Strong defaults. No real config required to get started. Drop in your OpenAI API key and it works.

The trade-off: OpenAI only. You can’t switch to Claude when GPT-5 is being stubborn. You can’t switch to MiniMax when you want cheap volume. You can’t run a local model for sensitive code. Some teams accept this trade for the integration polish; others don’t.

If your stack is already OpenAI-native and you're paying OpenAI rates anyway, Codex CLI is the lowest-friction option. If you want to mix models (most users should; see multi-model setup), Hermes, OpenClaw, or OpenCode is the better fit.
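Mixing models doesn't have to mean anything elaborate. Here is a hedged sketch of the cheap-default-plus-frontier-fallback pattern this post keeps arguing for: the MiniMax slug is a guess (check your provider's model list), while the Claude name and endpoint are the ones used in this article.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.haimaker.ai/v1", api_key="YOUR_KEY")

# Try the cheap model first; escalate to the frontier model only on failure.
# "minimax/minimax-m2.5" is a guessed slug -- check your provider's list.
CHEAP = "minimax/minimax-m2.5"
FRONTIER = "anthropic/claude-sonnet-4-6"

def complete(prompt: str) -> str:
    last_error = None
    for model in (CHEAP, FRONTIER):
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                timeout=60,
            )
            return resp.choices[0].message.content
        except Exception as err:  # rate limit, outage, bad model name...
            last_error = err      # ...fall through to the pricier model
    raise RuntimeError(f"all models failed: {last_error}")
```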

Hermes Agent pricing in practice

Hermes itself is free (open source on GitHub). The cost is whatever model you point it at.

A few real cost points for a 24/7 always-on Hermes agent processing ~10M tokens/day:

| Provider | Daily cost | Notes |
| --- | --- | --- |
| Local Qwen3.6 27B via Ollama | $0 | Hardware investment only |
| MiniMax M2.5 via Haimaker | ~$8 | Cheap default for most agent work |
| DeepSeek V3.2 | ~$8 | Comparable to MiniMax, slightly different strengths |
| Claude Sonnet 4.6 direct | ~$110 | Frontier reasoning, expensive for high-volume |
| Haimaker auto-router | ~$15 | Mix of MiniMax + GPT-OSS + Claude per-request |

The auto-router number is what most production Hermes setups land on. (The setup walkthrough works the same with Hermes, since both tools use OpenAI-compatible endpoints.)
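If you want to sanity-check those numbers, the arithmetic is one line per row. The per-million-token rates below are assumptions picked to reproduce the table, not quoted provider pricing; substitute your provider's real rates.

```python
# Back-of-envelope check on the table above. Rates are assumptions chosen
# to land on the article's figures, not quoted pricing.
TOKENS_PER_DAY = 10_000_000  # the always-on workload from above

assumed_rate_per_mtok = {    # blended input+output, $ per 1M tokens
    "MiniMax M2.5 via Haimaker": 0.80,
    "DeepSeek V3.2": 0.80,
    "Claude Sonnet 4.6 direct": 11.00,
}

for model, rate in assumed_rate_per_mtok.items():
    print(f"{model}: ~${TOKENS_PER_DAY / 1_000_000 * rate:.0f}/day")
# -> ~$8, ~$8, ~$110, matching the table
```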

For a developer using Hermes interactively (not 24/7), realistic costs are much lower — typically $1-5/day on cheap models, $5-20/day on Claude or GPT-5.

Codex CLI’s pricing is just OpenAI’s pricing. A typical interactive day on GPT-5 lands around $5-30. There’s no cheap-model option to fall back to.

Setting up Hermes with Claude (or any model)

Run:

```
hermes model
```

Pick Custom endpoint from the menu. Enter:

  • Base URL: https://api.haimaker.ai/v1 (or https://api.anthropic.com/v1, etc.)
  • Model: anthropic/claude-sonnet-4-6 (or whatever fully-qualified name your provider uses)

Hermes stores the choice and uses it for every subsequent run. Switch any time by running hermes model again. There’s no JSON config to edit, which is one of Hermes’s nicer features compared to OpenClaw.
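Before handing the endpoint to Hermes, it's worth confirming it resolves outside Hermes. A small check with the OpenAI Python SDK; the HAIMAKER_API_KEY variable name is an arbitrary choice here, not something Hermes requires.

```python
import os

from openai import OpenAI

# Verify the endpoint and model name independently of Hermes.
client = OpenAI(
    base_url="https://api.haimaker.ai/v1",
    api_key=os.environ["HAIMAKER_API_KEY"],  # arbitrary variable name
)

available = [m.id for m in client.models.list()]
assert "anthropic/claude-sonnet-4-6" in available, available[:10]
print("endpoint and model OK")
```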

Setting up Hermes through Haimaker gives you all 300+ models behind one API key, which is useful if you regularly switch between Claude, GPT, Gemini, and open-source models depending on the task.

Which one to pick

Use Hermes if:

  • You want to mix providers (cheap default + frontier fallback)
  • Persistent memory and self-evolving skills matter for your workflow
  • You want the agent reachable from Slack or Telegram, not just the terminal
  • You care about not being locked to one provider

Use Codex CLI if:

  • You’re already OpenAI-only and paying their rates
  • Latency and integration polish matter more than portability
  • You don’t need cross-platform messaging
  • You prefer a stable, slower-moving tool over a fast-iterating one

For most developers reading this in 2026, the model-flexibility argument tips toward Hermes. The savings from running a cheap default with a frontier fallback are too large to leave on the table, and OpenAI doesn't have a cheap option in the same league as MiniMax M2.5 or DeepSeek V3.2.

For OpenAI-native shops with existing Codex tooling, Codex CLI’s first-party polish is real. Don’t switch just because of model variety if you’re not actually going to use the variety.



Related: Best Claude models for Hermes, Best DeepSeek models for Hermes, Cheapest API for AI coding agents, OpenClaw multi-model setup.