People searching “hermes codex” or “hermes agent codex” land in a slightly confusing spot. There are two products named in that query — Hermes Agent (Nous Research’s open-source CLI agent) and Codex (OpenAI’s CLI coding agent, formerly known as the GPT-5 Codex tool). They occupy similar terminal real estate but solve different problems.

This is a developer-honest comparison. Both have strengths. The right pick depends on what you’re doing.

The 30-second version

|  | Hermes Agent | Codex CLI |
| --- | --- | --- |
| Made by | Nous Research (open source) | OpenAI |
| Models | 300+ across any OpenAI-compatible provider | OpenAI only (GPT-5, GPT-5 Codex variants) |
| Standout feature | Self-evolving skills, persistent memory | First-party OpenAI integration, low latency |
| Multi-platform | CLI + Telegram, Slack, Discord, WhatsApp | CLI only |
| Price (tool) | Free (open source) | Free (open source, paid model) |
| Price (model) | Whatever you wire up: $0/day local to $100+/day frontier | OpenAI rates only, typically $5-50/day |
| Best for | Long-running personal agents, model flexibility | OpenAI-native shops, low-latency interactive use |

Pick Hermes if you want model flexibility, persistent memory across sessions, or messaging platform integration. Pick Codex CLI if you’re OpenAI-native and value the first-party polish over portability.

What Hermes Agent actually does

Hermes is a CLI agent that learns the longer you run it. Two design choices make this concrete:

1. Persistent memory. Hermes maintains state across sessions in a structured store. Restart the terminal and it remembers your project layout, your conventions, what you were working on. Most other CLI agents start from zero each time.
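What that looks like mechanically is state that outlives the process. Hermes's actual store format and location aren't documented here, so the sketch below is only a generic illustration of the pattern; the path and keys are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical location and schema -- Hermes's real store is its own.
# The point is only that state is written to disk, not kept in RAM.
STORE = Path.home() / ".hermes" / "memory.json"

def load_memory() -> dict:
    """Read persisted agent state, or start empty on first run."""
    return json.loads(STORE.read_text()) if STORE.exists() else {}

def save_memory(memory: dict) -> None:
    """Write state back so the next session resumes where this one ended."""
    STORE.parent.mkdir(parents=True, exist_ok=True)
    STORE.write_text(json.dumps(memory, indent=2))

memory = load_memory()
memory["project_layout"] = "src/ plus tests/, pytest, ruff"  # learned this session
memory["current_task"] = "migrating the auth module"
save_memory(memory)  # survives a terminal restart
```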

2. Self-evolving skills (GEPA loop). Every 15 tool calls Hermes pauses, evaluates what it just did, and writes a “Skill Document” capturing what worked. The next time it sees a similar task, it pulls the skill instead of re-deriving the approach. The Nous team published a peer-reviewed result showing a ~40% speedup on repeat tasks after sufficient training.
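In pseudocode terms the loop is simple, even if the real thing isn't. Everything below is an illustrative toy, not Hermes's code: the reflection interval comes from the description above, while the task matching and skill format are stand-ins.

```python
REFLECT_EVERY = 15                 # tool calls between reflection passes (per above)
skill_store: dict[str, str] = {}   # task fingerprint -> skill document

def fingerprint(task: str) -> str:
    """Crude similarity key; the real matcher is presumably much smarter."""
    return " ".join(sorted(set(task.lower().split())))

def run_task(task: str, tool_calls: list[str]) -> None:
    key = fingerprint(task)
    if key in skill_store:                           # seen something similar:
        print("reusing skill:", skill_store[key])    # skip the re-derivation
        return
    for i, call in enumerate(tool_calls, start=1):
        print(f"tool call {i}: {call}")
        if i % REFLECT_EVERY == 0:
            # Pause, evaluate the recent trajectory, distill what worked
            skill_store[key] = f"approach that worked for {task!r}"
```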

Plus the practical features:

  • 300+ supported models through OpenAI-compatible endpoints: Anthropic, OpenAI, Google, xAI, DeepSeek, Haimaker, OpenRouter, local Ollama, vLLM. Swap models mid-session with hermes model (see the endpoint sketch after this list).
  • Messaging gateways. A single Hermes instance can be reached from Telegram, Slack, Discord, WhatsApp, and the terminal. The same agent state and memory persist across all of them.
  • Sandboxed code execution. Tool calls run inside a Unix-socket RPC sandbox, not directly on your shell. Less risk of an rm -rf going wrong.
  • 47+ built-in tools: file ops, shell, web search, scheduled tasks, cross-platform messaging.
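"OpenAI-compatible" is doing the heavy lifting in that first bullet: one client, one request shape, any provider that speaks the protocol. A minimal sketch using the OpenAI Python SDK; the Haimaker URL and Claude model name are the ones used later in this article, and the key placeholder is yours to fill.

```python
from openai import OpenAI

# Same client, same request shape -- only base_url and model change.
client = OpenAI(
    base_url="https://api.haimaker.ai/v1",  # any OpenAI-compatible provider
    api_key="YOUR_HAIMAKER_KEY",
)

resp = client.chat.completions.create(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "Summarize this repo layout."}],
)
print(resp.choices[0].message.content)

# Point the identical code at a local Ollama server instead:
local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
```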

The trade-off: Hermes is fast-moving. Four major releases in three weeks during the March/April 2026 cycle mean the surface keeps shifting. If you want a stable target, pin a release at least one version behind the bleeding edge.

For Hermes-specific model picks, see best Claude models for Hermes, best DeepSeek models for Hermes, and best Qwen models for Hermes.

What Codex CLI actually does

Codex is OpenAI’s terminal coding agent. It’s built around their model lineup — GPT-5, GPT-5 Codex, the o-series — and the integration is tight in a way third-party tools can’t match.

The pitch:

  • First-party polish. Latency is lower than going through OpenRouter or a custom endpoint. Tool schemas are tuned for OpenAI’s models specifically. Streaming behavior is smooth.
  • Local-first execution. Code reads and edits happen on your machine; Codex calls home for inference but doesn’t ship the whole codebase to OpenAI.
  • Strong defaults. No real config required to get started. Drop in your OpenAI API key and it works.

The trade-off: OpenAI only. You can’t switch to Claude when GPT-5 is being stubborn. You can’t switch to MiniMax when you want cheap volume. You can’t run a local model for sensitive code. Some teams accept this trade for the integration polish; others don’t.

If your stack is already OpenAI-native and you're paying OpenAI rates anyway, Codex CLI is the lowest-friction option. If you want to mix models (most users should; see multi-model setup), Hermes, OpenClaw, or OpenCode is the better fit.
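Mixing models doesn't have to mean anything elaborate. Here is a hedged sketch of the cheap-default-plus-frontier-fallback pattern this post keeps arguing for: the MiniMax slug is a guess (check your provider's model list), while the Claude name and endpoint are the ones used in this article.

```python
from openai import OpenAI

client = OpenAI(base_url="https://api.haimaker.ai/v1", api_key="YOUR_KEY")

# Try the cheap model first; escalate to the frontier model only on failure.
# "minimax/minimax-m2.5" is a guessed slug -- check your provider's list.
CHEAP = "minimax/minimax-m2.5"
FRONTIER = "anthropic/claude-sonnet-4-6"

def complete(prompt: str) -> str:
    last_error = None
    for model in (CHEAP, FRONTIER):
        try:
            resp = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                timeout=60,
            )
            return resp.choices[0].message.content
        except Exception as err:  # rate limit, outage, bad model name...
            last_error = err      # ...fall through to the pricier model
    raise RuntimeError(f"all models failed: {last_error}")
```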

Hermes Agent pricing in practice

Hermes itself is free (open source on GitHub). The cost is whatever model you point it at.

A few real cost points for a 24/7 always-on Hermes agent processing ~10M tokens/day:

| Provider | Daily cost | Notes |
| --- | --- | --- |
| Local Qwen3.6 27B via Ollama | $0 | Hardware investment only |
| MiniMax M2.5 via Haimaker | ~$8 | Cheap default for most agent work |
| DeepSeek V3.2 | ~$8 | Comparable to MiniMax, slightly different strengths |
| Claude Sonnet 4.6 direct | ~$110 | Frontier reasoning, expensive for high-volume |
| Haimaker auto-router | ~$15 | Mix of MiniMax + GPT-OSS + Claude per-request |

The auto-router number is what most production Hermes setups land on. (The setup walkthrough works the same with Hermes, since both tools use OpenAI-compatible endpoints.)
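If you want to sanity-check those numbers, the arithmetic is one line per row. The per-million-token rates below are assumptions picked to reproduce the table, not quoted provider pricing; substitute your provider's real rates.

```python
# Back-of-envelope check on the table above. Rates are assumptions chosen
# to land on the article's figures, not quoted pricing.
TOKENS_PER_DAY = 10_000_000  # the always-on workload from above

assumed_rate_per_mtok = {    # blended input+output, $ per 1M tokens
    "MiniMax M2.5 via Haimaker": 0.80,
    "DeepSeek V3.2": 0.80,
    "Claude Sonnet 4.6 direct": 11.00,
}

for model, rate in assumed_rate_per_mtok.items():
    print(f"{model}: ~${TOKENS_PER_DAY / 1_000_000 * rate:.0f}/day")
# -> ~$8, ~$8, ~$110, matching the table
```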

For a developer using Hermes interactively (not 24/7), realistic costs are much lower — typically $1-5/day on cheap models, $5-20/day on Claude or GPT-5.

Codex CLI’s pricing is just OpenAI’s pricing. A typical interactive day on GPT-5 lands around $5-30. There’s no cheap-model option to fall back to.

Setting up Hermes with Claude (or any model)

Run:

```
hermes model
```

Pick Custom endpoint from the menu. Enter:

  • Base URL: https://api.haimaker.ai/v1 (or https://api.anthropic.com/v1, etc.)
  • Model: anthropic/claude-sonnet-4-6 (or whatever fully-qualified name your provider uses)

Hermes stores the choice and uses it for every subsequent run. Switch any time by running hermes model again. There’s no JSON config to edit, which is one of Hermes’s nicer features compared to OpenClaw.
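Before handing the endpoint to Hermes, it's worth confirming it resolves outside Hermes. A small check with the OpenAI Python SDK; the HAIMAKER_API_KEY variable name is an arbitrary choice here, not something Hermes requires.

```python
import os

from openai import OpenAI

# Verify the endpoint and model name independently of Hermes.
client = OpenAI(
    base_url="https://api.haimaker.ai/v1",
    api_key=os.environ["HAIMAKER_API_KEY"],  # arbitrary variable name
)

available = [m.id for m in client.models.list()]
assert "anthropic/claude-sonnet-4-6" in available, available[:10]
print("endpoint and model OK")
```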

Setting up Hermes through Haimaker gives you all 300+ models behind one API key, which is useful if you regularly switch between Claude, GPT, Gemini, and open-source models depending on the task.

Which one to pick

Use Hermes if:

  • You want to mix providers (cheap default + frontier fallback)
  • Persistent memory and self-evolving skills matter for your workflow
  • You want the agent reachable from Slack or Telegram, not just the terminal
  • You care about not being locked to one provider

Use Codex CLI if:

  • You’re already OpenAI-only and paying their rates
  • Latency and integration polish matter more than portability
  • You don’t need cross-platform messaging
  • You prefer a stable, slower-moving tool over a fast-iterating one

For most developers reading this in 2026, the model-flexibility argument tips toward Hermes. The savings from running a cheap default with a frontier fallback are too large to leave on the table, and OpenAI doesn't have a cheap option in the same league as MiniMax M2.5 or DeepSeek V3.2.

For OpenAI-native shops with existing Codex tooling, Codex CLI’s first-party polish is real. Don’t switch just because of model variety if you’re not actually going to use the variety.



Related: Best Claude models for Hermes, Best DeepSeek models for Hermes, Cheapest API for AI coding agents, OpenClaw multi-model setup.