Current as of April 2026. The gpt-4-0314 snapshot is the original legacy powerhouse that established the standard for agentic reasoning, though it remains one of the most expensive options available.
Specs
| Provider | OpenAI |
| Input cost | $30 / M tokens |
| Output cost | $60 / M tokens |
| Context window | 8K tokens |
| Max output | 4K tokens |
| Parameters | N/A |
| Features | function_calling |
What it’s good at
Deterministic Instruction Following
It lacks the ‘laziness’ found in newer versions, making it highly reliable for strictly adhering to Hermes system prompts and identity constraints.
Reliable Tool Execution
The model handles function calling with high precision, rarely hallucinating arguments when interfacing with the 47 built-in Hermes tools.
Where it falls short
Extreme Cost
At $30 per million input and $60 per million output tokens, it is roughly 6x more expensive than GPT-4o for significantly less speed.
Restrictive Context Window
The 8K context limit is a severe bottleneck for autonomous agents that need to maintain long-term memory or process large MCP tool outputs.
Best use cases with Hermes Agent
- Critical Shell Automation — When running shell commands or managing Modal deployments, its lower hallucination rate justifies the premium price for safety.
- Persistent Persona Stability — It maintains a consistent character across multi-platform interactions on Telegram and Slack better than newer, more ‘aligned’ models.
Not ideal for
- High-Volume Messaging — The $60/1M output cost makes it financially unviable for a busy Discord or WhatsApp bot with hundreds of daily users.
- Context-Heavy MCP Tasks — Retrieving large amounts of data via MCP will hit the 8,192 token limit almost immediately, causing the agent to lose its state.
Hermes Agent setup
Explicitly set the model ID to ‘gpt-4-0314’ in your configuration; using the generic ‘gpt-4’ alias will often route to newer versions with different behavior.
Hermes makes custom endpoints easy. Run:
hermes model
Choose Custom endpoint from the menu. Enter the base URL and model identifier when prompted:
- Base URL:
https://api.haimaker.ai/v1 - Model:
openai/gpt-4-0314
Hermes stores the selection and uses it for all subsequent agent runs across whatever platforms you have wired up (Telegram, Discord, Slack, etc.). Tune HERMES_STREAM_READ_TIMEOUT and related env vars if you’re hitting slow providers.
How it compares
- vs GPT-4o — GPT-4o is significantly faster and cheaper ($5/$15), but 0314 is often more thorough with complex logic chains.
- vs Claude 3.5 Sonnet — Sonnet provides a massive 200K context window and better reasoning for a fraction of the cost, making it superior for most Hermes workflows.
Bottom line
A reliable legacy model for surgical precision in tool use, but the 8K context and massive price tag make it a niche choice for modern autonomous agents.
TRY GPT-4 (OLDER V0314) IN HERMES
For more, see our Hermes local-LLM setup guide.