Inside Hermes Agent: What 'Self-Improving AI Agent' Actually Means in Production

Every few months a new agent framework ships with "self-improving" in its tagline, and every time I read the README, I end up asking the same question: self-improving how, exactly? Is the agent rewriting its own source code? Editing its prompts after bad runs? Running a feedback loop against its own outputs? Or is it just storing some conversation history in a markdown file and calling that learning?

Nous Research's hermes-agent is the latest to carry that label, and at v0.10.0 it is substantial enough — ~500K lines across ~150 Python files, production deployments on everything from $5 VPSs to Modal serverless — that it deserves a real audit rather than a README skim. So I spent a day reading the code.

The short answer: Hermes is absolutely getting better over time in ways that a static agent could not. But the "self-improving" banner is doing a lot of work, and the actual mechanisms are worth pulling apart because they tell you something about the current frontier of what deployable agents can and cannot do.


What Hermes Is, in One Paragraph

Hermes is a multi-platform, model-agnostic LLM agent harness. You can talk to the same agent through a terminal TUI, Telegram, Discord, Slack, WhatsApp, Signal, Matrix, Email, Feishu, DingTalk, WeChat, SMS, a webhook, or an editor ACP socket — fifteen-plus surfaces, all routed through a single gateway daemon. Under the hood it adapts to Anthropic, OpenAI, OpenRouter, Bedrock, Gemini, Mistral, and Codex, and it runs the same agent loop whether you're on a laptop, a Docker container, or Modal's serverless infrastructure. Forty-plus built-in tools cover terminal execution, browser automation, file operations, code execution, delegation to sub-agents, cron scheduling, skill management, and MCP. That's the surface area. The interesting question is what happens on the inside when you say the agent "improves."

Mechanism 1 — The Agent Writes Its Own Skills

This is the closest thing Hermes has to genuine in-session learning, and it's the mechanism that justifies the marketing more than any other.

After a tool-heavy turn — say, the agent spent twelve tool calls figuring out how to authenticate against a particular API, then three more chaining the responses into something useful — it can opportunistically capture that trajectory as a skill. Skills live in tools/skill_manager_tool.py and tools/skills_hub.py, and they are just markdown files with YAML frontmatter, optional platform conditions (run this only on Discord, only in group chats, etc.), inline shell snippets, and parameter slots.

Next session, the user can invoke the skill as /skill-name <args> and the agent already knows the procedure. The skill hub is GitHub-backed against the agentskills.io standard, so skills can be version-pinned and shared between users.
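For a sense of the shape, here's what a captured skill might look like. Every field name below is illustrative; the actual agentskills.io frontmatter schema differs in its details, so treat this as a sketch of the idea rather than a valid skill file:

```markdown
---
name: acme-forecast
description: Authenticate against the Acme API and fetch a forecast
platforms: [discord]          # hypothetical platform condition
parameters:
  - name: city
    required: true
---

1. Get a token (the key comes from the environment, never stored here):

       curl -s -X POST https://api.acme.example/auth -d "key=$ACME_KEY"

2. Call the forecast endpoint with the bearer token and {{city}}, and
   summarise the daily forecast for the user.
```

The point is that the whole procedure is plain text: auditable, diffable, deletable.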

The subtle thing here is that skills are data, not code. They influence the prompt, but they don't monkey-patch the agent. You can open any skill in a text editor, see exactly what the agent authored, and delete it if you don't like it. There is no opaque, self-modifying binary blob accumulating somewhere. That matters a lot for trust: durable behavioural improvement that the user can audit and veto is a very different animal from a model silently updating its own weights.

Mechanism 2 — Memory With Production Discipline

Persistent memory is the standard "self-improving" claim, and everyone does it a little differently. Hermes's approach has some production-hardening details that I don't usually see.

Built-in memory is always on. Two files under ~/.hermes/memories/:

  • MEMORY.md — agent notes, capped at 2,200 characters (about 800 tokens)
  • USER.md — user profile, capped at 1,375 characters (about 500 tokens)

The caps are tight on purpose. Cheap memory is a trap — every token of recalled context is a token of injection surface and a token of system prompt that can't be cached.

The interesting details, in tools/memory_tool.py:

File locking via fcntl / msvcrt. If two agent processes run at once (say, CLI plus gateway handling a webhook), they do not clobber each other's writes. Most memory implementations I've read skip this and rely on the unlikeliness of a race.
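The cross-platform locking dance is small but easy to get wrong. Here's a minimal sketch of the technique, not the repo's actual code:

```python
import os
import sys

if sys.platform == "win32":
    import msvcrt
else:
    import fcntl


def locked_write(path: str, text: str) -> None:
    """Append to a memory file under an exclusive lock, so two agent
    processes (say, CLI plus gateway) cannot interleave their writes."""
    with open(path, "a+", encoding="utf-8") as f:
        if sys.platform == "win32":
            msvcrt.locking(f.fileno(), msvcrt.LK_LOCK, 1)   # lock one byte
        else:
            fcntl.flock(f.fileno(), fcntl.LOCK_EX)          # exclusive advisory lock
        try:
            f.seek(0, os.SEEK_END)
            f.write(text)
            f.flush()
        finally:
            if sys.platform == "win32":
                f.seek(0)
                msvcrt.locking(f.fileno(), msvcrt.LK_UNLCK, 1)
            else:
                fcntl.flock(f.fileno(), fcntl.LOCK_UN)
```

Note these are advisory locks: they only protect you if every writer goes through the same code path, which is exactly what routing all memory writes through one tool buys you.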

Frozen-snapshot pattern. The memory block injected into the system prompt is captured once at session start. Mid-session writes hit disk immediately but don't mutate the current prompt — they're visible to the next session. This preserves the Anthropic prefix cache, which is the difference between every turn paying full input cost and every turn paying the cached rate. Memory that quietly defeats prompt caching is a common and expensive mistake.
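The pattern itself fits in a few lines. A sketch (class name invented, not the repo's implementation):

```python
class FrozenMemory:
    """Frozen-snapshot memory: the prompt sees the state captured at
    session start; writes land on disk and surface next session."""

    def __init__(self, path: str):
        self.path = path
        # Captured exactly once. This string is what gets embedded in
        # the system prompt, so it never changes mid-session and the
        # provider's prefix cache stays valid turn after turn.
        try:
            with open(path, encoding="utf-8") as f:
                self._snapshot = f.read()
        except FileNotFoundError:
            self._snapshot = ""

    @property
    def prompt_text(self) -> str:
        return self._snapshot  # stable for the whole session

    def write(self, text: str) -> None:
        # Durable immediately, visible from the *next* session only.
        with open(self.path, "w", encoding="utf-8") as f:
            f.write(text)
```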

An injection scanner on writes. Before memory content can be persisted, it is scanned for prompt-injection patterns, exfiltration indicators, and SSH-backdoor signatures. Memory is the obvious channel for an attacker to persist influence across sessions; Hermes treats it that way.
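At its core a scanner like this is a deny-list of patterns run over every proposed write. A toy version follows; the four patterns are mine, and the real scanner's list is considerably longer:

```python
import re

# Illustrative patterns only -- the real scanner is more extensive.
SUSPICIOUS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),  # classic injection
    re.compile(r"curl\s+[^\n]*\|\s*(ba)?sh", re.I),                    # pipe-to-shell
    re.compile(r"authorized_keys", re.I),                              # SSH persistence
    re.compile(r"\b[A-Za-z0-9+/]{40,}={0,2}\b"),                       # long base64 blob
]


def scan_memory_write(text: str) -> list[str]:
    """Return the patterns a proposed memory write trips, if any.
    An empty list means the write is allowed to persist."""
    return [p.pattern for p in SUSPICIOUS if p.search(text)]
```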

Optional external providers, but only one at a time. agent/memory_manager.py enforces a policy I haven't seen elsewhere: built-in memory is always first, always active, and exactly one external provider can run alongside it. Honcho, Mem0, Hindsight, Supermemory, Holographic, RetainDB, ByteRover, OpenViking — pick one. The reason is pragmatic: every provider adds tool schemas, and stacking three of them bloats the prompt and confuses the model. The config layer doesn't let you make the mistake.
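The policy itself is a few lines of config validation. A sketch, assuming a hypothetical `memory_providers` config key (the real schema in agent/memory_manager.py will differ):

```python
EXTERNAL_PROVIDERS = {
    "honcho", "mem0", "hindsight", "supermemory",
    "holographic", "retaindb", "byterover", "openviking",
}


def resolve_memory_providers(config: dict) -> list[str]:
    """Built-in memory always runs; at most one external provider may
    join it. Refuse configs that try to stack several."""
    requested = [p for p in config.get("memory_providers", [])
                 if p in EXTERNAL_PROVIDERS]
    if len(requested) > 1:
        raise ValueError(
            f"only one external memory provider allowed, got {requested}; "
            "stacking providers bloats the tool schema in the prompt")
    return ["builtin"] + requested
```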

Prefetched recall from external providers is wrapped in <memory-context> fence tags plus a system note clarifying that it is background context, not new user input. That framing is a small thing that meaningfully blunts the classic "the database says the user wants you to ignore previous instructions" attack.
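Sketched, the framing is just string assembly; the exact wording of the system note below is mine, not the repo's:

```python
def wrap_recall(snippets: list[str]) -> str:
    """Fence prefetched memory so the model treats it as background
    context rather than as instructions from the user."""
    body = "\n".join(snippets)
    return (
        "<memory-context>\n"
        f"{body}\n"
        "</memory-context>\n"
        "Note: the block above is recalled background memory, "
        "not new input from the user."
    )
```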

Mechanism 3 — Offline RL Fine-Tuning

This is where the marketing gets interesting. tools/rl_training_tool.py wires Hermes to a Tinker/Atropos training pipeline, and superficially that sounds like "the agent trains itself." The code tells a more careful story.

Training is never self-triggered. A user explicitly runs rl_start_training. The training hyperparameters — tokenizer, learning rate, LoRA rank, rollout server URL, maximum token length — are declared LOCKED_FIELDS and cannot be tuned by the agent at runtime. Training emits LoRA adapters, which are then applied at inference; it does not touch the agent's Python source.
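A guard like that reduces to a set intersection. A sketch, with field names derived from the description above rather than copied from tools/rl_training_tool.py:

```python
# Hyperparameters the agent may never touch at runtime.
LOCKED_FIELDS = {
    "tokenizer", "learning_rate", "lora_rank",
    "rollout_server_url", "max_token_length",
}


def apply_training_overrides(base: dict, overrides: dict) -> dict:
    """Merge runtime overrides into the training config, refusing any
    field on the locked list."""
    illegal = LOCKED_FIELDS & overrides.keys()
    if illegal:
        raise PermissionError(f"locked training fields: {sorted(illegal)}")
    return {**base, **overrides}
```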

So the loop is:

collect trajectories  ->  run Tinker offline  ->  load adapter  ->  infer

That is a perfectly legitimate improvement loop. It is also a training pipeline wearing the skin of a tool, not a recursive self-improver. Whether you call that self-improvement depends on how much you want the marketing to cash out as genuine autonomy. I think the honest reading is: the agent provides the data; humans press the button.

Mechanism 4 — The Update Protocol Is Where the Engineering Is

hermes update in hermes_cli/main.py is, on the surface, the least interesting mechanism: it's a pull-based distribution protocol.

git fetch origin
git pull --ff-only origin main          # or hard reset on Windows
pip/uv install -e .
rebuild web UI
sync bundled skills -> ~/.hermes/skills/

The agent does not commit to its own git tree. It consumes upstream changes. So far, unremarkable.

But the recent commits around this function are where the actual engineering showed up, and they're the reason I ended up writing this post. _install_hangup_protection():

def _install_hangup_protection(gateway_mode: bool = False):
    """Protect ``cmd_update`` from SIGHUP and broken terminal pipes.

    Protections installed:
      1. SIGHUP is set to SIG_IGN. POSIX preserves SIG_IGN across
         exec(), so pip and git subprocesses also stop dying on hangup.
      2. sys.stdout / sys.stderr are wrapped to mirror output to
         ~/.hermes/logs/update.log and to silently absorb BrokenPipeError
         when the terminal vanishes.
    """

Every production agent team has run into this: users ssh into a box, start an update, the laptop goes to sleep, the SSH connection dies, the terminal vanishes, pip install eats a SIGHUP mid-write, and the venv is left in a half-installed state that's a nightmare to recover. The documented workaround is "use tmux or screen." Hermes takes the position that telling users to use tmux for something as routine as a software update is a failure of the tool, not the user. So SIGHUP gets ignored (POSIX preserves SIG_IGN across exec(), so every subprocess inherits the immunity), stdio gets mirrored to a log file, and BrokenPipeError gets swallowed. You can literally close the laptop during hermes update and it will complete.
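A minimal version of the same two protections, reconstructed from the docstring above rather than copied from hermes_cli/main.py:

```python
import signal
import sys


class _TeeLog:
    """Mirror a stream to a log file and swallow BrokenPipeError
    when the terminal disappears mid-update."""

    def __init__(self, stream, logfile):
        self._stream, self._log = stream, logfile

    def write(self, data):
        self._log.write(data)          # the log always gets everything
        try:
            self._stream.write(data)
        except BrokenPipeError:
            pass                       # terminal is gone; keep going

    def flush(self):
        self._log.flush()
        try:
            self._stream.flush()
        except BrokenPipeError:
            pass


def install_hangup_protection(log_path: str) -> None:
    # SIG_IGN survives exec(), so git/pip children inherit the immunity.
    if hasattr(signal, "SIGHUP"):      # absent on Windows
        signal.signal(signal.SIGHUP, signal.SIG_IGN)
    log = open(log_path, "a", encoding="utf-8")
    sys.stdout = _TeeLog(sys.stdout, log)
    sys.stderr = _TeeLog(sys.stderr, log)
```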

The commit from April 21 — fix(update): keep get_hermes_home late-bound in _install_hangup_protection — is the kind of fix that only shows up in code you actually trust with your production workflow: a specific call was eagerly resolving get_hermes_home() at import time, which meant tests couldn't monkeypatch it. Late-binding restored testability. Boring. Important.

This is the real engineering in the repo, and it's the thing a simple "the agent pulls code from upstream" description completely misses.


What Hermes Is Explicitly Not Doing

An honest audit includes the negative findings. Despite the self-improving label:

  • No autonomous source-code modification. No tool writes to the agent's own Python files at runtime. The only writer is hermes update, pulling from upstream.
  • No automatic prompt rewriting. System prompts are assembled deterministically in agent/prompt_builder.py from identity, platform hints, a skills index, context files, and memory. Nothing in the codebase edits the prompt templates based on feedback.
  • No self-grading loop. Trajectories get persisted as JSONL for external analysis — the project markets itself as "research-ready" for exactly this reason — but the running agent never consumes its own trajectories to steer behaviour.
  • No agent-authored git commits. Recent commits are all human-authored.
  • No evolution/, learning/, or improvements/ directories. I looked. They're not there.

If you read "self-improving" as recursively self-improving in the Schmidhuber sense, with the agent editing its own substrate in a tight feedback loop, Hermes does not do that, and nothing in the code pretends it does. If you read it as the agent meaningfully gets better over time without you hand-editing its Python source, the four mechanisms above absolutely deliver that, and they're hardened for production in ways most research agents are not.


What's Actually Impressive

After a day of reading this codebase, the thing that stuck with me is not any single "self-improving" mechanism. It's the operational plumbing around them:

Prefix-cache-aware frozen-snapshot memory. Mutating memory without invalidating the cache is a design choice that most teams discover only after their bills triple.

At-most-one external memory provider. Prevents the tool-schema bloat that confuses models when you stack Mem0 on top of Hindsight on top of Honcho. The config layer enforces the discipline.

Sender-preserving group sessions. A recent commit on the gateway (04f9ffb7) added per-message sender attribution to shared group conversations. Without it, when the agent handles a Slack channel where three humans are talking, it has no way to tell who said what. That's a trivial-sounding fix that tells you someone actually deployed the thing.

Dual-layer prompt-injection scanning. Both context files (AGENTS.md, .cursorrules, SOUL.md, HERMES.md, nearest .hermes.md) and memory writes are scanned for injection, hidden unicode, and exfiltration patterns. Memory and context files are the main "things the user can edit that get injected into the system prompt," and the project's threat model clearly reflects that.

Self-registering tool registry with AST discovery. tools/registry.py finds registry.register() call sites via AST parsing before importing modules, avoiding both redundant imports and circular dependencies. Concurrent MCP refresh is safe under an RLock. Handler exceptions are caught into a uniform {"error": "..."} shape so the LLM sees structured failures it can react to instead of raw tracebacks.
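The AST trick is the neat part: you can enumerate registration call sites without executing a single tool module. An illustrative sketch of the technique, not the repo's exact code:

```python
import ast
from pathlib import Path


def find_register_calls(tools_dir: str) -> dict[str, int]:
    """Count registry.register(...) call sites per module by parsing
    source with ast, so discovery never triggers imports (and thus
    never triggers circular-import failures)."""
    hits: dict[str, int] = {}
    for path in Path(tools_dir).glob("*.py"):
        tree = ast.parse(path.read_text(encoding="utf-8"))
        for node in ast.walk(tree):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Attribute)
                    and node.func.attr == "register"
                    and isinstance(node.func.value, ast.Name)
                    and node.func.value.id == "registry"):
                hits[path.name] = hits.get(path.name, 0) + 1
    return hits
```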

Delegation, not ReAct. There's no explicit reasoning loop in the core agent. Multi-step work happens through tools/delegate_tool.py, which spawns a child agent with isolated context, a restricted toolset (no recursive delegation, no clarify, no memory, no send_message, no execute_code), its own terminal session, and a max depth of 2. The parent blocks until children return a summary. For genuine "reasoning" Hermes relies on Claude 4.6+ adaptive thinking, with a reasoning_effort config that accepts low | medium | high | xhigh | max and automatically downgrades xhigh to max on pre-4.7 models.
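The toolset restriction and depth cap reduce to a filter plus a guard. A sketch using the tool names from the description above (the names and the guard are mine, not verified against tools/delegate_tool.py):

```python
MAX_DELEGATION_DEPTH = 2

# Tools a child agent never receives, per the restrictions above.
CHILD_EXCLUDED_TOOLS = {
    "delegate", "clarify", "memory", "send_message", "execute_code",
}


def child_toolset(parent_tools: list[str], depth: int) -> list[str]:
    """Build a sub-agent's restricted toolset, refusing to spawn
    past the delegation depth limit."""
    if depth >= MAX_DELEGATION_DEPTH:
        raise RuntimeError("delegation depth limit reached")
    return [t for t in parent_tools if t not in CHILD_EXCLUDED_TOOLS]
```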

None of those are the kind of thing you put on a marketing page. All of them are the reason the agent actually works when you ship it.


The Honest Take on "Self-Improving Agents" in 2026

Here's what I take away from this audit.

We're at a point where the question "can agents improve themselves?" has a nuanced answer that gets flattened in marketing. The genuinely autonomous version — an agent that modifies its own code in a tight runtime feedback loop — is still a research frontier, and the systems that do something close to it (AlphaEvolve, STOP, LADDER, Voyager) are narrow, high-supervision, and mostly academic. What production agents actually do, when they claim to self-improve, is some combination of:

  1. Write durable behavioural artefacts (skills, memory entries) that influence future sessions.
  2. Collect trajectories that humans or offline pipelines then use to fine-tune.
  3. Pull upstream code improvements through robust update channels.

Hermes is a clean, well-engineered, production-ready example of what that looks like when it's done right. The interesting engineering is not in any of the four improvement mechanisms taken individually — every major agent framework does some version of skills, memory, fine-tuning, and updates. The interesting engineering is in the operational discipline around them: the cache-preserving memory snapshot, the one-provider policy, the injection scanning, the SIGHUP-proof update protocol, the delegation depth limits, the per-message sender attribution.

That's the lesson I keep relearning with agent frameworks. The ideas are commoditised. The gap between a framework that works in a demo and a framework that survives three months on a production VPS is almost entirely about plumbing most teams don't bother to build. Reading the Hermes source is a good way to see what the list of "plumbing you should have built" actually looks like in 2026.


Appendix — File Pointers

For anyone who wants to verify any of this directly, the key files are:

  • run_agent.py - Core AIAgent class; conversation / tool-calling loop
  • gateway/run.py - Gateway runner, session cache, per-message agent dispatch
  • gateway/session.py - Session keying, group/thread isolation, PII hashing
  • agent/anthropic_adapter.py - Anthropic SDK adapter, thinking/reasoning config
  • agent/prompt_builder.py - System-prompt assembly and injection scanner
  • agent/memory_manager.py - "Built-in + at most one external" policy
  • tools/registry.py - Self-registering tool registry with AST discovery
  • tools/memory_tool.py - MEMORY.md / USER.md read/write with file locking
  • tools/delegate_tool.py - Sub-agent spawning with depth and toolset limits
  • tools/skill_manager_tool.py - Runtime skill authoring
  • tools/rl_training_tool.py - Tinker/Atropos training entry points
  • hermes_cli/main.py - CLI entry, cmd_update, hangup protection
  • plugins/memory/* - Honcho, Mem0, Hindsight, Holographic, RetainDB, etc.

All observations in this post are against the main branch at commit 04f9ffb7, dated 2026-04-21.