Hermes Agent: The Desk That Remembers

12 min read Tiếng Việt
Featured image for NousResearch/hermes-agent — Hermes Agent: The Desk That Remembers

TL;DR

  • What it solves: Most AI agents reset context every session, forcing you to re-explain your stack, your preferences, and your ongoing work. Hermes Agent reads its own memory files at startup so the workbench is already set.
  • Why it matters: Without persistent context, every session is a fresh hire. With it, the agent compounds: preferences carry forward, skills self-improve, and recurring tasks shrink to a single slash command.
  • Best for: Developers, ops engineers, and researchers who want a long-term collaborator that lives in the terminal and follows them to their phone, not one that resets when the tab closes.
  • Main differentiator: The skills system is what separates it from other memory agents. After complex tasks, the agent auto-creates reusable skill files that self-improve on each run and can be published on agentskills.io.
  • Best use case: Returning to a multi-week project and finding the agent already briefed: the branch name, the library preferences, the current blocker, all present before you type the first word.

I asked my AI agent to help me with a Rust project. It did a good job. I came back the next day, asked it for a small update, and was met with a blank stare.

“What Rust project?” it seemed to ask.

I had to re-explain the stack, the file structure, my preference for sqlx over diesel. Ten minutes gone before I could write a single line of code. It felt like training a new intern every single morning.

The problem with most agents is not intelligence. It is the complete absence of any space to persist what they learn. Every session is a clean room. A blank whiteboard. They cannot remember your conventions, your current project, or the fact that you have asked about cargo build three hundred times.

Think of two woodworkers. One walks into a bare room every morning: no tools on the wall, no half-finished project on the bench, no measurements written in marker. The other opens the same door every day and finds everything exactly where they left it, with notes in the margin of the last cutting. One is a contractor. The other has a workbench.

Hermes Agent is the workbench.

It is a Python-based TUI and gateway framework built by Nous Research that wraps any LLM and adds a persistent learning loop. NousResearch/hermes-agent does one thing no stateless chat wrapper does: it keeps a curated, self-managed memory that the agent reads and edits across every session. Not just memory, as you will see. But that is where the story starts.

Real-World Use Cases

The use cases get more interesting the further you are from your desk.

  1. Developer re-entering a project: You closed the terminal Friday afternoon. Monday morning, type hermes. It already knows your Rust project, your sqlx preference, and the feat/auth branch status. No re-onboarding required.

  2. Mobile debugging via Telegram: A server alarm fires at 2am. You are not at your desk. You message your Hermes bot on Telegram: “What is the load on prod right now?” The agent runs the check on your SSH backend, pulls the metrics, and replies with a summary. All using the same memory it built in your terminal that morning.

  3. Cron-driven team briefing: Hermes is scheduled for 8am every weekday. It fetches new GitHub issues, last night’s CI results, and deploy status, then delivers a Slack message. You described this in natural language and never wired any integrations.

  4. Skill reuse across projects: After a complex Django setup, Hermes builds a setup-django skill. The next Flask project does not start from scratch. The agent builds on a procedure it already ran and verified.

  5. Serverless ML dev box: Configure the Modal backend. Your environment lives on cloud GPU infrastructure, hibernates when you close the laptop, and wakes in seconds on demand. The cost between active sessions is nearly nothing. The agent lives on a cloud shelf, not a machine you have to maintain.

  6. Skill sharing: After Hermes builds a Docker deployment skill from your workflow, run /skills and push it to agentskills.io. Another developer downloads it and their agent already knows your exact pattern.

  7. Android remote ops: Termux on your phone, Hermes connected to a home server via SSH backend. Full terminal agent in your pocket with no server to provision.

The through-line is that the agent follows you, not the other way around. But before it can follow you anywhere, it needs somewhere to store what it learns.

How to Use It

The core loop is simpler than it looks.

# Install (Linux, macOS, WSL2, Android Termux)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
source ~/.bashrc

# First run: pick a model provider and configure tools
hermes setup

# Every session after that: just start
hermes

# Check for configuration problems
hermes doctor

# Update without touching memory files
hermes update

The moment you run hermes, the agent reads ~/.hermes/memories/MEMORY.md, USER.md, and SOUL.md, then injects them into the system prompt. There are no flags for this. Sessions continue automatically because persistence is the default, not an option.

Here is what the memory loop looks like in practice.

Before (stateless agent):

You: “I prefer sqlx over diesel for Rust database work.” Agent: “Understood!” (forgets by next session)

After (Hermes session):

You: “I prefer sqlx over diesel for Rust database work.” Agent updates USER.md: entry DB preference: sqlx (not diesel) Next session system prompt includes that entry. You never say it again.

The memory budget is intentionally tight, around 2,200 characters per file. The agent decides what to keep and what to compress, like human long-term memory: it forgets the trivial to save the vital. The three files in ~/.hermes/memories/ each serve a different purpose, and the one people discover last is the most interesting. But that comes up in the getting-started section. First, the feature nobody talks about.

💡 Tip: Drop an AGENTS.md file in any project directory. Hermes reads it for project-scoped context on top of global memory. Think of it as a per-project briefing document: it adds context without overwriting anything.

But memory alone is not what makes Hermes unusual. The part most people miss entirely is what happens after the session ends.

The Skills System

After Hermes completes a sufficiently complex task, it does something most agents skip: it writes down the recipe.

The agent automatically creates a skill file for what it just did. A structured, reusable procedure scoped to that task class. You ask it to set up a Django project. It does the work, then creates a setup-django skill. Next time you start a Flask project, the agent does not start from scratch. It builds on patterns from a procedure it already ran, self-corrected, and stored.

Skills grow. Each invocation gives the agent another chance to refine the procedure based on what actually worked. It is like giving a developer repetitions on real production work instead of toy exercises.

hermes        # inside any session, type / to see available skills
/skills       # list, view, share, or delete individual skills

Shared skills follow the agentskills.io open standard. You publish a skill from your workflow and anyone running Hermes can download it. A Kubernetes deployment skill from an SRE in Berlin. A data pipeline skill from a researcher in Tokyo. The skills hub is young. But the breadcrumb is there: the agent that teaches itself can eventually teach everyone.

Where It Fits (And Where It Doesn’t)

Hermes sits in a specific part of the landscape. It is for people who live in the terminal, need context to survive across days and projects, and want the agent to come to them rather than waiting in a browser tab.

The gateway makes that real: one agent reachable on Telegram, Discord, Slack, WhatsApp, Signal, Email, or Home Assistant. All seven platforms share the same memory and skill set. The agent does not become a different agent when you switch from terminal to phone.

Terminal backends (six options):

BackendBest forCost model
LocalYour dev machineYour hardware
DockerIsolated, reproducible runsYour hardware
SSHRemote servers or VPSYour server
DaytonaServerless, hibernates when idleNear $0 between sessions
SingularityHPC and academic clustersYour cluster
ModalServerless GPU for ML tasksNear $0 between sessions

Daytona and Modal are worth singling out. Your agent’s entire environment, including memory files and skills, lives on cloud infrastructure. Between sessions it hibernates. You pay for compute only while it runs. For an always-available bot on a developer’s salary, that is actually affordable.

Where Hermes does not fit well: teams wanting a shared multi-user agent (it is single-user by design), workflows requiring browser automation (use a browser-focused tool instead), and Windows users not on WSL2 (native Windows is unsupported). The model freedom genuinely helps: switch any time with hermes model, no code changes, no config migration. What the skills system cannot yet solve is portability of context across teammates, which is a harder problem than it sounds.

The Rough Edges

⚠️ Warning: The ~/.hermes/memories/ folder contains your entire agent state: MEMORY.md, USER.md, and the SOUL.md persona file all live there. If you get a new machine and forget to migrate this folder, you lose all learned context and preferences. There is no automatic cloud sync. Treat it like your .ssh directory: back it up before it matters.

A few more honest notes from the README:

  • Messaging gateway setup requires managing API keys and bot tokens per platform. Telegram is straightforward. Signal involves more steps.
  • Memory budget is per-session. Very long project histories require the agent to summarize and compress, which occasionally loses specific detail.
  • Android Termux users must install with the .[termux] extra, not .[all]. The voice dependency in the full installation conflicts with Termux’s environment.
  • The RL training integration (Atropos) requires a separate git submodule init step that the main install script does not cover.
  • Migration from OpenClaw (the predecessor tool) is automated: hermes claw migrate handles SOUL.md, memories, skills, API keys, and messaging settings.

These are friction points, not blockers. The worst one (the portability issue) has a simple answer, but requires you to think about it before you need it.

Getting Started

The minimum path to a real working result:

  1. Install: curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
  2. Load shell: source ~/.bashrc (or ~/.zshrc)
  3. Run the wizard: hermes setup (choose provider and model)
  4. Start the TUI: hermes

From there, useful onboarding is just use. Tell it about your environment. Describe your current project. Tell it how you like your code styled. Then close it. Open it tomorrow. The workbench will be waiting.

The memory directory holds three files:

~/.hermes/memories/
├── MEMORY.md    # environment facts, project conventions, tool preferences
├── USER.md      # technical level, working style, preferences
└── SOUL.md      # persona file: the agent's character and tone

SOUL.md is the one people discover last and appreciate most. It controls how the agent communicates: its verbosity, its tone, whether it adds commentary or just acts. Edit it once and the behavioral fingerprint is baked into every subsequent session. Most people spend the first week with someone else’s personality and do not notice until they go looking.

FAQ

Does Hermes Agent work with local models?

Yes. hermes model lets you point the agent at any OpenAI-compatible endpoint, including Ollama and LocalAI. Nous Portal, OpenRouter (200+ models), z.ai/GLM, Kimi/Moonshot, MiniMax, OpenAI, and Anthropic are all supported out of the box.

What happens if my memory files get too large?

The agent enforces a character budget around 2,200 characters per file. When a session would push memory over that limit, it compresses older entries and prioritizes recent and frequently relevant information.

Can I use Hermes on my phone?

On Android via Termux, yes. Install with the .[termux] extra instead of .[all] to avoid the voice dependency conflict. iOS has no supported path yet.

How is the skills system different from just saving prompts?

A saved prompt is static text. A Hermes skill is an executable procedure that the agent ran, verified, and can update. When the skill runs on a new project variant, the agent refines it based on what actually worked that time.

What is SOUL.md for, specifically?

SOUL.md defines the agent’s persona: its tone, verbosity level, whether it explains its reasoning or just acts, and how it handles uncertainty. It is separate from MEMORY.md (what the agent knows) and USER.md (what you prefer). Edit it once to carry a behavioral fingerprint across every session.

Final Thoughts

I came back to the Rust project on Monday. The workbench was exactly where I left it. The sqlx preference was in USER.md. The feat/auth branch context was in MEMORY.md. The setup-rust-api skill I had built over the previous week was two slash commands away.

The intern problem does not go away on its own. It takes architecture: persistent state, a skills layer that compounds, and a deployment model that puts the agent where you already are. That is what 63,168 stars are paying attention to. The cloud-sync gap and Windows support still need work. But for a terminal-native developer who wants an agent that builds on itself instead of starting over, this is the clearest path I have found.


NousResearch/hermes-agent · MIT · 63168★ · docs

Hoang Yell

Hoang Yell

A software developer and technical storyteller. I spend my time exploring the most interesting open-source repositories on GitHub and presenting them as accessible stories for everyone.