
Show HN: Hmem v2 – Persistent hierarchical memory for AI agents (MCP)

2 points | Bumblebiber | 1 day ago | github.com

Last week I posted hmem here and got some good feedback. I've been heads-down since then and v2 is out.

Quick recap of the idea: AI coding agents forget everything between sessions. Worse, if you switch machines or tools, even in-session memory is gone. Hmem fixes this with a 5-level hierarchy — agents load only L1 summaries on startup (~20 tokens), then drill deeper on demand. Like how you remember "I was in Paris once" before you recall the specific café.

What's new in v2: The tree structure is now properly addressable. Every node gets a compound ID (L0003.2.1), so you can update or append to any branch without touching siblings. update_memory and append_memory work in-place — no delete-and-recreate.
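As a sketch of how compound-ID addressing could work (illustrative only — the types and function names here are assumptions, not hmem's actual code): an ID like L0003.2.1 names root entry L0003, its second child, and that child's first child, so an in-place update touches exactly one node.

```typescript
interface MemNode {
  id: string;
  content: string;
  children: MemNode[];
}

// Resolve a compound ID like "L0003.2.1" against a map of root entries.
function resolve(roots: Map<string, MemNode>, id: string): MemNode | undefined {
  const [rootId, ...path] = id.split(".");
  let node = roots.get(rootId);
  for (const seg of path) {
    node = node?.children[Number(seg) - 1]; // 1-based child indices
  }
  return node;
}

// In-place update: only the addressed node changes; siblings stay untouched.
function updateInPlace(roots: Map<string, MemNode>, id: string, content: string): boolean {
  const node = resolve(roots, id);
  if (!node) return false;
  node.content = content;
  return true;
}
```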

Obsolete entries are never deleted, just archived. They stay searchable and teach future agents what not to do. A summary line shows what's hidden.

Access-count promotion with logarithmic age decay. Frequently-used entries surface automatically — but newer entries aren't buried just because older ones have more history.
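A toy version of that scoring idea (the real formula in hmem may differ; `promotionScore` is an invented name for illustration): access count drives the score up, while age only discounts it logarithmically, so an old entry's head start shrinks over time instead of compounding.

```typescript
// Illustrative only — not hmem's actual scoring code.
function promotionScore(accessCount: number, ageDays: number): number {
  // log1p keeps the decay gentle: 100x the age costs far less
  // than 100x the accesses gains.
  return accessCount / (1 + Math.log1p(ageDays));
}
```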

Session cache with Fibonacci decay. Bulk reads suppress already-seen entries so you don't get the same context dumped every call. Two modes: discover (newest-heavy, good for session start) and essentials (importance-heavy, kicks in after context compression).
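One way such a suppression curve could look (illustrative; hmem's actual cache logic may differ): each time an entry is served in a bulk read, its weight drops to the reciprocal of the next Fibonacci number, so repeats fade as 1, 1/2, 1/3, 1/5, 1/8, ...

```typescript
// Sketch of per-session repeat suppression; class and method names are invented.
class SessionCache {
  private seen = new Map<string, number>(); // id -> times served this session

  // Multiplier applied to an entry's base score before bulk-read ranking.
  weight(id: string): number {
    const n = this.seen.get(id) ?? 0;
    return 1 / fib(n + 2); // n=0 -> 1, n=1 -> 1/2, n=2 -> 1/3, n=3 -> 1/5 ...
  }

  markServed(id: string): void {
    this.seen.set(id, (this.seen.get(id) ?? 0) + 1);
  }
}

function fib(n: number): number {
  let [a, b] = [1, 1];
  for (let i = 1; i < n; i++) [a, b] = [b, a + b];
  return a;
}
```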

TUI viewer for browsing .hmem files — mirrors exactly what the agent sees at session start, including all markers and scoring.

Curator role — a dedicated agent that runs periodically, audits all memory files, merges duplicates, marks stale entries, and prunes low-value content. Also accessible via the "hmem-self-curate" skill.

Still MIT, still npx hmem-mcp init. GitHub: https://github.com/Bumblebiber/hmem

1 comment


Bumblebiber | 1 day ago

Some core features worth calling out:

  5-level lazy loading instead of flat RAG. On spawn, agents receive only L1 — one-line summaries of every memory entry.
   When something looks relevant, they drill into L2, L3, etc. on demand. This keeps startup context small (~2k tokens
  for 200 entries) while still having full verbatim detail available at L5.

  Tab indentation maps directly to tree levels. No schema to think about — just write:

  write_memory(prefix="L", content="Always restart MCP server after recompiling TS
      Running process holds old dist — tool calls return stale results
      Fix: kill $(pgrep -f mcp-server)")

  Indented lines become children. Siblings at the same depth are siblings in the tree. append_memory adds children to
  any existing node without overwriting.
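A minimal parser sketch for that indentation convention (illustrative; the real implementation lives in the repo): count leading tabs per line, keep a stack of open ancestors, and attach each line under the nearest shallower one.

```typescript
interface Node {
  text: string;
  children: Node[];
}

// Turn tab-indented content into a forest: indented lines become children,
// lines at equal depth become siblings.
function parseIndented(content: string): Node[] {
  const roots: Node[] = [];
  const stack: { depth: number; node: Node }[] = [];
  for (const line of content.split("\n")) {
    const depth = line.length - line.replace(/^\t+/, "").length; // leading tabs
    const node: Node = { text: line.trim(), children: [] };
    while (stack.length && stack[stack.length - 1].depth >= depth) stack.pop();
    if (stack.length === 0) roots.push(node);
    else stack[stack.length - 1].node.children.push(node);
    stack.push({ depth, node });
  }
  return roots;
}
```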

  Smart bulk-read selection. Not all L1 entries are shown equally. The v2 selection expands (shows children) for: newest
  N entries per prefix, most-accessed M entries (time-weighted score), and all favorites. Everything else shows as a
  compact title with a [+7 →] hint. This mirrors how human working memory works — recent and frequently-used things are
  immediately accessible.
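The expand-or-collapse decision could be sketched like this (field and parameter names are assumptions for illustration, not hmem's API): union the newest N, the top-M by access score, and all favorites; everything else renders as a compact title with a hidden-children hint.

```typescript
interface Entry {
  id: string;
  title: string;
  createdAt: number;   // epoch ms
  accessScore: number; // time-weighted access count
  favorite: boolean;
  childCount: number;
}

// Return the set of entry IDs whose children are shown in a bulk read.
function selectExpanded(entries: Entry[], newestN: number, topM: number): Set<string> {
  const expanded = new Set<string>();
  [...entries].sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, newestN).forEach(e => expanded.add(e.id));
  [...entries].sort((a, b) => b.accessScore - a.accessScore)
    .slice(0, topM).forEach(e => expanded.add(e.id));
  entries.filter(e => e.favorite).forEach(e => expanded.add(e.id));
  return expanded;
}

// Collapsed entries show only a title plus a hidden-children hint.
function renderCollapsed(e: Entry): string {
  return `${e.id} ${e.title} [+${e.childCount} →]`;
}
```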

  Obsolete, not deleted. When a lesson turns out wrong or a decision gets reversed, you mark it obsolete with a
  correction reference ([E0076]). The old entry gets hidden from bulk reads but stays searchable. Past mistakes still
  teach — you just don't pay tokens for them every session.
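A tiny sketch of that lifecycle (field names invented for illustration; the real schema is in the repo): marking an entry obsolete attaches a correction reference and hides it from bulk reads without deleting it.

```typescript
interface Lesson {
  id: string;
  text: string;
  obsolete?: { correctedBy: string }; // e.g. a reference like "E0076"
}

// Mark a lesson obsolete, pointing at the entry that supersedes it.
function markObsolete(e: Lesson, correctionId: string): Lesson {
  return { ...e, obsolete: { correctedBy: correctionId } };
}

// Bulk reads skip obsolete entries; search would still see them.
const visibleInBulk = (e: Lesson): boolean => !e.obsolete;
```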

  Per-agent isolation + optional shared store. Each agent has its own .hmem SQLite file. There's also a company.hmem
  with role-based access control (worker/al/pl/ceo) for knowledge that should be shared across agents.

  The whole thing is a single MCP server, ~3k lines of TypeScript, backed by SQLite with WAL mode. No vector embeddings,
   no external services. Works offline.

  OpenClaw users: you can integrate hmem directly by pointing your Claw at the GitHub repo — it'll figure out the rest.
  npx hmem-mcp init gets you running in under a minute.