Biologically-inspired persistent memory for AI agents. Automatically prunes stale data, reinforces useful context, and connects related memories through a graph layer.
Get started in 2 commands

recall("Python backend services")
── Round 1: vector search ──
── Round 2: graph expansion ──
✓ 1 stale fact pruned · 1 graph neighbour surfaced
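The two rounds above can be sketched in a few lines: a vector pass ranks memories by similarity, then a graph pass pulls in connected neighbours the query didn't match directly. The `Memory` class, similarity function, and graph layout here are illustrative assumptions, not YourMemory's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    id: str
    text: str
    embedding: list        # vector representation
    neighbours: list = field(default_factory=list)  # graph edges (memory ids)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb) if na and nb else 0.0

def recall(query_vec, store, k=2):
    # Round 1: vector search — top-k memories by cosine similarity.
    ranked = sorted(store.values(),
                    key=lambda m: cosine(query_vec, m.embedding),
                    reverse=True)
    hits = ranked[:k]
    # Round 2: graph expansion — surface connected neighbours that
    # didn't match the query on vocabulary alone.
    seen = {m.id for m in hits}
    for m in list(hits):
        for nid in m.neighbours:
            if nid not in seen:
                hits.append(store[nid])
                seen.add(nid)
    return [m.id for m in hits]
```

A memory about "uses asyncpg" can surface for a "Python backend services" query purely because it is linked to a matching memory, even with no shared vocabulary.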
Three independent benchmarks. Public datasets. Full methodology in BENCHMARKS.md.
Snap Research · 533 QA pairs · Full Stack
vector + graph + decay + resolve · 12 Apr 2026
Stack progression
YourMemory's multi-layer retrieval outperforms Supermemory's pure semantic search on 9 of 10 LoCoMo samples — with no cloud dependency.
The graph expansion layer adds +5 percentage points on top of semantic search and decay alone — measured on the same 533 QA pairs.
All retrieval, pruning, and graph expansion runs fully on your machine — no cloud inference cost, no data leaving your environment.
Developer workflow simulation (3–30 sessions) — stateless baseline vs YourMemory
−84%
At 30 sessions. Memory block stays flat (~76–91 tokens) while stateless history grows O(n). At 3 sessions: −19.7% tokens, −28% per-session context.
−14%
Recalled context eliminates clarifying questions at the start of new sessions. Each clarifying round is a full LLM call that produces zero implementation output.
−4%
Memories below Ebbinghaus strength 0.05 are pruned from retrieval entirely. 3/15 memories pruned in a 60-day synthetic set. Compounds at scale (200+ memories).
Vector search finds what you asked for. The graph finds what you forgot to ask for. Ebbinghaus decides what survives.
Different kinds of memory age at different rates. Important facts persist longer; transient context fades naturally. Related memories stay alive together — no orphaned facts.
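A minimal sketch of the decay model described above: strength falls exponentially with time since last recall, at a rate set by the memory's category, and anything below the 0.05 threshold drops out of retrieval. The category names and half-life constants are assumptions for illustration, not YourMemory's actual values.

```python
import math

# Assumed per-category half-lives (days): important facts persist,
# transient context fades. Constants are illustrative.
HALF_LIFE_DAYS = {"fact": 90.0, "preference": 45.0, "context": 7.0}
PRUNE_THRESHOLD = 0.05  # below this, a memory leaves retrieval entirely

def strength(days_since_recall, category):
    # Ebbinghaus-style exponential forgetting curve.
    half_life = HALF_LIFE_DAYS[category]
    return 0.5 ** (days_since_recall / half_life)

def should_prune(days_since_recall, category):
    return strength(days_since_recall, category) < PRUNE_THRESHOLD
```

Under these constants, a transient context note is gone well before day 60, while a fact recalled two months ago still retains most of its strength.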
Two-round retrieval finds not just what you searched for, but what you forgot to search for. Related memories surface even when they don't share vocabulary with the query.
Multiple agents share context or keep secrets. API keys (ym_ prefix) authenticate each agent. Shared vs private visibility per memory.
Semantic search alone misses memories that are related but worded differently. A second retrieval pass surfaces them automatically.
Finds the most relevant memories for your query — fast and precise.
Related memories that didn't match the query directly are surfaced through the graph layer — nothing slips through.
Using a memory keeps its connected context cluster fresh automatically — the more you use it, the longer it survives.
Connected memory graph
Memories don't decay in isolation. Before a memory is pruned, its connected neighbours are checked — if any are still relevant, the whole cluster stays alive. Related facts age together.
Every time a memory is recalled, its connected neighbours get a freshness boost. The more a cluster of related memories is used, the longer the whole group persists — the system learns what matters to you.
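The cluster-reinforcement behaviour can be sketched as follows: recalling a memory refreshes it fully and passes a partial freshness boost to each graph neighbour, so a frequently used cluster ages as a group. The field names and the 0.5 neighbour factor are illustrative assumptions.

```python
def reinforce(memory_id, store, graph, boost=1.0, neighbour_factor=0.5):
    """Refresh a recalled memory and partially refresh its neighbours.

    store: memory_id -> {"strength": float}; graph: memory_id -> [neighbour ids].
    """
    # The recalled memory is fully refreshed (capped at 1.0).
    store[memory_id]["strength"] = min(1.0, store[memory_id]["strength"] + boost)
    # Connected neighbours get a smaller boost, keeping the cluster alive.
    for nid in graph.get(memory_id, []):
        store[nid]["strength"] = min(1.0, store[nid]["strength"] + boost * neighbour_factor)
```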
Multiple AI agents share context or keep secrets. Each agent authenticates with an API key. You control exactly what each agent can read and write.
Each agent gets a unique ym_ API key. Shown once, never stored in plaintext. Revoke anytime.
Pass the API key in any MCP call. Set visibility to control who can see it.
Without a key → shared memories only. With a key → shared + that agent's private memories.
| Memory stored as | Owner agent | Other agents | No API key |
|---|---|---|---|
| shared | ✓ | ✓ | ✓ |
| private | ✓ | ✗ | ✗ |
Keys hashed with SHA-256 before storage. Revoke anytime with revoke_agent(agent_id, user_id).
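The key-handling scheme above can be sketched like this: a ym_-prefixed key is generated and shown once, only its SHA-256 hash is stored, and visibility checks compare hashes. The function names and key format here are assumptions, not YourMemory's actual internals.

```python
import hashlib
import secrets

def new_agent_key():
    """Return (plaintext_key, stored_hash). The plaintext is shown once."""
    key = "ym_" + secrets.token_hex(16)
    return key, hashlib.sha256(key.encode()).hexdigest()

def visible(memory, presented_key, owner_key_hash):
    # Shared memories are readable with or without a key.
    if memory["visibility"] == "shared":
        return True
    # Private memories: no key means no access.
    if presented_key is None:
        return False
    # Compare the SHA-256 hash of the presented key — the plaintext
    # key is never stored.
    return hashlib.sha256(presented_key.encode()).hexdigest() == owner_key_hash
```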
yourmemory-setup automatically injects a curated instruction set into your agent's global context — telling it exactly when to recall, what to store, and how to prioritise memories. No manual configuration needed.
Recall policy — agent retrieves context before every task automatically
Store / update / ignore decision logic — no duplicate memories, no noise
Importance and category guidance — agent assigns decay rates and priority without being told
Written to ~/.claude/CLAUDE.md — applies globally across all your projects
[1/4] Downloading spaCy model…
✓ en_core_web_sm installed
[2/4] Initialising database…
✓ Database ready
[3/4] Writing MCP config…
✓ Claude Code → ~/.claude/settings.json
[4/4] Injecting memory rules…
✓ Memory rules → ~/.claude/CLAUDE.md
✓ Setup complete. Restart your AI client.
Install, run setup. That's it — spaCy model, database, and client configs are handled automatically.
$ pip install yourmemory
$ yourmemory-setup
Configures everything automatically — language model, database, and MCP config for every detected client on your machine.
✓ Language model ready
✓ Database initialised
✓ Claude Code → ~/.claude/settings.json
✓ Claude Desktop → auto-detected if installed
✓ Cursor / Windsurf / Cline → auto-detected if installed
✓ Memory rules → injected into global agent context
Restart your AI client after setup. YourMemory starts automatically as an MCP server — no background process to manage.
Works with
PostgreSQL (optional — teams / large datasets)
$ pip install 'yourmemory[postgres]'
DATABASE_URL=postgresql://YOUR_USER@localhost:5432/yourmemory
Backend selected automatically from the connection string — no additional config required.
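Scheme-based selection might look like the sketch below: the storage backend is inferred from the DATABASE_URL scheme, falling back to the zero-config local database when no URL is set. The backend names (and the local default) are assumptions for illustration.

```python
import os
from urllib.parse import urlparse

def select_db_backend(database_url=None):
    """Pick a storage backend from the connection string's scheme."""
    url = database_url or os.environ.get("DATABASE_URL")
    if not url:
        return "sqlite"  # assumed zero-config local default
    scheme = urlparse(url).scheme
    if scheme.startswith("postgresql"):
        return "postgres"
    return "sqlite"
```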
$ pip install 'yourmemory[neo4j]'
$ GRAPH_BACKEND=neo4j yourmemory
Default graph runs fully in-process with zero setup. Switch to the production backend for large deployments via the GRAPH_BACKEND env var.
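Env-driven graph backend selection, as described, reduces to a small switch: in-process by default, Neo4j when GRAPH_BACKEND=neo4j. The backend class names here are illustrative assumptions.

```python
import os

def select_graph_backend():
    """Choose the graph layer from the GRAPH_BACKEND env var."""
    backend = os.environ.get("GRAPH_BACKEND", "memory")
    if backend == "neo4j":
        return "Neo4jGraph"    # assumed production backend name
    return "InProcessGraph"    # assumed zero-setup default name
```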