MCP Compatible
Python 3.11 – 3.14
v1.3.0 — Graph Engine

Memory that
ages gracefully.

Biologically-inspired persistent memory for AI agents. Automatically prunes stale data, reinforces useful context, and connects related memories through a graph layer.

Get started in 2 commands
yourmemory · recall

# recall("Python backend services")

── Round 1: vector search ──

"Sachit uses Python at MongoDB" sim 0.61

── Round 2: graph expansion ──

"Docker + K8s production deploys" via graph
"Uses React" 0.04 (Decayed)

✓ 1 stale fact pruned · 1 graph neighbour surfaced

Data-proven
superiority.

Three independent benchmarks. Public datasets. Full methodology in BENCHMARKS.md.

LoCoMo Recall@5

Snap Research · 533 QA pairs · Full Stack

vector + graph + decay + resolve · 12 Apr 2026

YourMemory full stack 52%
Supermemory 28%
Zep Cloud 22%
Mem0 18%

Stack progression

Vector + decay only: 47%
+ graph expansion: 52% (+5pp)
  • +24pp vs Supermemory (86% relative)

    YourMemory's multi-layer retrieval outperforms Supermemory's pure semantic search on 9 of 10 LoCoMo samples — with no cloud dependency.

  • Graph layer: +5pp alone

    The graph expansion layer adds +5 percentage points on top of semantic search and decay alone — measured on the same 533 QA pairs.

  • Zero LLM calls for retrieval

    All retrieval, pruning, and graph expansion runs fully on your machine — no cloud inference cost, no data leaving your environment.

Workflow Efficiency

3-session developer workflow simulation — stateless baseline vs YourMemory

−84%

Token savings

At 30 sessions. Memory block stays flat (~76–91 tokens) while stateless history grows O(n). At 3 sessions: −19.7% tokens, −28% per-session context.

3 sessions: −19.7%
30 sessions: −84.1%
Stale tokens: −100%
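The scaling behind these numbers is simple arithmetic: stateless history re-sends every prior session's context, so per-session cost grows O(n), while the injected memory block stays O(1). The per-session token counts below are assumptions chosen only to show the shape of the curve; they will not reproduce the benchmark's exact figures.

```python
# Illustrative O(n) vs O(1) context growth. The token counts are
# assumed for demonstration, not the benchmark's real measurements.
PER_SESSION = 170   # tokens of prior context a stateless agent re-sends
MEMORY_BLOCK = 85   # flat memory block (midpoint of the 76-91 range above)

def stateless_tokens(n_sessions: int) -> int:
    # Session k re-sends all k-1 earlier sessions: O(n) per session.
    return sum(PER_SESSION * (k - 1) for k in range(1, n_sessions + 1))

def yourmemory_tokens(n_sessions: int) -> int:
    # Each session injects one flat memory block: O(1) per session.
    return MEMORY_BLOCK * n_sessions

def savings(n: int) -> float:
    s, y = stateless_tokens(n), yourmemory_tokens(n)
    return (s - y) / s

# Savings widen with session count as stateless history accumulates.
```

The gap compounds: the stateless total is quadratic in session count, so the relative saving keeps growing as the history does.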

−14%

Fewer LLM calls

Recalled context eliminates clarifying questions at the start of new sessions. Each clarifying round is a full LLM call that produces zero implementation output.

Session 1: 0 saved
Session 2: −1 clarify call
Session 3+: −1 clarify call

−4%

Context pruning

Memories below Ebbinghaus strength 0.05 are pruned from retrieval entirely. 3/15 memories pruned in a 60-day synthetic set. Compounds at scale (200+ memories).

Pruned memories: 20%
Top-5 tokens: 74 → 71
No stale facts injected

Three layers. One engine.

Vector search finds what you asked for. The graph finds what you forgot to ask for. Ebbinghaus decides what survives.

Biologically Pruned

Different kinds of memory age at different rates. Important facts persist longer; transient context fades naturally. Related memories stay alive together — no orphaned facts.
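The aging behaviour described above can be sketched with the classic Ebbinghaus forgetting curve, R = exp(−t/S), where S is a per-category stability. The stability values below are illustrative assumptions, not YourMemory's actual constants; only the 0.05 pruning threshold comes from the figures quoted earlier.

```python
import math

# Assumed per-category stability in days (illustrative, not
# YourMemory's real constants): important facts decay slowly,
# transient context fades fast.
STABILITY = {"preference": 120.0, "fact": 60.0, "context": 14.0}

PRUNE_THRESHOLD = 0.05  # below this strength, a memory leaves retrieval

def strength(category: str, days_since_recall: float) -> float:
    """Ebbinghaus retention: R = exp(-t / S)."""
    return math.exp(-days_since_recall / STABILITY[category])

def is_pruned(category: str, days_since_recall: float) -> bool:
    return strength(category, days_since_recall) < PRUNE_THRESHOLD

# After 60 days: a context memory is at exp(-60/14) ~ 0.014 (pruned),
# while a preference is still at exp(-60/120) ~ 0.61 (kept).
```

Recalling a memory resets t, which is how useful context keeps outliving transient noise.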

Hybrid Graph + Vector

Two-round retrieval finds not just what you searched for, but what you forgot to search for. Related memories surface even when they don't share vocabulary with the query.

Multi-Agent Memory

Multiple agents share context or keep secrets. API keys (ym_ prefix) authenticate each agent. Shared vs private visibility per memory.

New in v1.3.0

Smarter retrieval.

Semantic search alone misses memories that are related but worded differently. A second retrieval pass surfaces them automatically.

1

Semantic search

Finds the most relevant memories for your query — fast and precise.

2

Context expansion

Related memories that didn't match the query directly are surfaced through the graph layer — nothing slips through.
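The two rounds above can be sketched with a toy in-memory index. The vectors and the `EDGES` adjacency map are hand-made illustrations standing in for the real vector store and graph layer, not YourMemory's internals.

```python
# Toy two-round retrieval: vector search, then graph expansion.
MEMORIES = {
    "python": [0.9, 0.1, 0.0],  # "Sachit uses Python at MongoDB"
    "docker": [0.1, 0.9, 0.0],  # "Docker + K8s production deploys"
    "react":  [0.0, 0.0, 1.0],  # unrelated wording
}
EDGES = {"python": ["docker"], "docker": ["python"], "react": []}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def recall(query_vec, k=1):
    # Round 1: semantic search -- top-k by cosine similarity.
    ranked = sorted(MEMORIES, key=lambda m: cosine(query_vec, MEMORIES[m]),
                    reverse=True)
    hits = ranked[:k]
    # Round 2: graph expansion -- surface connected neighbours
    # that never matched the query text.
    expanded = [n for m in hits for n in EDGES[m] if n not in hits]
    return hits + expanded
```

A query vector near "python" returns the Docker memory too, even though it shares no vocabulary with the query, which is exactly the behaviour the demo at the top of the page shows.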

Recall propagation

Using a memory keeps its connected context cluster fresh automatically — the more you use it, the longer it survives.

Connected memory graph

(Diagram: weighted edges (w=0.34, w=0.29) connect Python, MongoDB, DuckDB, spaCy, Docker, and K8s into live clusters; an isolated "dark mode" memory decays alone.)

Chain-aware pruning

Memories don't decay in isolation. Before a memory is pruned, its connected neighbours are checked — if any are still relevant, the whole cluster stays alive. Related facts age together.
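The cluster check can be sketched in a few lines. The strength values and graph below are made up for illustration; the 0.05 threshold is the one quoted in the benchmark section.

```python
PRUNE_THRESHOLD = 0.05

def should_prune(memory_id, strengths, edges):
    """Chain-aware pruning: a weak memory survives if any connected
    neighbour is still above the threshold."""
    if strengths[memory_id] >= PRUNE_THRESHOLD:
        return False  # still strong on its own
    # Prune only if the whole neighbourhood has faded too.
    return all(strengths[n] < PRUNE_THRESHOLD
               for n in edges.get(memory_id, []))

# "dark mode" is weak and isolated -> pruned.
# "docker" is weak but linked to a strong "python" -> the cluster lives.
strengths = {"python": 0.8, "docker": 0.03, "dark_mode": 0.02}
edges = {"docker": ["python"], "python": ["docker"], "dark_mode": []}
```

This is why the document can claim "no orphaned facts": a fact is only dropped when its entire cluster has gone stale.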

Recall propagation

Every time a memory is recalled, its connected neighbours get a freshness boost. The more a cluster of related memories is used, the longer the whole group persists — the system learns what matters to you.
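The one-hop freshness boost can be sketched like this. The boost size (0.1 for neighbours, full reset for the recalled memory) is an assumed illustration, not YourMemory's actual tuning.

```python
def on_recall(memory_id, strengths, edges, neighbour_boost=0.1):
    """Recall propagation: refresh the recalled memory fully and give
    each connected neighbour a smaller boost, capped at 1.0."""
    strengths[memory_id] = 1.0  # direct recall resets strength
    for n in edges.get(memory_id, []):
        strengths[n] = min(1.0, strengths[n] + neighbour_boost)
    return strengths

strengths = {"python": 0.4, "docker": 0.3, "react": 0.2}
edges = {"python": ["docker"], "docker": ["python"], "react": []}
on_recall("python", strengths, edges)
# "python" back to 1.0, "docker" nudged up, "react" untouched
```

Repeated recalls of any memory in a cluster therefore keep the whole cluster above the pruning threshold, which is the "the more you use it, the longer it survives" behaviour described above.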

Multi-agent shared memory.

Multiple AI agents share context or keep secrets. Each agent authenticates with an API key. You control exactly what each agent can read and write.

1

Register an agent

Each agent gets a unique ym_ API key. Shown once, never stored in plaintext. Revoke anytime.

result = register_agent(
  agent_id="coding-agent",
  user_id="sachit",
)
# → ym_xxxx (save once)
2

Store shared or private

Pass the API key in any MCP call. Set visibility to control who can see it.

# shared — all agents see this
store_memory(
  content="DB is Postgres 16",
  api_key="ym_xxxx",
  visibility="shared"
)

# private — only this agent
store_memory(
  content="staging key sk-xxx",
  api_key="ym_xxxx",
  visibility="private"
)
3

Recall with scope

Without a key → shared memories only. With a key → shared + that agent's private memories.

# coding-agent recalls
recall_memory(
  query="database production",
  api_key="ym_xxxx"
)
# ← shared + private memories

# review-agent, no API key
recall_memory(query="database")
# ← shared memories only

Visibility matrix

Memory stored as    Owner agent    Other agents    No API key
shared              ✓              ✓               ✓
private             ✓              ✗               ✗

Keys hashed with SHA-256 before storage. Revoke anytime with revoke_agent(agent_id, user_id).
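Hashed-key authentication follows a standard pattern: store only the SHA-256 digest, compare digests on every call. This is a sketch of that pattern, not YourMemory's actual storage schema; the in-memory `_KEYS` dict stands in for the database.

```python
import hashlib
import secrets

_KEYS = {}  # agent_id -> sha256 hex digest; plaintext is never stored

def register_agent(agent_id: str) -> str:
    key = "ym_" + secrets.token_hex(16)
    _KEYS[agent_id] = hashlib.sha256(key.encode()).hexdigest()
    return key  # shown once -- the caller must save it

def authenticate(agent_id: str, api_key: str) -> bool:
    digest = hashlib.sha256(api_key.encode()).hexdigest()
    return _KEYS.get(agent_id) == digest

def revoke_agent(agent_id: str) -> None:
    _KEYS.pop(agent_id, None)  # revoked keys fail all future checks

key = register_agent("coding-agent")
```

Because only the digest is kept, a leaked database never reveals a usable `ym_` key, and revocation is just deleting the digest.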

Auto-configured

Agent memory rules,
baked in.

yourmemory-setup automatically injects a curated instruction set into your agent's global context — telling it exactly when to recall, what to store, and how to prioritise memories. No manual configuration needed.

  • Recall policy — agent retrieves context before every task automatically

  • Store / update / ignore decision logic — no duplicate memories, no noise

  • Importance and category guidance — agent assigns decay rates and priority without being told

  • Written to ~/.claude/CLAUDE.md — applies globally across all your projects

yourmemory-setup

[1/4] Downloading spaCy model…

en_core_web_sm installed

[2/4] Initialising database…

Database ready

[3/4] Writing MCP config…

Claude Code → ~/.claude/settings.json

[4/4] Injecting memory rules…

Memory rules → ~/.claude/CLAUDE.md

✓ Setup complete. Restart your AI client.

Two commands.

Install, run setup. That's it — spaCy model, database, and client configs are handled automatically.

1. Install
$ pip install yourmemory
2. Setup (run once)
$ yourmemory-setup

Configures everything automatically — language model, database, and MCP config for every detected client on your machine.

Language model ready

Database initialised

Claude Code → ~/.claude/settings.json

Claude Desktop → auto-detected if installed

Cursor / Windsurf / Cline → auto-detected if installed

Memory rules → injected into global agent context

Restart your AI client after setup. YourMemory starts automatically as an MCP server — no background process to manage.

Works with

Claude Code Claude Desktop Cline Cursor Windsurf Continue Zed

PostgreSQL (optional — teams / large datasets)

Install with Postgres support
$ pip install 'yourmemory[postgres]'
Create a .env file
DATABASE_URL=postgresql://YOUR_USER@localhost:5432/yourmemory

Backend selected automatically from the connection string — no additional config required.

Graph backend — production scale (opt-in)
$ pip install 'yourmemory[neo4j]'
$ GRAPH_BACKEND=neo4j yourmemory

Default graph runs fully in-process with zero setup. Switch to the production backend for large deployments via the GRAPH_BACKEND env var.