~/mikita/projects/metalmind/README.md

side project

metalmind

Local-first persistent memory for Claude Code, built around an Obsidian vault. Decisions store, recall hits a loopback CLI (no MCP token tax), and the same vault works across every project.

Visit ↗

Year

2026 — Now

Role

Sole author

Team

Solo

Status

● Live · npm-published

90%

recall hit@1 (1k-note vault, hybrid + rerank)

87 ms

p95 recall latency

~519

cold-session tokens (vs mem0's ~1,319)

0

cloud accounts, daemons, or APIs to keep alive

01. The problem

Every claude invocation is a first meeting. Yesterday’s architectural decision, the reason a library was rejected, the 40-minute debug just finished — all gone by tomorrow. Native CLAUDE.md works for one repo; once notes cross repos, it stops scaling.

Most “memory for AI” tools fix this by registering as MCP servers. That design injects a handful of tool schemas into every Claude session before you’ve prompted anything — every project pays the standing token tax, whether the session needs memory or not.

02. Approach

  • One vault, every project. project: frontmatter and a MOC per project. A decision written in repo A surfaces in repo B if it’s topically relevant. Plain markdown in ~/Knowledge/ — Obsidian still opens the files; grep still searches them; git still versions them.
  • CLI, not MCP. Recall is a Bash call to a co-hosted loopback HTTP server (127.0.0.1:17317) inside the vault watcher process. Zero standing tool schemas — Claude learns the command once from a stamped block in ~/.claude/CLAUDE.md.
  • Hybrid retrieval, deterministic. Semantic (sqlite-vec + BAAI/bge-small-en-v1.5 via fastembed) fused with keyword (FTS5) via RRF. No LLM mediating writes, no fact-extraction layer between you and your notes. Optional cross-encoder rescore for the cases that need it.
  • Walk-away cost zero. metalmind uninstall stops the watcher, restores prior state, clears aliases. The vault is never touched.

Most memory tools spend tokens to find your note. metalmind spends none, then finds it anyway.

03. Outcome

Shipping on npm under metalmind. Used daily across multiple repos as the default Claude Code memory layer.

Measured against the same 1k-note retrieval fixture: 90% hit@1 with hybrid + rerank, p95 latency ~87 ms. Cold-session token cost is roughly 2.5× lower than mem0 as shipped, 8.4× lower on the apples-to-apples MCP comparison (same transport, different schema discipline). Both benchmarks reproducible from the repo.

The forge layer (cross-repo code graph) lands HTTP-route edges across services with provenance tags — every inferred edge carries INFERRED_NAME / INFERRED_ROUTE / INFERRED_URL_LITERAL so Claude can trust-grade what it reads.

Stack

TypeScriptNode.jssqlite-vecfastembedFTS5ObsidianBash CLI