Server data from the Official MCP Registry
Semantic memory for AI agents with hybrid search, knowledge graph, and consolidation
Valid MCP server (1 strong, 1 medium validity signal). 2 known CVEs in dependencies (0 critical, 1 high severity). Package registry verified. Imported from the Official MCP Registry.
5 files analyzed · 3 issues found
Add this to your MCP configuration file:

```json
{
  "mcpServers": {
    "io-github-adelelo13-neuromcp": {
      "args": ["-y", "neuromcp"],
      "command": "npx"
    }
  }
}
```

From the project's GitHub README.
Local-first MCP memory server with closed-loop attribution critic. Oracle-split and distractor-split benchmark numbers both published, with sample-size caveats.
Local-first MCP server with hybrid search, verbatim recall, and crash-resilient session persistence.
```shell
npx neuromcp
```
| Mode | R@5 | R@10 | Hit Rate |
|---|---|---|---|
| Extracted (hybrid) | 100% | 100% | 100% |
Oracle-split LongMemEval isolates the correct memory in a small corpus. Every local MCP memory system claims ~99% here. It measures "does the ranker work on clean inputs" — nothing more.
Same 30 questions + 1000 random distractor memories drawn from other questions' haystacks. The correct memory now competes against real noise.
| Embedder | Distractors | N | R@5 | R@10 | MRR |
|---|---|---|---|---|---|
| Ollama nomic-embed-text | 0 (oracle) | 30 | 100% | 100% | 100% |
| Ollama nomic-embed-text | 200 | 5 | 100% | 100% | 100% |
| Ollama nomic-embed-text | 500 | 30 | 93.3% | 93.3% | 80.3% |
| Ollama nomic-embed-text | 1000 | 5 | 100% | 100% | 74% |
Reproduce: `npx tsx eval/longmemeval-distractor-runner.ts --limit 5 --distractors 1000`
Sample sizes. The 500-distractor row is n=30 (Wilson 95% CI for 28/30 ≈ 78-99% R@5). The 1000-distractor row is n=5 — preliminary, Wilson 95% CI [57%, 100%]. The 1000-distractor n=30 run takes ~36 min on a single Ollama instance; cached-distractor batching is v0.19.0 work. Treat 500-distractor numbers as defensible, 1000-distractor as directionally positive but underpowered.
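The quoted intervals can be checked in a few lines; a minimal Wilson score sketch (the function name and rounding here are ours, not part of neuromcp):

```python
from math import sqrt

def wilson_ci(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion at ~95% confidence."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    margin = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (max(0.0, center - margin), min(1.0, center + margin))

# 28/30 correct at the 500-distractor setting
lo, hi = wilson_ci(28, 30)    # ≈ (0.79, 0.98)
# 5/5 correct at the 1000-distractor setting
lo5, hi5 = wilson_ci(5, 5)    # ≈ (0.57, 1.00)
```

For n=5 the interval spans from 57% to 100% even with a perfect score, which is why the 1000-distractor row is labeled preliminary.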
Head-to-head comparison is explicit v0.19.0 work. Hindsight (local OSS MCP, ~94.6% LongMemEval claimed) and Mem0/Zep publish their own numbers on their own harnesses. Until we run all of them against the same corpus + embedder, calling any local MCP server "state of the art" is marketing, not measurement. neuromcp publishes its numbers with sample-size caveats so you can judge direction; don't read absolute superiority into them yet.
Hybrid ranker (BM25 + vector + attention + graph + usefulness prior) keeps R@5 = 100% at 1000:1 distractor:target ratio on the observed sample. MRR drops to 74% because the correct memory is sometimes not rank-1 but always rank ≤ 5 in what we saw. Earlier v0.18.0 numbers (R@5 23%) were from a test FakeEmbedder — fixed in v0.18.1.
What this benchmark does NOT prove: end-to-end answer correctness, long-horizon multi-session reasoning, or superiority over commercial cloud systems (Mem0, Zep) on their own benchmarks. Those comparisons need their numbers on the same distractor split, which hasn't been published.
AI agents forget everything between sessions. Existing solutions either store flat key-value pairs (useless for real knowledge) or require cloud infrastructure and API keys.
neuromcp gives you two layers of memory:
Inspired by Karpathy's LLM Wiki, Mastra's Observational Memory, and Zep's temporal knowledge graphs — but simpler than all of them. No vector DB, no embeddings pipeline, no cloud. Just Markdown files + Git + hooks.
```
~/.neuromcp/
├── memory.db        ← SQLite: hybrid search, MCP tools
├── wiki/            ← Compiled knowledge (git-tracked)
│   ├── index.md     ← Roadmap — LLM reads this FIRST
│   ├── schema.md    ← Operating rules for the LLM
│   ├── log.md       ← Append-only changelog
│   ├── people/      ← User profiles, preferences
│   ├── projects/    ← Project knowledge (stack, auth, URLs)
│   ├── systems/     ← Infrastructure (tools, MCP servers)
│   ├── patterns/    ← Reusable patterns (error fixes, routing)
│   ├── decisions/   ← Architecture decisions with context
│   └── skills/      ← Repeatable procedures
└── raw/sessions/    ← Raw session logs (auto-generated)
```
| When | What happens |
|---|---|
| Session start | Hook injects index.md + user profile + auto-detected project page (~1300 tokens) |
| During session | LLM updates wiki pages when learning something persistent |
| Every 8 tool calls | Hook reminds LLM to update the wiki |
| Session end | Hook writes raw session log + git auto-commits all wiki changes |
| Crash | Checkpoint every 5 tool calls to file. Git history for rollback. |
Every ~4h the launchd agent runs run-consolidation.sh, which orchestrates four steps end-to-end:

1. consolidate-sessions.py — batches raw sessions per project, asks Claude for a factual summary, and fact-checks it against the raw sources. When the auditor flags specific unsupported claims, the consolidator now auto-strips those lines and re-audits once — so one speculative sentence no longer kills a whole batch.
2. rescue-rejected.py — any batch that still fails is parsed, the unsupported claims are removed, and the cleaned summary is appended to its wiki page. Pure text surgery, no LLM calls.
3. entity-linker.py — cross-links every page: a bare-word mention of another registered entity (people/, projects/, systems/) is added to the page's related: frontmatter. Makes the wiki act like a graph without a separate graph database.
4. rebuild-index.py — regenerates index.md and per-category -index.md files. Categories over 10 pages are auto-split so the session-start router stays compact as the wiki scales.

The pipeline is idempotent — safe to re-run at any time.
- Schema (operating rules) → How to maintain the wiki
- Index (knowledge map) → What knowledge exists
- User profile → Who you are, how you work
- Project page → Current project details (auto-detected from cwd)
- Last session → What happened last time
```shell
npx neuromcp
npx neuromcp-init-wiki
```
This creates the wiki structure, installs hooks (Claude Code) and rules (other editors), and configures everything automatically. Without this step, npx neuromcp still runs as a plain MCP server with 42 tools, but the critic hook that closes the attribution loop is not installed — retrieval works but usefulness scores never accumulate. Safe to run multiple times — won't overwrite existing config.
neuromcp works with any MCP-compatible editor. Two tiers of integration:
| Feature | Claude Code | Cursor / Windsurf / Cline / Copilot / JetBrains / Zed |
|---|---|---|
| MCP tools (40+) | Full | Full |
| Context at session start | Hooks (automatic) | Rules (LLM-driven, best-effort) |
| Persist at session end | Hooks (automatic) | Rules (LLM-driven, best-effort) |
| Wiki reminders | Every 8 tool calls | No |
| Crash-resilient checkpoints | Yes | No |
Claude Code gets the full experience via native hooks — context injection and persistence happen automatically, even if the LLM forgets.
Other editors get rules files that instruct the LLM to call neuromcp tools at session start/end. This depends on LLM compliance — it works well in practice but is not guaranteed like hooks.
```shell
# Auto-detect installed editors
npx neuromcp-init-wiki

# Target a specific editor
npx neuromcp-init-wiki --editor cursor

# Install rules for all supported editors
npx neuromcp-init-wiki --editor all
```
Supported editors: cursor, windsurf, cline, copilot (VS Code), jetbrains, zed
```shell
ollama pull nomic-embed-text
```
neuromcp auto-detects it. No config needed.
```jsonc
// ~/.claude.json → mcpServers
{
  "neuromcp": {
    "type": "stdio",
    "command": "npx",
    "args": ["-y", "neuromcp"]
  }
}
```
```jsonc
// ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "neuromcp": {
      "command": "npx",
      "args": ["-y", "neuromcp"]
    }
  }
}
```
Same format — add to your editor's MCP settings.
```jsonc
// .mcp.json in project root
{
  "mcpServers": {
    "neuromcp": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "neuromcp"],
      "env": {
        "NEUROMCP_DB_PATH": ".neuromcp/memory.db",
        "NEUROMCP_NAMESPACE": "my-project"
      }
    }
  }
}
```
| Tool | Description |
|---|---|
| store_memory | Store with semantic dedup, contradiction detection, surprise scoring, entity extraction. |
| search_memory | Hybrid vector + FTS search with RRF ranking, graph boost, cognitive priming. Returns explain metadata (trust, contradictions, claims, confidence). |
| recall_memory | Retrieve by ID, namespace, category, or tags — no semantic search. |
| forget_memory | Soft-delete (tombstone). Supports dry_run. |
| consolidate | Dedup, decay, prune, sweep. commit=false for preview, true to apply. |
| memory_stats | Counts, categories, trust distribution, DB size. |
| export_memories | Export as JSONL or JSON. |
| import_memories | Import with content-hash dedup. |
| search_all | Unified search across extracted memories and verbatim text with source labels. |
| Tool | Description |
|---|---|
| store_verbatim | Store raw conversation text — no summarization, never pruned. |
| search_verbatim | Full-text search (FTS5) on verbatim entries for exact recall. |
| verbatim_stats | Stats on verbatim storage: total entries, size, distribution. |
| URI | Description |
|---|---|
| memory://stats | Global statistics |
| memory://recent | Last 20 memories |
| memory://namespaces | All namespaces with counts |
| memory://health | Server health + metrics |
| memory://stats/{namespace} | Per-namespace stats |
| memory://recent/{namespace} | Recent in namespace |
| memory://id/{id} | Single memory by ID |
| memory://tag/{tag} | Memories by tag |
| memory://namespace/{ns} | All in namespace |
| memory://consolidation/log | Recent consolidation entries |
| memory://operations | Active/recent operations |
| Prompt | Description |
|---|---|
| memory_context_for_task | Search relevant memories and format as LLM context |
| review_memory_candidate | Show proposed memory alongside near-duplicates |
| consolidation_dry_run | Preview consolidation without applying |
The wiki is the compiled, human-readable knowledge layer. It replaces the chaos of session logs with structured, interlinked Markdown pages.
| Traditional RAG | neuromcp Wiki |
|---|---|
| Re-derives answers every query | Knowledge compiled once, refined over time |
| Chunking artifacts, retrieval noise | Human-readable pages with source citations |
| Vector DB, embedding pipeline | Plain Markdown + Git |
| Black box retrieval | Auditable, editable, portable |
| Knowledge evaporates | Knowledge compounds |
```markdown
---
title: My Project
type: project
created: 2026-04-06
updated: 2026-04-06
confidence: high
related: [other-project, oauth-setup]
---

# My Project

Description, stack, auth, deployment details...
```
The wiki works automatically once hooks are installed: the LLM reads index.md at session start to know what knowledge exists. You can also browse and edit the wiki manually — it's just Markdown files.
Once you accumulate raw session logs, the wiki can be kept fresh automatically. A scheduled job reads unprocessed sessions, groups them per project (by detecting $HOME/projects/<name> paths in the session content), and uses the claude CLI to synthesise a ## [date] entry into the right wiki page.
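The per-project grouping reduces to a path regex over the session text; a hedged sketch (the home directory, pattern, and function name are assumptions, not neuromcp's actual code):

```python
import re

HOME = "/Users/alice"  # stand-in for $HOME; the real job resolves this at runtime

# Match $HOME/projects/<name> and capture the project name.
PROJECT_RE = re.compile(re.escape(HOME) + r"/projects/([A-Za-z0-9._-]+)")

def projects_mentioned(session_text: str) -> set[str]:
    """All project names a raw session log refers to."""
    return set(PROJECT_RE.findall(session_text))

text = "edited /Users/alice/projects/neuromcp/src/index.ts and /Users/alice/projects/blog/post.md"
projects_mentioned(text)  # {'neuromcp', 'blog'}
```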
```shell
npx neuromcp-enable-consolidation
```
What it installs:

- ~/.neuromcp/scripts/consolidate-sessions.py — the worker
- ~/.neuromcp/scripts/run-consolidation.sh — threshold-guarded runner
- a launchd agent (com.neuromcp.consolidate)

Requirements:

- python3 ≥ 3.8 on PATH
- claude CLI on PATH

Guards built in:

- a ledger (~/.neuromcp/consolidation-ledger.json) makes re-runs idempotent
- a cap on sessions per claude call (override with --max-sessions)

Uninstall: npx neuromcp-enable-consolidation --uninstall
Change interval: npx neuromcp-enable-consolidation --interval 7200 (every 2 hours)
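The ledger guard that makes re-runs idempotent can be sketched in a few lines; the path and field names here are illustrative assumptions, not the real ledger schema:

```python
import json
from pathlib import Path

LEDGER = Path("/tmp/consolidation-ledger.json")  # real ledger lives in ~/.neuromcp/

def load_ledger() -> set[str]:
    """Session IDs already consolidated."""
    if LEDGER.exists():
        return set(json.loads(LEDGER.read_text()).get("processed", []))
    return set()

def mark_processed(session_ids: list[str]) -> None:
    done = load_ledger() | set(session_ids)
    LEDGER.write_text(json.dumps({"processed": sorted(done)}))

def pending(all_sessions: list[str]) -> list[str]:
    """Only sessions not yet in the ledger; re-running is a no-op for the rest."""
    done = load_ledger()
    return [s for s in all_sessions if s not in done]
```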
Hallucination guard (eval-loop). Every consolidator output goes through a second Haiku audit before the wiki is touched. If any factual claim in the generated summary is not traceable to the raw sessions, the chunk goes to ~/.neuromcp/review-queue/ instead of the wiki. No hallucinated claims leak through.
Atomic facts with temporal supersession. After a summary is approved, it is also distilled into short standalone facts and stored as category='fact' rows with valid_from=today. When a new fact is Jaccard-similar to an existing one in the same project, Haiku decides whether NEW supersedes OLD — if yes, the old row gets superseded_by_id and valid_to set. Retrieval defaults to current facts only (superseded_by_id IS NULL), so outdated conclusions never resurface.
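The Jaccard gate that decides when two facts are similar enough to send to the supersession judge can be sketched as follows (the whitespace tokenization and the 0.5 threshold are our assumptions; neuromcp's actual values are not documented here):

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity over lowercase word sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

old = "deploy target is fly.io region ams"
new = "deploy target is fly.io region fra"

# Only similar pairs reach the LLM: ask Haiku "does NEW supersede OLD?"
needs_judgment = jaccard(old, new) >= 0.5
```

Cheap lexical similarity acts as a pre-filter, so the LLM call is only spent on plausible supersession candidates.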
Once the wiki has content, make it searchable so the UserPromptSubmit hook can surface relevant pages automatically (no more "LLM must remember to call search"):
```shell
npx neuromcp-index-wiki              # index wiki pages into memories_fts + memories_vec
npx neuromcp-index-wiki --rebuild    # wipe wiki entries first, then reindex
npx neuromcp-index-wiki --dry-run    # preview what would change
npx neuromcp-index-wiki --no-embed   # FTS-only mode (no embedding provider needed)
npx neuromcp-backfill-embeddings     # embed any memory still missing a vector
```
The indexer splits each page on ## section headers and stores every section as a deduplicated memory (source='wiki', category='wiki'). Each section is both written to the FTS5 index and embedded via the configured provider (Ollama → OpenAI → ONNX) so vector search works too.
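The section-splitting step can be sketched as a small parser; this is an illustrative version, not the indexer's actual code (deduplication and keying omitted):

```python
def split_sections(page: str) -> list[tuple[str, str]]:
    """Split a Markdown page on '## ' headers into (title, body) chunks."""
    sections, title, lines = [], "(intro)", []
    for line in page.splitlines():
        if line.startswith("## "):
            if lines:
                sections.append((title, "\n".join(lines).strip()))
            title, lines = line[3:].strip(), []
        else:
            lines.append(line)
    if lines:
        sections.append((title, "\n".join(lines).strip()))
    return sections

page = "# My Project\nIntro text.\n## Auth\nUses OAuth.\n## Deploy\nfly.io."
split_sections(page)
# [('(intro)', '# My Project\nIntro text.'), ('Auth', 'Uses OAuth.'), ('Deploy', 'fly.io.')]
```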
At prompt time the neuromcp-auto-retrieve.js hook calls neuromcp-query, which runs FTS5 BM25 and sqlite-vec cosine search in parallel and fuses the rankings via Reciprocal Rank Fusion (k=60). The top-3 merged results are injected as <neuromcp-recall> context.
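Reciprocal Rank Fusion itself is compact: each ranked list contributes 1/(k + rank) per document, with k = 60, and the fused score is the sum over lists. A minimal sketch (the document IDs are made up):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse several ranked lists: score(d) = sum over lists of 1/(k + rank_i(d))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25 = ["auth", "deploy", "intro"]   # FTS5 BM25 ranking
vec  = ["auth", "intro", "faq"]      # sqlite-vec cosine ranking
rrf([bm25, vec])[:3]                 # top-3 merged results
```

Documents ranked well by both retrievers dominate, while a strong showing in a single list still earns a place, which is why RRF tolerates one noisy retriever.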
The hook is installed automatically by neuromcp-init-wiki and registered under UserPromptSubmit in Claude Code's settings.json. Re-run the indexer after large wiki updates (or schedule it — it's idempotent).
Tuning:
| Env var | Default | Purpose |
|---|---|---|
| NEUROMCP_BM25_THRESHOLD | -1.0 | Stricter (more negative) = fewer weak keyword matches |
| NEUROMCP_QUERY_BIN | auto-detect | Override the neuromcp-query binary path |
| NEUROMCP_NO_EMBED | 0 | Set to 1 to force FTS-only indexing |
| NEUROMCP_CONTRADICTION_CHECK | 1 | Set to 0 to skip Haiku supersession judgments |
| NEUROMCP_AUDIT_FAIL_OPEN | 0 | Set to 1 to bypass the consolidator audit on infrastructure failure (default is fail-CLOSED) |
memories_vec does not reclaim space after DELETE — sqlite-vec #54 / #265. When you re-index after editing wiki sections, the old vector rows are marked deleted but their storage stays. The database file grows monotonically until you run npx neuromcp-index-wiki --rebuild, which drops and re-creates the vector rows. Run a rebuild every few weeks if you edit the wiki heavily.
claude CLI streaming hangs from non-TTY subprocesses on macOS — if you script interactions with claude -p from another process (e.g. scheduled jobs), pipe it through script -q /dev/null to allocate a pseudo-TTY. Without that the stdout buffer never flushes. We work around this inside the consolidator where needed.
Namespaces isolate memories by project, agent, or domain.
Trust levels (high, medium, low, unverified) rank search results and control decay resistance.
Soft delete tombstones memories — recoverable for 30 days.
Content hashing (SHA-256) deduplicates at write time.
Lineage tracking records source, project ID, and agent ID per memory.
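Write-time dedup by content hash can be sketched as follows; the strip() normalization and in-memory store are our assumptions, not neuromcp's storage layer:

```python
import hashlib

def content_hash(text: str) -> str:
    """SHA-256 over normalized content; identical text yields an identical key."""
    return hashlib.sha256(text.strip().encode("utf-8")).hexdigest()

store: dict[str, str] = {}

def store_once(text: str) -> bool:
    """Return True if stored, False if the content was already present."""
    h = content_hash(text)
    if h in store:
        return False
    store[h] = text
    return True

store_once("user prefers dark mode")   # True
store_once("user prefers dark mode")   # False, deduplicated at write time
```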
All via environment variables. Defaults work for most setups.
| Variable | Default | Description |
|---|---|---|
| NEUROMCP_DB_PATH | ~/.neuromcp/memory.db | Database file path |
| NEUROMCP_EMBEDDING_PROVIDER | auto | auto, onnx, ollama, openai |
| NEUROMCP_DEFAULT_NAMESPACE | default | Default namespace |
| NEUROMCP_AUTO_CONSOLIDATE | false | Enable periodic consolidation |
| NEUROMCP_TOMBSTONE_TTL_DAYS | 30 | Days before permanent sweep |
| NEUROMCP_LOG_LEVEL | info | debug, info, warn, error |
Session hooks automatically extract high-signal events — no manual store_memory calls needed:
| Detected | Category | How |
|---|---|---|
| CronCreate / ScheduleWakeup calls | intent | Regex on transcript |
| "Remember this" / "Onthoud dit" | decision | Pattern matching |
| Domain monitoring (whois checks) | intent | Command detection |
| Key decisions ("we decided...") | decision | Language patterns |
| Deployments (npm publish, etc.) | event | Command detection |
Auto-captured memories now go through the full store pipeline: dedup, contradiction detection, embeddings, entity extraction, and claims — via HTTP endpoint (POST /api/store). Falls back to raw SQL when HTTP is unavailable.
Contradiction resolution now has three tiers (supersede, coexist, flag), plus a contradicts edge in the knowledge graph.

Every search_memory result includes an explain field:
```json
{
  "explain": {
    "source_trust": { "level": "high", "reason": "Directly provided by user" },
    "temporal_validity": { "currently_valid": true, "superseded_by": null },
    "contradictions": [{ "memory_id": "abc", "content_preview": "...", "resolution": "coexist" }],
    "claims": [{ "subject": "neuromcp", "predicate": "version", "object": "0.9.2" }],
    "confidence": { "retrieval_score": 0.016, "source_trust_score": 1.0, "overall": 0.85 }
  }
}
```
We publish all of this — schema versions, consolidation math, critic output, benchmark numbers with CIs — so you can audit exactly what the system remembers and how. If another local-first system publishes the same or better, links are welcome.
| Feature | neuromcp | Hindsight | Mem0 | Letta/MemGPT | agentmemory |
|---|---|---|---|---|---|
| LongMemEval R@5 (oracle) | 99.8% | — | — | — | — |
| LongMemEval R@5 (1000 distractors, n=5, Ollama) | 100% (preliminary, CI [57%, 100%]) | not published | not published | not published | not published |
| Search | Hybrid (vector + FTS + RRF + graph) | Vector + rerank | Vector | Vector | Vector |
| Auto-capture | Deterministic (no LLM cost) | LLM extraction | No | Agent self-edit | Yes |
| Explain mode | Yes (trust, contradictions, claims) | No | No | No | No |
| Knowledge graph | Entities, relations, PageRank | Entities + beliefs | No | No | No |
| Contradiction detection | 3-tier (supersede/coexist/flag) + graph edges | Belief updating | No | No | No |
| Temporal validity | valid_from/valid_to on memories + relations | Yes | No | No | No |
| Wiki knowledge base | Compiled Markdown + Git | No | No | Tiered blocks | No |
| Local-first | SQLite, zero cloud | SQLite | Cloud / Postgres | Server | Local |
| Embeddings | Built-in ONNX (zero config) + Ollama | External | External API | External | External |
| Governance | Namespaces, trust levels, soft delete | Namespaces | API keys | Agent-scoped | Cross-agent |
| Infrastructure | Zero | Zero | Cloud account | Server | Zero |
| Pricing | Free (MIT) | Free (MIT) | Freemium ($23.9M funded) | Free ($10M funded) | Free (Apache-2.0) |
MIT