Server data from the Official MCP Registry
Local-first agentic memory for MCP agents - 25 tools, hybrid search, GDPR, no cloud.
Valid MCP server (2 strong and 1 medium validity signals). No known CVEs in dependencies. Package registry verified. Imported from the Official MCP Registry.
8 files analyzed · 1 issue found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-skynetcmd-m3-memory": {
"args": [
"m3-memory"
],
"command": "uvx"
}
}
}

From the project's GitHub README.
M3 Memory
Persistent, local memory for MCP agents.
"Wait, you remember that?" โ Stop re-explaining your project to your AI. Give it a long-term brain that stays 100% on your machine.
New to M3? Start here with our 5-minute "Human-First" guide.
Works with Claude Code, Gemini CLI, Aider, OpenCode, and any MCP-compatible agent. One quick command has your agent install the chat-log subsystem, which saves verbatim chat logs before compaction with zero added latency and 100% retrieval recall. Just tell your AI agent "install m3-memory chat log sub-system" and it will install everything with the proper hooks, asking you a few minimal customization questions (you can accept the defaults).
pip install m3-memory
That's it. On the first mcp-memory run, the CLI auto-fetches the system
payload into ~/.m3-memory/repo/ (pinned to the wheel version), with an
interactive confirmation prompt when you run it in a terminal and silently
when an MCP client launches the server. Set M3_AUTO_INSTALL=0 to skip the
auto-fetch and run mcp-memory install-m3 explicitly, or mcp-memory doctor
to verify paths.
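For example, to keep the fetch explicit (the commands and variable below are the ones named in the paragraph above):

# skip the automatic payload fetch for this run
M3_AUTO_INSTALL=0 mcp-memory

# fetch the payload explicitly, then verify paths
mcp-memory install-m3
mcp-memory doctor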
Add to your MCP config:
{
"mcpServers": {
"memory": { "command": "mcp-memory" }
}
}
Requires a local embedding model. Ollama is probably the easiest:
ollama pull qwen3-embedding:0.6b && ollama serve
I personally use LM Studio for its support for Apple's MLX, but it's your preference.
Qwen3-Embedding-0.6B (1024-dim, Q8 quantized, ~639 MB) is the model M3 Memory is tuned for, but you can use other embedders as you wish. For example, nomic-embed-text (768-dim) also works with minimal fidelity loss: set EMBED_MODEL=nomic-embed-text in your environment.
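If your MCP client's config supports a per-server env map (a common MCP client feature, but check your client's docs), you can set the variable there instead of shell-wide - a minimal sketch:

{
  "mcpServers": {
    "memory": {
      "command": "mcp-memory",
      "env": { "EMBED_MODEL": "nomic-embed-text" }
    }
  }
}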
As mentioned, you can use Ollama or LM Studio: load an embedding model and start its server.
Want auto-classification, summarization, and consolidation? Load a small chat model alongside the embedder (e.g. qwen2.5:0.5b via Ollama, or any 0.5-1B instruct GGUF in LM Studio / llama.cpp). M3 auto-selects it; embedding-only features work without it. See docs/QUICKSTART.md, "Optional: load a small chat model."
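With Ollama, pulling the example chat model named above is one command:

ollama pull qwen2.5:0.5b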
Restart your agent. Done!
You're at a coffee shop on your MacBook, asking Claude to debug a deployment issue. It remembers the architecture decisions you made last week, the server configs you stored yesterday, and the troubleshooting steps that worked last time - all from local SQLite, no internet required.
Later, you're at your Windows desktop at home with Gemini CLI, and it picks up exactly where you left off. Same memories, same context, same knowledge graph. You didn't copy files, didn't export anything, didn't push to someone else's cloud. Your PostgreSQL sync handled everything in the background the moment your laptop hit the local network.
Most AI agents don't persist state between sessions. You re-paste context, re-explain architecture, re-correct mistakes. When facts change, the agent has no mechanism to update what it "knows."
M3 Memory gives agents a structured, persistent memory layer that handles this.
Persistent memory - facts, decisions, preferences survive across sessions. Stored in local SQLite.
Hybrid retrieval - FTS5 keyword matching + semantic vector similarity + MMR diversity re-ranking. Automatic, no tuning required.
Contradiction handling - conflicting facts are automatically superseded. Bitemporal versioning preserves the full history.
Knowledge graph - related memories linked automatically on write. Nine relationship types, 3-hop traversal.
Zero-config local install - pip install m3-memory, one line in your MCP config, done. SQLite stores everything locally: no external databases, no cloud calls, no API costs. Works offline.
Cross-device sync - optional, easy-to-add bidirectional delta sync via PostgreSQL or ChromaDB. Set one environment variable and your memories follow you across machines.
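A sketch of what enabling sync could look like from a shell; the variable name below is hypothetical - consult the Configuration docs for the exact key and connection-string format:

# M3_SYNC_URL is a placeholder name, not a documented variable
M3_SYNC_URL=postgresql://user:pass@homeserver:5432/m3 mcp-memory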
| Getting started | Multi-agent orchestration |
| Core features | Multi-agent example |
| System design | M3 vs alternatives |
| Implementation details | Configuration |
| Agent rules + all 66 tools | Roadmap |
| Good fit | Not the right tool |
|---|---|
| You use Claude Code, Gemini CLI, Aider, or any MCP agent - plus non-MCP clients via the built-in HTTP proxy server | You need LangChain/CrewAI pipeline memory - see Mem0 |
| You're coordinating multiple agents on a shared local store | You need a hosted agent runtime with managed scaling - see Letta |
| You need GDPR primitives, bitemporal state, or pure SQLite | You want state-of-the-art retrieval benchmarks today - see Hindsight |
| You want memory that persists across sessions and devices | You only need in-session chat context |
| Highlight | Details |
|---|---|
| 66 MCP tools | Memory, search, GDPR, refresh lifecycle, plus agent registry, handoffs, notifications, tasks, and chat-log capture for multi-agent orchestration |
| 193 end-to-end tests | Covering write, search, contradiction, sync, GDPR, maintenance, and orchestration paths |
| Explainable retrieval | memory_suggest returns vector, BM25, and MMR scores per result (see the sketch after this table) |
| SQLite core | No external database required. Single-file, portable, inspectable |
| GDPR compliance | gdpr_forget (Article 17) and gdpr_export (Article 20) as built-in tools |
| Self-maintaining | Automatic decay, dedup, orphan pruning, retention enforcement |
| Apache 2.0 licensed | Free. No SaaS tier, no usage limits, no lock-in |
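Illustrative only: the exact response schema lives in docs/AGENT_INSTRUCTIONS.md and every field name below is an assumption, but each memory_suggest result carries the three scores named above (vector similarity, BM25 keyword relevance, MMR diversity):

{
  "results": [
    {
      "id": "mem_123",
      "content": "Deploys go through the staging cluster first",
      "vector_score": 0.83,
      "bm25_score": 6.1,
      "mmr_score": 0.71
    }
  ]
}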
89.0% on LongMemEval-S (445/500 correct) - a 500-question evaluation of long-horizon conversational memory. Without oracle metadata: 74.8% (smart retrieval) to 68.0% (fixed-k baseline).
| Question type | n | Accuracy |
|---|---|---|
| single-session-user | 70 | 91.4% |
| single-session-assistant | 56 | 94.6% |
| single-session-preference | 30 | 93.3% |
| multi-session | 133 | 85.0% |
| temporal-reasoning | 133 | 86.5% |
| knowledge-update | 78 | 92.3% |
| Overall | 500 | 89.0% |
Full methodology, ablations, and honest caveats: benchmarks/longmemeval/README.md.
Most sessions use three tools. The rest is there when you need it.
| Tool | Purpose |
|---|---|
| memory_write | Store a fact, decision, preference, config, or observation |
| memory_search | Retrieve relevant memories (hybrid search) |
| memory_update | Refine existing knowledge |
| memory_suggest | Search with full score breakdown |
| memory_get | Fetch a specific memory by ID |
All 66 tools are documented in docs/AGENT_INSTRUCTIONS.md.
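For orientation, this is roughly the JSON-RPC request an MCP client sends when an agent invokes memory_write; the tools/call envelope is standard MCP, but the arguments object here is a guess - the real parameter schema is in docs/AGENT_INSTRUCTIONS.md:

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "memory_write",
    "arguments": { "content": "We deploy via the staging cluster first" }
  }
}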
M3 Memory exposes 66 MCP tools for storing, searching, updating, and linking knowledge, including conversation grouping, a refresh lifecycle for aging memories, agent registry, handoffs, notifications, tasks, and chat-log capture for multi-agent orchestration. Any MCP-compatible agent can use them automatically.
To teach your agent best practices (search before answering, write aggressively, update instead of duplicating), drop the compact rules file into your project:
examples/AGENT_RULES.md
Full tool reference with all parameters and behaviors: docs/AGENT_INSTRUCTIONS.md
Already inside Claude Code or Gemini CLI? Paste one of these prompts:
Claude Code:
Install m3-memory for persistent memory. Run: pip install m3-memory
Then add {"mcpServers":{"memory":{"command":"mcp-memory"}}} to my
~/.claude/settings.json under "mcpServers". Make sure Ollama is running
with qwen3-embedding:0.6b. Then use /mcp to verify the memory server loaded.
Gemini CLI:
Install m3-memory for persistent memory. Run: pip install m3-memory
Then add {"mcpServers":{"memory":{"command":"mcp-memory"}}} to my
~/.gemini/settings.json under "mcpServers". Make sure Ollama is running
with qwen3-embedding:0.6b.
After install, test it:
Write a memory: "M3 Memory installed successfully on [today's date]"
Then search for: "M3 install"
Want auto-capture of every Claude Code / Gemini CLI / OpenCode / Aider conversation into a searchable, promotable chat log store? Once m3-memory is wired up, just say:
Install the m3-memory chat log subsystem.
The agent runs bin/chatlog_init.py, wires the host-agent hook, and installs the embed sweeper schedule. See docs/CHATLOG.md for the architecture and ops guide.
Contributing · Good first issues