Server data from the Official MCP Registry
Local-first memory system for AI agents with hybrid search and graph reasoning
Valid MCP server (1 strong, 1 medium validity signals). No known CVEs in dependencies. Package registry verified. Imported from the Official MCP Registry.
14 files analyzed · 1 issue found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Add this to your MCP configuration file:
{
  "mcpServers": {
    "io-github-tobs-code-cozo-memory": {
      "args": [
        "-y",
        "cozo-memory"
      ],
      "command": "npx"
    }
  }
}
From the project's GitHub README.
Why Cozo Memory?
LLMs have short-term memory limits. Standard RAG retrieves documents but can't connect facts across time. Cozo Memory gives your AI agent persistent, structured memory: it remembers past conversations, infers relationships, detects contradictions, and explores its knowledge graph, fully on your machine, with optional local LLM integration via Ollama for intelligent actions (cleanup, reflection, summarization, agentic routing).

Most memory stacks combine separate databases: SQLite for facts, Chroma for vector search, NetworkX for graphs. CozoDB replaces all of that with one embedded engine: relational, graph, vector, and full-text search in a single query language, one file, zero sync lag.
Local-first memory for Claude & AI agents with hybrid search, Graph-RAG, and time-travel – runs entirely on your machine. Optional Ollama integration enables LLM-powered actions (cleanup, reflect, summarize, agentic retrieval).
# Install globally
npm install -g cozo-memory
# Or run directly with npx (no installation needed)
npx cozo-memory
git clone https://github.com/tobs-code/cozo-memory
cd cozo-memory
npm install && npm run build
npm run start
Now add the server to your MCP client (e.g. Claude Desktop) – see Integration below.
🔍 Hybrid Search - Combines semantic (HNSW), full-text (FTS), and graph signals via Reciprocal Rank Fusion for intelligent retrieval
🧠 Agentic Retrieval - Auto-routing engine analyzes query intent via local LLM to select optimal search strategy (Vector, Graph, or Community)
⏱️ Time-Travel Queries - Version all changes via CozoDB Validity; query any point in history with full audit trails
🎯 GraphRAG-R1-Inspired Adaptive Retrieval - Intelligent system with Progressive Retrieval Attenuation (PRA) and Cost-Aware F1 (CAF) scoring, conceptually inspired by GraphRAG-R1 (Yu et al., WWW 2026) and adapted for CozoDB, that learns from usage
⏳ Temporal Conflict Resolution - Automatic detection and resolution of contradictory observations with semantic analysis and audit preservation
🏠 100% Local - Embeddings via ONNX/Transformers; data stays on your machine. Some advanced features (cleanup, reflect, summarize, agentic search) require an optional Ollama service for local LLM inference — but the core search, CRUD, and graph operations work without any LLM.
🧠 Multi-Hop Reasoning - Logic-aware graph traversal with vector pivots for deep relational reasoning
🗂️ Hierarchical Memory - Multi-level architecture (L0-L3) with intelligent compression and LLM-backed summarization
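To illustrate the Reciprocal Rank Fusion step named in the Hybrid Search item above, here is a minimal sketch. It is not taken from the project's source: the function name `reciprocalRankFusion`, the input shape (arrays of ids ordered best-first), and the constant `k = 60` (a value commonly used in the RRF literature) are all assumptions made for illustration.

```typescript
// Hypothetical sketch of Reciprocal Rank Fusion (RRF).
// Each input list holds result ids ordered best-first, e.g. one list per
// signal (semantic, full-text, graph). Names and k = 60 are assumptions.
function reciprocalRankFusion(
  lists: string[][],
  k: number = 60
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; ties are broken by id for determinism.
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Example: three signals that partly disagree on ordering.
const fused = reciprocalRankFusion([
  ["a", "b", "c"], // semantic ranking
  ["b", "a", "d"], // full-text ranking
  ["b", "c"],      // graph ranking
]);
console.log(fused.map((r) => r.id));
```

An item that ranks well in several lists accumulates more score than one that tops a single list, which is why RRF needs no score normalization across the different signals.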
A common first question is: "Why not just combine existing tools?"
| If you need... | Typical separate stack | CozoDB Memory |
|---|---|---|
| Structured data & relations | SQLite / PostgreSQL | ✅ Built-in relational engine |
| Semantic / vector search | Chroma / Qdrant / Pinecone | ✅ HNSW + FTS + RRF in one engine |
| Graph traversal & reasoning | NetworkX / Neo4j | ✅ Native graph queries + PageRank |
| Time-travel / versioning | Custom audit tables | ✅ Built-in Validity time-travel |
| Unified query language | Multiple APIs + glue code | ✅ Single Datalog query across all dimensions |
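The "Validity time-travel" row above can be pictured with a small in-memory model. This is a sketch only, and it does not use CozoDB or its query language: the `VersionedFact` type and the `asOf` function are hypothetical. It assumes each stored row carries a timestamp and an assert/retract flag, and that a query "as of" time `t` sees the most recent row at or before `t`.

```typescript
// Hypothetical in-memory illustration of validity-style versioning.
// This is NOT CozoDB code; all names here are assumptions.
interface VersionedFact {
  key: string;
  value: string;
  validFrom: number; // timestamp at which this row takes effect
  asserted: boolean; // false marks a retraction
}

// Return the value visible for `key` at time `at`, or undefined if the
// fact did not exist yet or had been retracted by then.
function asOf(
  history: VersionedFact[],
  key: string,
  at: number
): string | undefined {
  let latest: VersionedFact | undefined;
  for (const row of history) {
    if (row.key !== key || row.validFrom > at) continue;
    if (latest === undefined || row.validFrom >= latest.validFrom) {
      latest = row;
    }
  }
  return latest !== undefined && latest.asserted ? latest.value : undefined;
}

const employerHistory: VersionedFact[] = [
  { key: "alice.employer", value: "Acme", validFrom: 100, asserted: true },
  { key: "alice.employer", value: "Globex", validFrom: 200, asserted: true },
  { key: "alice.employer", value: "Globex", validFrom: 300, asserted: false },
];
console.log(asOf(employerHistory, "alice.employer", 150));
```

Because old rows are never overwritten, every earlier state stays queryable, which is what makes an audit trail possible without separate audit tables.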
The core insight: Most memory stacks bolt vector search onto a graph DB, or graph search onto a vector DB. CozoDB is different: it is a single engine that natively combines relational, graph, vector, and full-text search. That means:
Most "Memory" MCP servers fall into two categories:
This server fills the gap in between ("Sweet Spot"): A local, database-backed memory engine combining vector, graph, and keyword signals — powered by CozoDB's unified engine rather than a patchwork of separate databases.
| Feature | CozoDB Memory (This Project) | Official Reference (@modelcontextprotocol/server-memory) | mcp-memory-service (Community) | Database Adapters (Qdrant/Neo4j) |
|---|---|---|---|---|
| Backend | CozoDB (Graph + Vector + Relational + FTS in one engine) | JSON file (memory.jsonl) | SQLite / Cloudflare | Specialized DB (only Vector or Graph) |
| Search Logic | Agentic (Auto-Route): Hybrid + Graph + Summaries | Keyword only / Exact Graph Match | Vector + Keyword | Mostly only one dimension |
| Inference | Yes: Built-in engine for implicit knowledge | No | No ("Dreaming" is consolidation) | No (Retrieval only) |
| Community | Yes: Hierarchical Community Summaries | No | No | Only clustering (no summary) |
| Time-Travel | Yes: Queries at any point in time (Validity) | No (current state only) | History available, no native DB feature | No |
| Maintenance | Janitor: LLM-backed cleanup | Manual | Automatic consolidation | Mostly manual |
| Deployment | Local (Node.js + Embedded DB) | Local (Docker/NPX) | Local or Cloud | Often requires external DB server |
The core advantage is Intelligence and Traceability: By combining an Agentic Retrieval Layer with Hierarchical GraphRAG, the system can answer both specific factual questions and broad thematic queries with much higher accuracy than pure vector stores.
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2 – only ~400 MB RAM needed (see Embedding Model Options)
cozo-node

Some advanced actions use a local LLM via Ollama for intelligent processing. The core server works without Ollama (CRUD, search, graph operations), but the following actions require it:
| Action | Purpose |
|---|---|
| cleanup | LLM-backed observation consolidation |
| reflect | Generate insights, detect contradictions |
| summarize_communities | LLM-generated community summaries |
| compact | Session / entity compaction with LLM summarization |
| agentic_search | Query intent classification for auto-routing |
Setup (if you need these features):
# 1. Install Ollama from https://ollama.ai
# 2. Pull a model (e.g. small + fast for dev):
ollama pull demyagent-4b-i1:Q6_K
# 3. Ollama runs automatically on http://localhost:11434
If Ollama is not running, the affected actions gracefully fall back to non-LLM behavior (where possible) or return a clear error message.
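The fallback behavior described above could be approximated as follows. This is a sketch under stated assumptions, not the project's implementation: `isOllamaUp`, `withLlmFallback`, and the `FetchLike` type are hypothetical names. The probe uses Ollama's `/api/tags` endpoint (which lists locally available models) on the default address mentioned above, and the HTTP function is injected so the logic can be exercised without a running service.

```typescript
// Hypothetical sketch: probe Ollama, then choose the LLM or non-LLM path.
// Only the `ok` flag of the response is used by this sketch.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

async function isOllamaUp(
  fetchFn: FetchLike,
  baseUrl: string = "http://localhost:11434"
): Promise<boolean> {
  try {
    const res = await fetchFn(`${baseUrl}/api/tags`);
    return res.ok;
  } catch {
    // Connection refused, DNS failure, etc.: treat as "not running".
    return false;
  }
}

// Run the LLM-backed path when Ollama answers, otherwise the fallback.
async function withLlmFallback<T>(
  fetchFn: FetchLike,
  llmPath: () => Promise<T>,
  fallbackPath: () => T
): Promise<T> {
  return (await isOllamaUp(fetchFn)) ? llmPath() : fallbackPath();
}
```

In a runtime that provides a global `fetch`, it can be passed as `(url) => fetch(url)`; in tests, a stub that resolves or rejects stands in for the network.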
# Install globally
npm install -g cozo-memory
# Or use npx without installation
npx cozo-memory
git clone https://github.com/tobs-code/cozo-memory
cd cozo-memory
npm install
npm run build
npm run start
Notes:
@xenova/transformers downloads the embedding model (may take time)

CozoDB Memory supports multiple embedding models via the EMBEDDING_MODEL environment variable:
| Model | Size | RAM | Dimensions | Best For |
|---|---|---|---|---|
| Xenova/bge-m3 (default) | ~600 MB | ~1.7 GB | 1024 | High accuracy, production use |
| Xenova/all-MiniLM-L6-v2 | ~80 MB | ~400 MB | 384 | Low-spec machines, development |
| Xenova/bge-small-en-v1.5 | ~130 MB | ~600 MB | 384 | Balanced performance |
Configuration Options:
Option 1: Using .env file (Easiest for beginners)
# Copy the example file
cp .env.example .env
# Edit .env and set your preferred model
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2
Option 2: MCP Server Config (For Claude Desktop / Kiro)
{
  "mcpServers": {
    "cozo-memory": {
      "command": "npx",
      "args": ["cozo-memory"],
      "env": {
        "EMBEDDING_MODEL": "Xenova/all-MiniLM-L6-v2"
      }
    }
  }
}
Option 3: Command Line
# Use lightweight model for development
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2 npm run start
Download Model First (Recommended):
# Set model in .env or via command line, then:
EMBEDDING_MODEL=Xenova/all-MiniLM-L6-v2 npm run download-model
Note: Changing models requires re-embedding existing data. The model is downloaded once on first use.
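The note above can be made concrete with a small check. This is a sketch, not project code: `MODELS`, `resolveModel`, and `needsReembedding` are hypothetical names, and the figures are copied from the table above. It assumes the server records which model produced the stored vectors, and it compares model names rather than dimensions because two models with the same dimension (both 384 here) still produce incompatible vector spaces.

```typescript
// Hypothetical sketch: decide whether stored vectors must be re-embedded.
// Figures are taken from the model table; all names are assumptions.
interface ModelInfo {
  dimensions: number;
  approxRamMb: number;
}

const MODELS: Record<string, ModelInfo> = {
  "Xenova/bge-m3": { dimensions: 1024, approxRamMb: 1700 },
  "Xenova/all-MiniLM-L6-v2": { dimensions: 384, approxRamMb: 400 },
  "Xenova/bge-small-en-v1.5": { dimensions: 384, approxRamMb: 600 },
};

const DEFAULT_MODEL = "Xenova/bge-m3";

// Resolve the EMBEDDING_MODEL value, falling back to the documented default.
function resolveModel(envValue: string | undefined): string {
  const trimmed = (envValue ?? "").trim();
  return trimmed !== "" ? trimmed : DEFAULT_MODEL;
}

// True when the configured model differs from the one that produced the
// stored vectors, even if both models share the same dimension.
function needsReembedding(
  storedModel: string,
  envValue: string | undefined
): boolean {
  return resolveModel(envValue) !== storedModel;
}

const active = resolveModel(undefined);
console.log(active, MODELS[active]?.dimensions);
```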
{
  "mcpServers": {
    "cozo-memory": {
      "command": "npx",
      "args": ["cozo-memory"]
    }
  }
}

{
  "mcpServers": {
    "cozo-memory": {
      "command": "cozo-memory"
    }
  }
}

{
  "mcpServers": {
    "cozo-memory": {
      "command": "node",
      "args": ["C:/Path/to/cozo-memory/dist/index.js"]
    }
  }
}
Official adapters for seamless integration with popular AI frameworks:
🦜 LangChain Adapter
npm install @cozo-memory/langchain @cozo-memory/adapters-core
import { CozoMemoryChatHistory, CozoMemoryRetriever } from '@cozo-memory/langchain';
const chatHistory = new CozoMemoryChatHistory({ sessionName: 'user-123' });
const retriever = new CozoMemoryRetriever({ useGraphRAG: true, graphRAGDepth: 2 });
🦙 LlamaIndex Adapter
npm install @cozo-memory/llamaindex @cozo-memory/adapters-core
import { CozoVectorStore } from '@cozo-memory/llamaindex';
const vectorStore = new CozoVectorStore({ useGraphRAG: true });
Documentation: See adapters/README.md for complete examples and API reference.
Full-featured CLI for all operations:
# System operations
cozo-memory system health
cozo-memory system metrics
# Entity operations
cozo-memory entity create -n "MyEntity" -t "person"
cozo-memory entity get -i <entity-id>
# Search
cozo-memory search query -q "search term" -l 10
cozo-memory search agentic -q "agentic query"
# Graph operations
cozo-memory graph pagerank
cozo-memory graph communities
# Export/Import
cozo-memory export json -o backup.json
cozo-memory import file -i data.json -f cozo
# All commands support -f json or -f pretty for output formatting
See CLI help for complete command reference:
cozo-memory --help
Interactive TUI with mouse support powered by Python Textual:
# Install Python dependencies (one-time)
pip install textual
# Launch TUI
npm run tui
# or directly:
cozo-memory-tui
TUI Features:
graph TB
Client[MCP Client<br/>Claude Desktop, etc.]
Server[MCP Server<br/>FastMCP + Zod Schemas]
Services[Memory Services]
Embeddings[Embeddings<br/>ONNX Runtime]
Search[Hybrid Search<br/>RRF Fusion]
Cache[Semantic Cache<br/>L1 + L2]
Inference[Inference Engine<br/>Multi-Strategy]
DB[(CozoDB SQLite<br/>Relations + Validity<br/>HNSW Indices<br/>Datalog/Graph)]
Client -->|stdio| Server
Server --> Services
Services --> Embeddings
Services --> Search
Services --> Cache
Services --> Inference
Services --> DB
style Client fill:#e1f5ff,color:#000
style Server fill:#fff4e1,color:#000
style Services fill:#f0e1ff,color:#000
style DB fill:#e1ffe1,color:#000
See docs/ARCHITECTURE.md for detailed architecture documentation
The interface is reduced to 5 consolidated tools:
| Tool | Purpose | Key Actions |
|---|---|---|
| mutate_memory | Write operations | create_entity, update_entity, delete_entity, add_observation, create_relation, transactions, sessions, tasks, update_observation, batch_delete, manage_tags, batch |
| query_memory | Read operations | search, advancedSearch, context, graph_rag, graph_walking, agentic_search, adaptive_retrieval, list_entities, get_entity_detail, get_session_context, list_sessions |
| analyze_graph | Graph analysis | explore, communities, pagerank, betweenness, hits, shortest_path, semantic_walk |
| manage_system | Maintenance | health, metrics, stats, export, import, cleanup, defrag, reflect, snapshots |
| edit_user_profile | User preferences | Edit global user profile with preferences and work style |
See docs/API.md for complete API reference with all parameters and examples
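As a rough picture of how a client might address one of these consolidated tools, the sketch below builds an MCP `tools/call` JSON-RPC request. The envelope (`jsonrpc`, `method: "tools/call"`, `params.name`, `params.arguments`) follows the MCP protocol; the shape of `arguments` is an assumption, including the `action` selector and the `name` and `type` fields, so docs/API.md remains the authority on the real parameters. `buildToolCall` is a hypothetical helper.

```typescript
// Hypothetical sketch: an MCP "tools/call" request for a consolidated tool.
// The argument fields (action, name, type) are assumptions, not the
// server's documented schema.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  tool: string,
  action: string,
  payload: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    // `action` is written last so the payload cannot override it.
    params: { name: tool, arguments: { ...payload, action } },
  };
}

const request = buildToolCall(1, "mutate_memory", "create_entity", {
  name: "MyEntity",
  type: "person",
});
console.log(JSON.stringify(request, null, 2));
```

Most MCP clients build this envelope themselves; the sketch only shows where the tool name and the chosen action would sit in the message.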
First Start Takes Long
LLM-powered actions require Ollama
Affected actions: cleanup, reflect, summarize_communities, compact, agentic_search
Pull a model first: ollama pull demyagent-4b-i1:Q6_K (or your preferred model)
Windows-Specific
Performance Issues
Use the health action to check cache hit rates.
See docs/BENCHMARKS.md for performance optimization tips
src/index.ts: MCP Server + Tool Registration
src/memory-service.ts: Core business logic
src/db-service.ts: Database operations
src/embedding-service.ts: Embedding Pipeline + Cache
src/hybrid-search.ts: Search Strategies + RRF
src/inference-engine.ts: Inference Strategies
src/api_bridge.ts: Express API Bridge (optional)
npm run build # TypeScript Build
npm run dev # ts-node Start of MCP Server
npm run start # Starts dist/index.js (stdio)
npm run bridge # Build + Start of API Bridge
npm run benchmark # Runs performance tests
npm run eval # Runs evaluation suite
Contributions are welcome! Please see CONTRIBUTING.md for guidelines.
Apache 2.0 - See LICENSE for details.
Built with:
Research foundations: