Server data from the Official MCP Registry
Persistent memory for Claude Code and Cursor. Stop re-explaining your project every session.
Remote endpoints: streamable-http: https://sumapro.quadframe.work/mcp
Valid MCP server (1 strong, 0 medium validity signals). 2 known CVEs in dependencies (0 critical, 2 high severity). ⚠️ Package registry links to a different repository than scanned source. Imported from the Official MCP Registry.
2 files analyzed · 3 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Set these up before or after installing:
Environment variable: SUMA_API_KEY
Available as Local & Remote
This plugin can run on your machine or connect to a hosted endpoint.
From the project's GitHub README.
Stop re-explaining your project to Claude every time you start a new chat.
Your repos now have permanent memory. SUMA gives any MCP-compatible AI client (Claude Code, Cursor, Devin) a persistent knowledge graph that remembers architectural decisions, bug root causes, and project rules — across sessions, across machines, across your entire team.
Get an API key at sumapro.quadframe.work — free tier available.
Add to your .mcp.json:
{
  "mcpServers": {
    "suma-memory": {
      "url": "https://sumapro.quadframe.work/mcp",
      "headers": {
        "Authorization": "Bearer sk_live_your_key_here"
      }
    }
  }
}
That's it. No local server. No Docker. No npm install. SUMA runs on Cloud Run — stateless, auto-scaled, always available.
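If you would rather not paste the key into the file by hand, the configuration above can be generated from the SUMA_API_KEY environment variable. This is a minimal sketch, not part of SUMA: the helper name `build_mcp_config` is hypothetical, and the placeholder fallback only mirrors the example key shown above.

```python
import json
import os


def build_mcp_config(api_key: str) -> dict:
    """Return the .mcp.json structure from the README with the given key."""
    return {
        "mcpServers": {
            "suma-memory": {
                "url": "https://sumapro.quadframe.work/mcp",
                "headers": {"Authorization": f"Bearer {api_key}"},
            }
        }
    }


if __name__ == "__main__":
    # SUMA_API_KEY is the variable named in the listing; the fallback
    # placeholder is illustrative and will not authenticate.
    key = os.environ.get("SUMA_API_KEY", "sk_live_your_key_here")
    print(json.dumps(build_mcp_config(key), indent=2))
```

Redirect the output to .mcp.json in the repo root, and keep that file out of version control if it contains a live key.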
After installing, run this once per repo to seed your permanent context:
suma_ingest(text=(
    "Project: [name]. Framework: [Next.js / Flask / etc]. "
    "Auth lives in: [path/to/auth.py]. Database: [PostgreSQL / SQLite / etc]. "
    "Rules never to break: [e.g. never store plaintext keys, all routes require org_id filter]. "
    "Deployment target: [Cloud Run / Vercel / etc]."
))
From this point forward, every new session inherits this context. You never explain it again.
SUMA stores knowledge in a weighted graph. Every node has a gravity score across four dimensions:
When you call suma_search, the K-WIL gravity algorithm traverses the graph and returns the highest-relevance context — not a flat list of chunks, not a raw embedding match, but the facts that actually matter for what you're doing right now.
| Tool | What it does |
|---|---|
| suma_ping | Health check: verify connection and API key |
| suma_ingest | Add knowledge to the graph (architecture decisions, bug fixes, rules) |
| suma_search | Retrieve relevant context by natural language query |
| suma_talk | Search + learn in one call: retrieves context and updates graph |
| suma_correct | Fix wrong information: supersedes original, queues replacement |
| suma_clean | Remove noise nodes that pollute search results |
# After finalizing a decision:
suma_ingest(text=(
    "We chose REST over GraphQL. Root cause: GraphQL N+1 queries "
    "caused 3x latency on /search. Architect ruling Apr 10 2026."
))

# Next session, cold start: full context in one call.
suma_search(query="why did we switch to REST?")
# → Returns ruling with full context. No re-explaining.

# After fixing a hard bug:
suma_ingest(text=(
    "Cloud Run WebSocket bug: asyncio.run() in daemon thread killed "
    "by Cloud Run recycling. Fix: use asyncio.get_event_loop() instead. "
    "Never use asyncio.run() in long-lived Cloud Run services."
))

# Six months later, same error:
suma_search(query="asyncio cloud run daemon thread crash")
# → Root cause retrieved instantly. Hours saved.
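In practice the AI client issues these calls for you. For readers curious about what goes over the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below only builds such a request body and does not send it. The helper name `build_tool_call` is hypothetical, and the argument name `query` is taken from the example above; the server's actual tool schema may differ.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request body for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument name `query` is assumed from the README example.
payload = build_tool_call("suma_search", {"query": "why did we switch to REST?"})
print(json.dumps(payload, indent=2))
```

A real client also performs the MCP initialization handshake and sends the Authorization header before calling any tool, which this sketch leaves out.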
Architect, developer, and QA agents each write to SUMA using their own sessions. Their knowledge merges into one shared org graph. When QA asks "what did the architect decide about auth?", it retrieves the architect's ruling — zero explicit handoff required.
Anti-flood protection: Each source machine is rate-limited to 5 ingests per 60 seconds. Runaway agent loops are broken gracefully — the 6th request returns {"status": "throttled"} without crashing or corrupting the graph.
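One common way to get the behavior described above is a sliding-window limiter keyed by source. The following is a sketch under stated assumptions, not SUMA's implementation: the class name `IngestThrottle`, the `{"status": "ok"}` success response, and the injectable clock are all hypothetical; only the 5-per-60-seconds limit and the `{"status": "throttled"}` response come from the README.

```python
import time
from collections import defaultdict, deque


class IngestThrottle:
    """Sliding window: at most `limit` accepted ingests per `window` seconds per source."""

    def __init__(self, limit=5, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)  # source id -> timestamps of accepted ingests

    def check(self, source_id):
        now = self.clock()
        hits = self._hits[source_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            # Reject without raising, so a runaway loop gets a clean response.
            return {"status": "throttled"}
        hits.append(now)
        return {"status": "ok"}  # success shape is an assumption
```

Rejected requests are not recorded, so a looping agent cannot extend its own lockout by continuing to retry.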
Multi-tenant isolation: Every node is scoped to org_id at the database layer. Two organizations on the same Cloud Run instance cannot access each other's data — enforced by SQL, not application logic.
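To illustrate what scoping by org_id in SQL can look like, here is a small sketch using SQLite in memory. The table layout, column names other than org_id, and the `search_nodes` helper are hypothetical; SUMA's real schema and database engine are not described in the README.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, org_id TEXT NOT NULL, text TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO nodes (org_id, text) VALUES (?, ?)",
    [("org_a", "Auth lives in auth.py"), ("org_b", "Database is PostgreSQL")],
)


def search_nodes(conn, org_id, term):
    """Return matching node texts; the org_id filter is part of every query."""
    rows = conn.execute(
        "SELECT text FROM nodes WHERE org_id = ? AND text LIKE ? ORDER BY id",
        (org_id, f"%{term}%"),
    ).fetchall()
    return [row[0] for row in rows]
```

Because the filter sits in the WHERE clause with a bound parameter, a query for org_b can never return org_a rows, regardless of the search term.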
Immutable audit trail: suma_correct and suma_clean never delete data. Nodes are superseded and invisible to the API while preserved in storage for compliance.
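A supersede-instead-of-delete pattern like the one described above can be sketched as follows. This is an illustration only: the `superseded_by` column and the `ingest`, `correct`, and `visible` helpers are hypothetical, and the sketch writes the replacement immediately rather than queuing it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE nodes (
        id INTEGER PRIMARY KEY,
        text TEXT NOT NULL,
        superseded_by INTEGER REFERENCES nodes(id)
    )"""
)


def ingest(conn, text):
    """Insert a new node and return its id."""
    return conn.execute("INSERT INTO nodes (text) VALUES (?)", (text,)).lastrowid


def correct(conn, old_id, new_text):
    """Add a replacement node and mark the old one superseded; nothing is deleted."""
    new_id = ingest(conn, new_text)
    conn.execute("UPDATE nodes SET superseded_by = ? WHERE id = ?", (new_id, old_id))
    return new_id


def visible(conn):
    """Texts the API would return: only nodes that have not been superseded."""
    rows = conn.execute(
        "SELECT text FROM nodes WHERE superseded_by IS NULL ORDER BY id"
    ).fetchall()
    return [row[0] for row in rows]
```

The old row stays in the table with a pointer to its replacement, so an auditor can reconstruct the history while normal reads see only the current fact.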
| Metric | Value |
|---|---|
| Compression ratio | 94.7% — 801 nodes replace 15.2M tokens |
| Cost saved per org | $14.47 across 538 queries |
| K-WIL fidelity | 96.3% — 26/27 facts recoverable from 5-node graph |
| Automated tests | 118 (102 Playwright E2E + 16 pytest) |
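A few of the figures in the table can be cross-checked with simple arithmetic. The snippet below uses only numbers from the table; the per-query cost is a derived value, not one the README states.

```python
# K-WIL fidelity: 26 of 27 facts recovered.
fidelity_pct = round(26 / 27 * 100, 1)

# Automated tests: Playwright E2E plus pytest.
total_tests = 102 + 16

# Derived, not stated in the README: average saving per query.
saved_per_query = round(14.47 / 538, 4)

print(fidelity_pct, total_tests, saved_per_query)
```

The fidelity percentage and the test total match the table; the compression ratio cannot be checked the same way because the README does not give a token count per node.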
| Plan | Queries/month | Price |
|---|---|---|
| Starter | 20,000 | Free |
| Developer | 100,000 | $4.99/mo |
| Team | 500,000 | $29/mo |
| Enterprise | Unlimited | Contact |
Get your key: sumapro.quadframe.work
© 2025–2026 Suman Addanke / A2 Vibe Creators LLC
US Patent applications pending — 6 filed (2025–2026). Unauthorized commercial use prohibited.