Server data from the Official MCP Registry
Bounded working memory for coding agents — typed nodes, current-vs-stale state, MCP-mountable.
Set these up before or after installing:
Environment variable: STATE_TRACE_STORAGE_PATH
Environment variable: STATE_TRACE_NAMESPACE
Environment variable: STATE_TRACE_CAPACITY_LIMIT
Add this to your MCP configuration file:
```json
{
  "mcpServers": {
    "io-github-razroo-state-trace": {
      "env": {
        "STATE_TRACE_NAMESPACE": "your-state-trace-namespace-here",
        "STATE_TRACE_STORAGE_PATH": "your-state-trace-storage-path-here",
        "STATE_TRACE_CAPACITY_LIMIT": "your-state-trace-capacity-limit-here"
      },
      "args": ["state-trace"],
      "command": "uvx"
    }
  }
}
```

From the project's GitHub README:
Graph-native working memory for coding agents: typed memories, causal retrieval, bounded capacity, and compact briefs for small models.
state-trace is a bounded working-memory layer for coding and debugging agents that need the right file, failure, and next action under tight token budgets. It is not a replacement for a general-purpose temporal knowledge graph like Graphiti — see ARCHITECTURE.md for the honest comparison.
What it is optimized for: first-class session-state queries such as `engine.current_state()` and `engine.failed_hypotheses()`.

The credibility benchmark. Cold-start artifact localization on the full SWE-bench-Verified test split: given only the GitHub issue text and hints (no trajectory), rank the correct patch file at 1 and at 5.
```shell
pip install -e ".[bench]"
python3 examples/swebench_verified_eval.py --limit 500 --backends no_memory bm25 state_trace graphiti
```
| backend | n | Artifact@1 | Artifact@1 CI | Artifact@5 | Artifact@5 CI | AvgLatencyMs |
|---|---|---|---|---|---|---|
| no_memory | 500 | 0.000 | [0.000, 0.000] | 0.000 | [0.000, 0.000] | 0.00 |
| bm25 | 500 | 0.176 | [0.144, 0.208] | 0.300 | [0.262, 0.338] | 0.10 |
| state_trace | 500 | 0.254 | [0.218, 0.290] | 0.376 | [0.336, 0.414] | 15.04 |
| graphiti | 500 | 0.098 | [0.072, 0.126] | 0.216 | [0.182, 0.254] | 4851.46 |
What this says, plainly:
v0.3.0 landed a module-to-path translator in retrieve_brief's lexical fallback: dotted Python module references in issue text (astropy.modeling.separable_matrix) now resolve to file path candidates (astropy/modeling/separable.py), which pushed A@1 from 0.216 → 0.254 on n=500.
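The translator is simple in spirit. A minimal sketch, assuming a plain prefix-expansion strategy (the function name and exact matching logic are illustrative, not the project's implementation, which matches candidates against files actually present in the repo):

```python
def module_to_path_candidates(dotted: str) -> list[str]:
    """Turn a dotted Python module reference into candidate file paths.

    Hypothetical sketch: the real translator in retrieve_brief's lexical
    fallback may rank candidates differently and filter by known files.
    """
    parts = dotted.split(".")
    # Try the full dotted path first, then progressively shorter prefixes,
    # so astropy.modeling.separable_matrix also yields astropy/modeling.py.
    return ["/".join(parts[:i]) + ".py" for i in range(len(parts), 0, -1)]

print(module_to_path_candidates("astropy.modeling.separable_matrix"))
# → ['astropy/modeling/separable_matrix.py', 'astropy/modeling.py', 'astropy.py']
```

The shorter prefixes matter because issue text often names a symbol (`separable_matrix`) that lives inside a module file rather than in a file of its own.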
graphiti_head_to_head_eval.py runs without API keys for reproducibility. A full Graphiti pipeline with GPT-4-class extraction might close some of the gap, at materially higher cost per ingest.

Localization leads only matter if they convert into downstream solve wins. Running the actual swebench test suite on patches Codex CLI produces with vs. without a state-trace brief:
| arm | resolved | unresolved | errored | solve-rate |
|---|---|---|---|---|
| state_trace | 7 | 3 | 10 | 7/20 = 35% |
| no_memory | 7 | 2 | 11 | 7/20 = 35% |
Same aggregate solve-rate, but the two arms solve different instances.
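To see whether the arms are complementary, compare their resolved-instance sets directly; a small helper (the instance IDs below are placeholders for illustration, not the actual benchmark results):

```python
def arm_overlap(resolved_a: set, resolved_b: set) -> dict:
    # Instances solved by both arms, and by each arm exclusively.
    return {
        "both": resolved_a & resolved_b,
        "only_a": resolved_a - resolved_b,
        "only_b": resolved_b - resolved_a,
    }

# Placeholder IDs for illustration only.
state_trace_arm = {"inst-1", "inst-2", "inst-3"}
no_memory_arm = {"inst-2", "inst-4", "inst-5"}
print(arm_overlap(state_trace_arm, no_memory_arm))
```

A union arm that could take either patch would resolve `both ∪ only_a ∪ only_b`, which is why differing solve sets matter even at equal aggregate rates.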
Honest read:
Typed coding-agent ontology, not generic Entity/Edge:
- Node types: `task`, `observation`, `decision`, `file`, `goal`, `session`, `command`, `test`, `symbol`, `patch_hunk`, `error_signature`, `episode`
- Edge types: `patches_file`, `fails_in`, `verified_by`, `rejected_by`, `supersedes`, `contradicts`, `solves`, `derived_from`, `precedes`, `motivates`, and more
- Retrieval intents: `locate_file`, `failure_analysis`, `history`, `general`

Bounded working memory as a first-class constraint:
- `enforce_capacity()` runs decay, compression, and summarization on every step.
- `current_state(session)` answers "what's live right now" directly — cheap for state-trace, expensive for a general-purpose knowledge graph.
- `failed_hypotheses(session)` returns invalidated, superseded, or unrecovered-error nodes — the "don't propose this again" signal.

Local-first, MCP-mountable:
- The authoritative graph is an in-process `networkx.MultiDiGraph`. Cold storage is WAL SQLite+FTS5.
- `state-trace-mcp` is a stdio MCP server you can mount in Claude Code / Cursor / Codex CLI.

See ARCHITECTURE.md for why these choices matter vs. Graphiti, and BENCHMARKS.md for the smaller repo-local benchmarks.
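The capacity bound can be illustrated with a toy decay-and-evict loop. This is a sketch with invented weights and policy; the real `enforce_capacity()` also compresses and summarizes rather than just evicting:

```python
def enforce_capacity_sketch(nodes: list[dict], budget: float) -> list[dict]:
    # Decay: every node's importance fades a little each step.
    for n in nodes:
        n["importance"] *= 0.9
    # Evict: keep the most important nodes whose total cost fits the budget.
    kept, used = [], 0.0
    for n in sorted(nodes, key=lambda n: n["importance"], reverse=True):
        if used + n["cost"] <= budget:
            kept.append(n)
            used += n["cost"]
    return kept

nodes = [
    {"id": "task-1", "importance": 0.9, "cost": 1.0},
    {"id": "obs-1", "importance": 0.3, "cost": 1.0},
    {"id": "obs-2", "importance": 0.7, "cost": 1.0},
]
print([n["id"] for n in enforce_capacity_sketch(nodes, budget=2.0)])
# → ['task-1', 'obs-2']
```

The point of the bound is that the working set can never grow past the budget, no matter how long the debugging session runs.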
Graphiti is the stronger general-purpose temporal knowledge graph for AI agents. state-trace is narrower: working memory for one coding/debugging session at a time. We're not claiming to replace Graphiti — we're claiming a specific lane where the tradeoffs land differently.
Each row below is a concrete, measured axis, not a vibe.
| Axis | state-trace | Graphiti | Winner for coding agents |
|---|---|---|---|
| Artifact@1 on SWE-bench-Verified, n=500 | 0.254 [0.218, 0.290] | 0.098 [0.072, 0.126] | state-trace — non-overlapping 95% CIs |
| Artifact@5 on SWE-bench-Verified, n=500 | 0.376 [0.336, 0.414] | 0.216 [0.182, 0.254] | state-trace — non-overlapping 95% CIs |
| Per-retrieval latency (same benchmark) | 15 ms | 4,851 ms | state-trace — ~320× faster |
| Write path per agent step | Typed insert, zero LLM calls | add_episode → LLM entity extraction each step | state-trace — cheaper, deterministic, no API key |
| Default deploy | Pure Python + local SQLite/JSON; state-trace-mcp stdio binary | Neo4j / Kuzu / FalkorDB graph DB + embedder + LLM | state-trace — local-first, no external services |
| Coding-agent ontology | Typed: file, patch_hunk, error_signature, test, command, symbol, observation, decision, task, goal, session, episode | Generic EntityNode / EntityEdge / EpisodicNode | state-trace — retrieval scorer routes on these types |
| "What's true right now in this session?" | engine.current_state(session) — direct O(graph) query | Inferred from temporal facts via Cypher or LLM | state-trace — first-class API |
| "What have I already tried and rejected?" | engine.failed_hypotheses(session) — direct query returning invalid_at + superseded + unrecovered-error nodes | Has to be inferred from invalid_at + contradictions | state-trace — first-class API |
| Working-memory capacity bound | enforce_capacity with decay + compression + lifecycle retention. Long-horizon pressure benchmark: Artifact@1 0.771 while staying within a 96-unit budget 100% of the time | Unbounded by design; relies on the graph DB to scale | state-trace for long debugging sessions that need a memory ceiling |
| Small-model brief | retrieve_brief produces ~220-token structured brief (patch_file, rerun_command, tests_to_rerun, failed_attempts, recommended_actions, …) that fits a tight budget | Returns raw nodes/facts; caller compresses | state-trace — built for small-model harnesses |
| MCP-mountable | state-trace-mcp stdio server in the [mcp] extra — 11 tools exposed, drop into ~/.claude/settings.json | No official MCP server; library-first | state-trace — plug straight into Claude Code / Cursor / Codex / opencode |
| Long-lived temporal knowledge across weeks | Scoped to a session or repo namespace; no cross-namespace fact merging | First-class; bi-temporal validity, contradiction resolution, fact supersession across episodes | Graphiti |
| Multi-tenant SaaS scale | Single-writer process model; authoritative graph is in-process networkx | Built for it on Neo4j/Kuzu substrate | Graphiti |
| Cross-session learning about users / orgs / policies | Out of scope | First-class | Graphiti |
Use state-trace when:
Use Graphiti when:
They solve adjacent problems. The only reason a comparison is even interesting is that both ship as "memory for AI agents" — the honest answer is they're different products that happen to live on the same shelf.
```shell
uv sync                       # or: pip install -e .
pip install -e ".[mcp]"       # stdio MCP server for Claude Code / Cursor / Codex CLI
pip install -e ".[bench]"     # graphiti-core[kuzu] + datasets (for the headline benchmark)
pip install -e ".[llm]"       # OpenAI-backed live benchmarks + LLM ingestion
pip install -e ".[adapters]"  # LangGraph / LlamaIndex adapter shims
pip install -e ".[api]"       # FastAPI app
```
Distribution name: state-trace. Python import path: state_trace.
```python
from state_trace import MemoryEngine

engine = MemoryEngine(capacity_limit=24.0, storage_path="memory.json")

task = engine.store(
    "Fix login by tracing the refresh token path",
    {"type": "task", "session": "auth-debug", "goal": "restore login", "file": "auth.ts", "importance": 0.92},
)
engine.store(
    "Login still returns 401 after refresh token exchange",
    {"type": "observation", "session": "auth-debug", "goal": "restore login", "file": "auth.ts",
     "blocks": [task.id], "importance": 0.88},
)
engine.store(
    "Authorization header is dropped before the retry request reaches auth.ts",
    {"type": "decision", "session": "auth-debug", "goal": "restore login",
     "related_to": [task.id], "file": "auth.ts", "importance": 0.91},
)

result = engine.retrieve("Why is login still broken?", {"session": "auth-debug", "goal": "restore login"})
```
The architectural wedge. These APIs return a live view of the session without re-ranking:
```python
state = engine.current_state(session="auth-debug", goal="restore login")
# → {"active_task": ..., "latest_observation": ..., "active_files": [...], ...}

failures = engine.failed_hypotheses(session="auth-debug")
# → [{"id": ..., "reason": ["superseded"], "content": "Login still returns 401 ..."}, ...]
```
current_state filters out invalidated and superseded nodes; failed_hypotheses surfaces them as "do not propose again" context. A general-purpose temporal graph has to infer this from fact updates; here it's a direct query.
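Conceptually the split is a partition on node lifecycle flags. A hedged sketch (the `invalid_at` / `superseded_by` field names mirror the README's description; the rest is illustrative, not the library's internals):

```python
def split_live_and_failed(nodes: list[dict]):
    """Partition nodes into a current_state-style live view and a
    failed_hypotheses-style 'do not propose again' list."""
    live, failed = [], []
    for n in nodes:
        reasons = []
        if n.get("invalid_at"):
            reasons.append("invalidated")
        if n.get("superseded_by"):
            reasons.append("superseded")
        if reasons:
            failed.append({"id": n["id"], "reason": reasons, "content": n["content"]})
        else:
            live.append(n)  # only these are visible to the live view
    return live, failed

nodes = [
    {"id": 1, "content": "retry with backoff", "superseded_by": 2},
    {"id": 2, "content": "fix header propagation"},
]
live, failed = split_live_and_failed(nodes)
print(failed)
# → [{'id': 1, 'reason': ['superseded'], 'content': 'retry with backoff'}]
```

Because the flags live on the nodes, both views fall out of a single pass; nothing has to be inferred from fact histories.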
```shell
pip install -e ".[mcp]"
state-trace-mcp
```
Environment config:
- `STATE_TRACE_STORAGE_PATH` — durable path; `.db`/`.sqlite` uses the SQLite backend. Default: `~/.state-trace/memory.db`.
- `STATE_TRACE_NAMESPACE` — default namespace (e.g. the repo slug).
- `STATE_TRACE_CAPACITY_LIMIT` — working-memory budget (default 256).

Tools exposed: `store`, `retrieve`, `retrieve_brief`, `record_action`, `record_observation`, `record_test_result`, `ingest_agent_log_file`, `current_state`, `failed_hypotheses`, `list_namespaces`, `graph_snapshot`.
Example Claude Code config (~/.claude/settings.json):
```json
{
  "mcpServers": {
    "state-trace": {
      "command": "state-trace-mcp",
      "env": {
        "STATE_TRACE_STORAGE_PATH": "/Users/me/.state-trace/memory.db",
        "STATE_TRACE_NAMESPACE": "repo-x"
      }
    }
  }
}
```
```python
from state_trace import MemoryEngine

engine = MemoryEngine(capacity_limit=256.0)
ctx = {"session": "auth-debug", "goal": "restore login", "repo": "example/auth-service"}

engine.record_action('open "src/auth.ts"', {**ctx, "files": ["src/auth.ts"]})
engine.record_observation(
    "AttributeError: login still fails with a 401 in src/auth.ts",
    {**ctx, "files": ["src/auth.ts"], "status": "error"},
)
engine.record_action('edit "src/auth.ts"', {**ctx, "files": ["src/auth.ts"], "action_kind": "edit"})
engine.record_test_result(
    "pytest tests/test_auth.py::test_refresh_retry",
    "tests/test_auth.py::test_refresh_retry PASSED",
    {**ctx, "files": ["src/auth.ts", "tests/test_auth.py::test_refresh_retry"]},
)

brief = engine.retrieve_brief(
    "Which file should I patch and what test should I rerun?",
    {"session": "auth-debug", "goal": "restore login"},
    mode="small_model",
)
```
The brief fields: patch_file, rerun_command, target_files, tests_to_rerun, current_state, failed_attempts, recommended_actions, evidence, symbols, patch_hints, confidence, token_estimate.
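One way to consume such a brief in a small-model harness is to flatten a few high-signal fields into a compact prompt. This packing function is an assumption for illustration, not part of the library:

```python
def brief_to_prompt(brief: dict) -> str:
    # Keep only the highest-signal fields to stay inside a tight token budget;
    # the field selection here is a guess, not the library's mode="small_model" policy.
    keys = ("patch_file", "rerun_command", "tests_to_rerun",
            "failed_attempts", "recommended_actions")
    lines = [f"{k}: {brief[k]}" for k in keys if brief.get(k)]
    return "\n".join(lines)

brief = {
    "patch_file": "src/auth.ts",
    "rerun_command": "pytest tests/test_auth.py::test_refresh_retry",
    "failed_attempts": ["retry with backoff"],
    "confidence": 0.8,  # deliberately excluded from the prompt by this sketch
}
print(brief_to_prompt(brief))
```

Dropping empty fields keeps the prompt near the ~220-token budget the README cites.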
```python
from state_trace import MemoryEngine

engine = MemoryEngine(capacity_limit=256.0)
engine.store_agent_log_file("examples/data/agent_logs/marshmallow__marshmallow-1867.json")
```
Supported inputs: normalized agent_log JSON, raw SWE-agent .traj files, raw OpenHands event JSON logs.
If you've accumulated session history with @razroo/iso-trace, feed it directly:
```shell
# Export a session via iso-trace's CLI
npx @razroo/iso-trace export <session-id> --json --out session.json
```
```python
from state_trace import MemoryEngine
from state_trace.iso_trace_adapter import ingest_iso_trace_session

engine = MemoryEngine(capacity_limit=256.0, namespace="my-repo")
ingest_iso_trace_session(engine, "session.json")
```
The adapter reads iso-trace's documented Session → Turn → Event[] JSON and converts it to state-trace's agent_log format — typed nodes for files, edits, tests, errors. Months of accumulated harness history become queryable working memory without re-running the agent.
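The conversion is essentially a flattening of the nested Session → Turn → Event[] shape into flat typed records. A hypothetical sketch (every field name below is an assumption about the JSON, not iso-trace's documented schema):

```python
def flatten_session(session: dict) -> list[dict]:
    """Flatten a Session -> Turn -> Event[] document into typed records.

    Illustrative only: the real adapter maps events onto state-trace's
    agent_log format with richer typing (files, edits, tests, errors).
    """
    records = []
    for turn in session.get("turns", []):
        for event in turn.get("events", []):
            records.append({
                "type": event.get("kind", "observation"),  # e.g. edit, test, error
                "content": event.get("text", ""),
                "session": session.get("id", ""),
            })
    return records

print(flatten_session({"id": "s1", "turns": [{"events": [{"kind": "edit", "text": "x"}]}]}))
# → [{'type': 'edit', 'content': 'x', 'session': 's1'}]
```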
examples/swebench_verified_solve_rate.py scaffolds end-to-end solve-rate measurement: state-trace brief → LLM patch proposal → SWE-bench-Verified prediction JSONL. It does not run the swebench docker harness; that step is documented in the script's header.
```shell
python3 examples/swebench_verified_solve_rate.py --limit 5 --model gpt-5.1-mini --dry-run
```
MemoryEngine(storage_path=...) picks the backend from the file extension:
- `.db` / `.sqlite` / `.sqlite3` — durable SQLite with WAL journal + FTS5 seed index. Recommended for long-running agent harnesses.

See ARCHITECTURE.md for the "why networkx + SQLite, not Neo4j" explainer.
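Extension-based backend selection is a one-liner in spirit; an illustrative dispatcher under the assumption that non-SQLite paths fall back to the JSON snapshot backend used in the quickstart (the actual selection logic may differ):

```python
from pathlib import Path

def pick_backend(storage_path: str) -> str:
    # SQLite extensions get the durable WAL+FTS5 backend; everything else
    # is assumed here to use the JSON backend (e.g. the quickstart's memory.json).
    ext = Path(storage_path).suffix.lower()
    return "sqlite" if ext in {".db", ".sqlite", ".sqlite3"} else "json"

print(pick_backend("memory.db"))    # → sqlite
print(pick_backend("memory.json"))  # → json
```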
```python
engine = MemoryEngine(storage_path="memory.db", namespace="payments-api")
engine.retrieve("why is login broken?")              # scoped to payments-api by default
engine.retrieve("...", include_all_namespaces=True)  # opt out
```
Nodes without a namespace remain visible in every view so pre-namespace data is not lost.
```python
from state_trace.adapters import StateTraceLangGraphMemory, StateTraceLlamaIndexMemory

lg_memory = StateTraceLangGraphMemory(default_session="coding-session")
li_memory = StateTraceLlamaIndexMemory(session_id="agent-session")
```
Neither adapter imports the host framework; they satisfy the duck-typed memory contract used by each.
```python
from state_trace.api import app  # POST /store, /retrieve, /retrieve_brief, GET /graph
```
Pass "explain": true on retrieve to include per-node score breakdowns.
```shell
python3 -m pytest -q
```
The full set of repo-local benchmarks and their honest caveats lives in BENCHMARKS.md. The SWE-bench-Verified row above is the only one at a scale worth citing externally.
See vs. Graphiti above for the head-to-head comparison and ARCHITECTURE.md for the architecture tradeoffs in detail. tl;dr: different products, adjacent problems — state-trace owns the narrow coding-agent working-memory lane; Graphiti owns weeks-of-history temporal knowledge graphs.