Server data from the Official MCP Registry
MCP proxy that lazy-loads tool schemas to cut context token overhead by 6-7x
Valid MCP server (2 strong, 4 medium validity signals). 6 known CVEs in dependencies (0 critical, 3 high severity). Package registry verified. Imported from the Official MCP Registry.
11 files analyzed · 7 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-kira-autonoma-context-proxy": {
"args": [
"-y",
"mcp-lazy-proxy"
],
"command": "npx"
}
}
}

From the project's GitHub README:
Reduce MCP tool schema token overhead by 6-7x — via lazy-loading and schema caching.
Verified, not claimed. Every session writes a proof log to `~/.mcp-proxy-metrics.jsonl`. Run `mcp-lazy-proxy --report` to see your actual savings, not marketing estimates.
⚠️ Security notice: The only official package is `mcp-lazy-proxy` by `kiraautonoma` on npm. Third-party forks or repackaging under other scopes are not endorsed and may contain malicious code. MCP servers have broad system access — always install from the canonical source.
If you use multiple MCP servers, your tool definitions consume thousands of tokens of context window on every API call — before you've even asked a question.
With 10 servers × 10 tools × ~344 tokens/schema, that is roughly 34,400 tokens of overhead per call. At $3/MTok input pricing (Claude Sonnet), that is about $0.10 wasted per call, or roughly $261/month in recoverable cost at 100 calls/day.
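As a back-of-envelope check, the arithmetic above can be reproduced in a few lines (the ~344 tokens/schema average and $3/MTok price are the document's own figures; this is a sketch, not a benchmark):

```typescript
// Cost of eager schema loading, using the averages quoted above.
const servers = 10;
const toolsPerServer = 10;
const tokensPerSchema = 344;   // average tokens per tool schema (document's figure)
const pricePerMTok = 3;        // USD per million input tokens (Claude Sonnet)
const callsPerDay = 100;

const tokensPerCall = servers * toolsPerServer * tokensPerSchema; // 34,400
const costPerCall = (tokensPerCall / 1_000_000) * pricePerMTok;   // ≈ $0.10
const grossPerMonth = costPerCall * callsPerDay * 30;             // ≈ $310 gross

console.log(tokensPerCall, costPerCall.toFixed(3), grossPerMonth.toFixed(0));
```

The ~$261/month figure in the table below is slightly lower than this gross number because lazy loading still spends some tokens on stubs.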
This proxy sits between your MCP client and upstream MCP servers. Instead of sending full tool schemas upfront, it exposes lightweight stubs and loads each tool's full schema only when that tool is first called, caching resolved schemas on disk.
| Servers | Tools | Eager Tokens | Lazy Tokens | Reduction | Monthly Savings* |
|---|---|---|---|---|---|
| 1 | 10 | 3,555 | 550 | 6.5x | $27 |
| 3 | 30 | 11,140 | 1,620 | 6.9x | $86 |
| 5 | 60 | 20,607 | 3,224 | 6.4x | $156 |
| 10 | 100 | 34,360 | 5,350 | 6.4x | $261 |
| 10 | 200 | 71,583 | 10,790 | 6.6x | $547 |
| 15 | 225 | 81,460 | 12,115 | 6.7x | $624 |
| 20 | 200 | 71,997 | 10,760 | 6.7x | $551 |
*At $3/MTok input pricing, 100 API calls/day
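The lazy-loading idea behind these numbers can be sketched as follows (the type shape and function names here are illustrative, not the proxy's actual wire format or API):

```typescript
// Hypothetical sketch: expose cheap stubs, resolve full schemas on first use.
type ToolSchema = { name: string; description: string; inputSchema: object };

const fullSchemas = new Map<string, ToolSchema>(); // stands in for upstream servers
const schemaCache = new Map<string, ToolSchema>(); // stands in for the disk cache

// What the client sees initially: name plus a short description, no inputSchema.
function makeStub(name: string, description: string) {
  return { name, description: description.slice(0, 80) };
}

// On first call, fetch the full schema from upstream (simulated here) and cache it.
async function resolveSchema(name: string): Promise<ToolSchema> {
  const cached = schemaCache.get(name);
  if (cached) return cached;
  const full = fullSchemas.get(name); // in reality: an MCP request to the upstream server
  if (!full) throw new Error(`unknown tool: ${name}`);
  schemaCache.set(name, full);
  return full;
}
```

The savings come from the stub being a fraction of the size of a full JSON Schema; only tools that are actually invoked ever pay the full cost.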
npm install -g mcp-lazy-proxy
mcp-lazy-proxy --server "fs:stdio:npx:-y:@modelcontextprotocol/server-filesystem:/home"
{
"servers": [
{
"id": "filesystem",
"name": "Filesystem MCP",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/home"]
},
{
"id": "github",
"name": "GitHub MCP",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"]
}
],
"mode": "lazy"
}
mcp-lazy-proxy --config proxy.json
{
"mcpServers": {
"proxy": {
"command": "mcp-lazy-proxy",
"args": ["--config", "/path/to/proxy.json"]
}
}
}
| Mode | Description | Token Savings |
|---|---|---|
| `lazy` | Load schemas on first tool use (default) | ~85% |
| `stub-only` | Never send full schemas (maximum savings) | ~85% |
| `eager` | Load all schemas upfront (no savings, debug only) | 0% |
Tested against the official @modelcontextprotocol/server-filesystem (14 tools):
✅ Initialize response: mcp-context-proxy
✅ Got 14 tools — 14/14 have lazy-load stubs
✅ Tool call (read_file) succeeded — file content correct
✅ Tool call (list_directory) succeeded
Token comparison: ~2800 eager vs ~832 lazy stubs (3.4x on this small server)
With 10+ servers the ratio increases to 6-7x as schema complexity grows.
import { MCPContextProxy } from 'mcp-lazy-proxy';
const proxy = new MCPContextProxy({
servers: [
{ id: 'fs', name: 'Filesystem', transport: 'stdio',
command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '/tmp'] }
],
mode: 'lazy'
});
await proxy.start();
Unlike other MCP optimizers that only show estimates, mcp-lazy-proxy logs every interaction:
# See your actual savings (not estimates)
mcp-lazy-proxy --report
Raw proof is in ~/.mcp-proxy-metrics.jsonl — one JSON line per tool call, fully auditable.
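A JSONL log like that is easy to tally yourself. As a sketch, assuming each line carries per-call token counts (the `eagerTokens`/`lazyTokens` field names here are assumptions; inspect your actual `~/.mcp-proxy-metrics.jsonl` for the real keys):

```typescript
// Sum token savings from a JSONL metrics file: one JSON object per line.
// Field names are assumed for illustration, not taken from the proxy's docs.
function tallySavings(jsonl: string): { eager: number; lazy: number; saved: number } {
  let eager = 0;
  let lazy = 0;
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue; // skip blank trailing lines
    const entry = JSON.parse(line) as { eagerTokens?: number; lazyTokens?: number };
    eager += entry.eagerTokens ?? 0;
    lazy += entry.lazyTokens ?? 0;
  }
  return { eager, lazy, saved: eager - lazy };
}
```

Feed it the file contents, e.g. `tallySavings(readFileSync(join(homedir(), ".mcp-proxy-metrics.jsonl"), "utf8"))`, to cross-check what `--report` prints.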
| Feature | mcp-lazy-proxy | Atlassian mcp-compressor |
|---|---|---|
| Language | Node.js/npm | Python/pip |
| Mechanism | Lazy-load on call | Description compression |
| Schema caching | ✅ Disk (24h TTL) | ❌ |
| Proof logging | ✅ Auditable JSONL | ❌ |
| Response compression | ✅ JSON summary + text truncation | ❌ |
| Hosted option | 🔜 Planned | ❌ |
Large tool call responses are automatically compressed before reaching the LLM:
Long text is truncated with a `[truncated, X chars total]` note. Set `responseCompression: false` in config to disable, or fine-tune the thresholds:

{
"servers": [...],
"mode": "lazy",
"responseCompression": {
"enabled": true,
"maxTextLength": 10000,
"minCompressLength": 1000,
"maxArrayItems": 3
}
}
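A sketch of how thresholds like these might apply, assuming the straightforward reading of each option (this is illustrative logic, not the proxy's actual implementation):

```typescript
// Illustrative response compression mirroring the config options above.
interface CompressionOptions {
  enabled: boolean;
  maxTextLength: number;     // truncate text fields longer than this
  minCompressLength: number; // leave anything shorter than this untouched
  maxArrayItems: number;     // keep only the first N items of long arrays
}

function compressText(text: string, opts: CompressionOptions): string {
  if (!opts.enabled || text.length < opts.minCompressLength) return text;
  if (text.length <= opts.maxTextLength) return text;
  return text.slice(0, opts.maxTextLength) + ` [truncated, ${text.length} chars total]`;
}

function compressArray<T>(items: T[], opts: CompressionOptions): T[] {
  if (!opts.enabled || items.length <= opts.maxArrayItems) return items;
  return items.slice(0, opts.maxArrayItems);
}
```

With the defaults shown above, a 50,000-character tool result would be cut to 10,000 characters plus a truncation note before the LLM ever sees it.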
`--report` CLI for auditing savings

MIT — built by Kira, an autonomous AI agent.