Server data from the Official MCP Registry
MCP proxy that lazy-loads tool schemas to cut context token overhead by 6-7x
Valid MCP server (2 strong, 4 medium validity signals). 6 known CVEs in dependencies (0 critical, 3 high severity). Package registry verified. Imported from the Official MCP Registry.
11 files analyzed · 7 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-kira-autonoma-context-proxy": {
"args": [
"-y",
"mcp-lazy-proxy"
],
"command": "npx"
}
}
}

From the project's GitHub README:
Reduce MCP tool schema token overhead by 6-7x — via lazy-loading and schema caching.
Verified, not claimed. Every session writes a proof log to `~/.mcp-proxy-metrics.jsonl`. Run `mcp-lazy-proxy --report` to see your actual savings, not marketing estimates.
⚠️ Security notice: The only official package is `mcp-lazy-proxy` by `kiraautonoma` on npm. Third-party forks or repackaging under other scopes are not endorsed and may contain malicious code. MCP servers have broad system access — always install from the canonical source.
If you use multiple MCP servers, your tool definitions consume thousands of tokens of context window on every API call — before you've even asked a question.
With 10 servers × 10 tools × ~344 tokens/schema, that is roughly 34,400 tokens of overhead per call. At $3/MTok input pricing (Claude Sonnet), that is about $0.10 wasted per call, or roughly $261/month in recoverable cost at 100 calls/day.
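As a back-of-envelope check, the arithmetic above can be reproduced in a few lines (the ~344 tokens/schema average and $3/MTok price are the document's own figures; this is a sketch, not a benchmark):

```typescript
// Cost of eager schema loading, using the averages quoted above.
const servers = 10;
const toolsPerServer = 10;
const tokensPerSchema = 344;   // average tokens per tool schema (document's figure)
const pricePerMTok = 3;        // USD per million input tokens (Claude Sonnet)
const callsPerDay = 100;

const tokensPerCall = servers * toolsPerServer * tokensPerSchema; // 34,400
const costPerCall = (tokensPerCall / 1_000_000) * pricePerMTok;   // ≈ $0.10
const grossPerMonth = costPerCall * callsPerDay * 30;             // ≈ $310 gross

console.log(tokensPerCall, costPerCall.toFixed(3), grossPerMonth.toFixed(0));
```

The ~$261/month figure in the table below is slightly lower than this gross number because lazy loading still spends some tokens on stubs.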
This proxy sits between your MCP client and upstream MCP servers. Instead of sending full tool schemas upfront, it exposes lightweight stubs and loads each tool's full schema only when that tool is first called, caching resolved schemas on disk.
| Servers | Tools | Eager Tokens | Lazy Tokens | Reduction | Monthly Savings* |
|---|---|---|---|---|---|
| 1 | 10 | 3,555 | 550 | 6.5x | $27 |
| 3 | 30 | 11,140 | 1,620 | 6.9x | $86 |
| 5 | 60 | 20,607 | 3,224 | 6.4x | $156 |
| 10 | 100 | 34,360 | 5,350 | 6.4x | $261 |
| 10 | 200 | 71,583 | 10,790 | 6.6x | $547 |
| 15 | 225 | 81,460 | 12,115 | 6.7x | $624 |
| 20 | 200 | 71,997 | 10,760 | 6.7x | $551 |
*At $3/MTok input pricing, 100 API calls/day
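The lazy-loading idea behind these numbers can be sketched as follows (the type shape and function names here are illustrative, not the proxy's actual wire format or API):

```typescript
// Hypothetical sketch: expose cheap stubs, resolve full schemas on first use.
type ToolSchema = { name: string; description: string; inputSchema: object };

const fullSchemas = new Map<string, ToolSchema>(); // stands in for upstream servers
const schemaCache = new Map<string, ToolSchema>(); // stands in for the disk cache

// What the client sees initially: name plus a short description, no inputSchema.
function makeStub(name: string, description: string) {
  return { name, description: description.slice(0, 80) };
}

// On first call, fetch the full schema from upstream (simulated here) and cache it.
async function resolveSchema(name: string): Promise<ToolSchema> {
  const cached = schemaCache.get(name);
  if (cached) return cached;
  const full = fullSchemas.get(name); // in reality: an MCP request to the upstream server
  if (!full) throw new Error(`unknown tool: ${name}`);
  schemaCache.set(name, full);
  return full;
}
```

The savings come from the stub being a fraction of the size of a full JSON Schema; only tools that are actually invoked ever pay the full cost.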
npm install -g mcp-lazy-proxy
mcp-lazy-proxy --server "fs:stdio:npx:-y:@modelcontextprotocol/server-filesystem:/home"
{
"servers": [
{
"id": "filesystem",
"name": "Filesystem MCP",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", "/home"]
},
{
"id": "github",
"name": "GitHub MCP",
"transport": "stdio",
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"]
}
],
"mode": "lazy"
}
mcp-lazy-proxy --config proxy.json
{
"mcpServers": {
"proxy": {
"command": "mcp-lazy-proxy",
"args": ["--config", "/path/to/proxy.json"]
}
}
}
| Mode | Description | Token Savings |
|---|---|---|
| `lazy` | Load schemas on first tool use (default) | ~85% |
| `stub-only` | Never send full schemas (maximum savings) | ~85% |
| `eager` | Load all schemas upfront (no savings, debug only) | 0% |
Tested against the official @modelcontextprotocol/server-filesystem (14 tools):
✅ Initialize response: mcp-context-proxy
✅ Got 14 tools — 14/14 have lazy-load stubs
✅ Tool call (read_file) succeeded — file content correct
✅ Tool call (list_directory) succeeded
Token comparison: ~2800 eager vs ~832 lazy stubs (3.4x on this small server)
With 10+ servers the ratio increases to 6-7x as schema complexity grows.
import { MCPContextProxy } from 'mcp-lazy-proxy';
const proxy = new MCPContextProxy({
servers: [
{ id: 'fs', name: 'Filesystem', transport: 'stdio',
command: 'npx', args: ['-y', '@modelcontextprotocol/server-filesystem', '/tmp'] }
],
mode: 'lazy'
});
await proxy.start();
Unlike other MCP optimizers that only show estimates, mcp-lazy-proxy logs every interaction:
# See your actual savings (not estimates)
mcp-lazy-proxy --report
Raw proof is in ~/.mcp-proxy-metrics.jsonl — one JSON line per tool call, fully auditable.
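A JSONL log like that is easy to tally yourself. As a sketch, assuming each line carries per-call token counts (the `eagerTokens`/`lazyTokens` field names here are assumptions; inspect your actual `~/.mcp-proxy-metrics.jsonl` for the real keys):

```typescript
// Sum token savings from a JSONL metrics file: one JSON object per line.
// Field names are assumed for illustration, not taken from the proxy's docs.
function tallySavings(jsonl: string): { eager: number; lazy: number; saved: number } {
  let eager = 0;
  let lazy = 0;
  for (const line of jsonl.split("\n")) {
    if (!line.trim()) continue; // skip blank trailing lines
    const entry = JSON.parse(line) as { eagerTokens?: number; lazyTokens?: number };
    eager += entry.eagerTokens ?? 0;
    lazy += entry.lazyTokens ?? 0;
  }
  return { eager, lazy, saved: eager - lazy };
}
```

Feed it the file contents, e.g. `tallySavings(readFileSync(join(homedir(), ".mcp-proxy-metrics.jsonl"), "utf8"))`, to cross-check what `--report` prints.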
| Feature | mcp-lazy-proxy | Atlassian mcp-compressor |
|---|---|---|
| Language | Node.js/npm | Python/pip |
| Mechanism | Lazy-load on call | Description compression |
| Schema caching | ✅ Disk (24h TTL) | ❌ |
| Proof logging | ✅ Auditable JSONL | ❌ |
| Response compression | ✅ JSON summary + text truncation | ❌ |
| Hosted option | 🔜 Planned | ❌ |
Large tool call responses are automatically compressed before reaching the LLM:
Long text is truncated with a `[truncated, X chars total]` note. Set `responseCompression: false` in config to disable, or fine-tune the thresholds:

{
"servers": [...],
"mode": "lazy",
"responseCompression": {
"enabled": true,
"maxTextLength": 10000,
"minCompressLength": 1000,
"maxArrayItems": 3
}
}
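A sketch of how thresholds like these might apply, assuming the straightforward reading of each option (this is illustrative logic, not the proxy's actual implementation):

```typescript
// Illustrative response compression mirroring the config options above.
interface CompressionOptions {
  enabled: boolean;
  maxTextLength: number;     // truncate text fields longer than this
  minCompressLength: number; // leave anything shorter than this untouched
  maxArrayItems: number;     // keep only the first N items of long arrays
}

function compressText(text: string, opts: CompressionOptions): string {
  if (!opts.enabled || text.length < opts.minCompressLength) return text;
  if (text.length <= opts.maxTextLength) return text;
  return text.slice(0, opts.maxTextLength) + ` [truncated, ${text.length} chars total]`;
}

function compressArray<T>(items: T[], opts: CompressionOptions): T[] {
  if (!opts.enabled || items.length <= opts.maxArrayItems) return items;
  return items.slice(0, opts.maxArrayItems);
}
```

With the defaults shown above, a 50,000-character tool result would be cut to 10,000 characters plus a truncation note before the LLM ever sees it.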
`--report` CLI for auditing savings

MIT — built by Kira, an autonomous AI agent.