Server data from the Official MCP Registry
Semantic search over WSO2 docs (APIM, MI, Choreo, Ballerina) via RAG and pgvector.
Valid MCP server (2 strong, 3 medium validity signals). 1 code issue detected. 7 known CVEs in dependencies (0 critical, 5 high severity). Package registry verified. Imported from the Official MCP Registry. 1 finding downgraded by scanner intelligence.
5 files analyzed · 9 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Set these up before or after installing:
Environment variable: DATABASE_URL
Environment variable: EMBEDDING_PROVIDER
Environment variable: OPENAI_API_KEY
Environment variable: GEMINI_API_KEY
Environment variable: VOYAGE_API_KEY
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-iamvirul-wso2-docs-mcp-server": {
"env": {
"DATABASE_URL": "your-database-url-here",
"GEMINI_API_KEY": "your-gemini-api-key-here",
"OPENAI_API_KEY": "your-openai-api-key-here",
"VOYAGE_API_KEY": "your-voyage-api-key-here",
"EMBEDDING_PROVIDER": "your-embedding-provider-here"
},
"args": [
"-y",
"wso2-docs-mcp-server"
],
"command": "npx"
}
}
}
From the project's GitHub README.
A production-ready Model Context Protocol (MCP) server that provides AI assistants (Claude Desktop, Claude Code, Cursor, VS Code) with semantic search over WSO2 documentation via Retrieval-Augmented Generation (RAG).
Under the hood, it uses a blazing-fast dual-ingestion engine:
flowchart TD
%% Styling
classDef github fill:#1f2328,color:#fff,stroke:#e1e4e8
classDef default fill:#0969da,color:#fff,stroke:#0969da
classDef db fill:#218bff,color:#fff,stroke:#218bff
classDef web fill:#0969da,color:#fff,stroke:#0969da
subgraph Sources ["Information Sources"]
GH["GitHub Repos\n(wso2/docs-apim, etc.)"]:::github
Web["WSO2 Websites\n(ballerina.io, lib)"]:::web
end
subgraph Ingestion ["Dual-Ingestion Pipeline"]
A["GitHubDocFetcher\n(Git Trees API)"]
B["DocCrawler\n(HTML scraping)"]
C["MarkdownParser\n(Front-matter & Heads)"]
D["DocParser\n(Cheerio HTML parsing)"]
E["DocChunker\n(Semantic Splitting)"]
end
subgraph Embedding ["Vectorization & Storage"]
F["EmbedderFactory\n(Ollama / HuggingFace)"]
G[("pgvector\n(PostgreSQL)")]:::db
end
%% Flow
GH --> A
Web --> B
A -->|"Raw .md"| C
B -->|"Clean HTML"| D
C -->|"ParsedSection[]"| E
D -->|"ParsedSection[]"| E
E -->|"Tokens/Chunks"| F
F -->|"768-dim Vectors"| G
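The diagram's key design point is that both ingestion paths (GitHub Markdown and crawled HTML) converge on the same ParsedSection[] shape before chunking. A minimal sketch of that idea, noting that ParsedSection appears in the diagram but its fields here are our assumptions, not the project's actual type:

```typescript
// Hypothetical shape both parsers emit; field names are illustrative,
// only the ParsedSection name comes from the diagram above.
interface ParsedSection {
  title: string;      // nearest heading
  content: string;    // plain-text body
  sourceUrl: string;  // canonical docs URL
  product: string;    // e.g. "apim", "ballerina"
}

// Because both paths converge on ParsedSection[], the chunker and embedder
// never need to know whether a page came from Git or from a web crawl.
function toSections(
  raw: { heading: string; body: string }[],
  url: string,
  product: string
): ParsedSection[] {
  return raw.map((r) => ({
    title: r.heading,
    content: r.body,
    sourceUrl: url,
    product,
  }));
}
```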
| Product | URL |
|---|---|
| API Manager | https://apim.docs.wso2.com |
| Micro Integrator | https://mi.docs.wso2.com/en/4.4.0 |
| Choreo | https://wso2.com/choreo/docs |
| Ballerina | https://ballerina.io/learn |
| Ballerina Integrator | https://bi.docs.wso2.com |
| WSO2 Library | https://wso2.com/library |
Choose the setup path that fits your use case:
Install the package globally to get the wso2-docs-mcp-server, wso2-docs-crawl, and wso2-docs-migrate commands available system-wide:
npm install -g wso2-docs-mcp-server
Prefer no global install? You can use npx wso2-docs-mcp-server, npx wso2-docs-crawl, and npx wso2-docs-migrate in every step below; just replace the bare command with its npx equivalent.
Download the docker-compose.yml and start the database:
curl -O https://raw.githubusercontent.com/iamvirul/wso2-docs-mcp-server/main/docker-compose.yml
docker compose up -d
Install Ollama and pull the default embedding model:
ollama pull nomic-embed-text
ollama serve
No Ollama? Skip this step. The server automatically falls back to HuggingFace ONNX — model downloads on first use with no extra setup.
DATABASE_URL="postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs" \
wso2-docs-migrate
Run the migration again whenever you change EMBEDDING_DIMENSIONS (i.e. when you switch embedding provider). The script detects and handles dimension changes automatically.
# Index all products (first run downloads the embedding model automatically)
DATABASE_URL="postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs" \
wso2-docs-crawl
# Index a single product (faster, great for testing)
DATABASE_URL="postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs" \
wso2-docs-crawl --product ballerina --limit 20
# Force re-index even unchanged pages
DATABASE_URL="postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs" \
wso2-docs-crawl --force
Available product IDs: apim, mi, choreo, ballerina, bi, library
The MCP server is launched on demand by your AI client — no background process needed.
Claude Desktop — edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"wso2-docs": {
"command": "wso2-docs-mcp-server",
"env": {
"DATABASE_URL": "postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs",
"EMBEDDING_PROVIDER": "ollama"
}
}
}
}
Claude Code — run once in your terminal:
claude mcp add wso2-docs \
--transport stdio \
-e DATABASE_URL="postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs" \
-e EMBEDDING_PROVIDER="ollama" \
-- wso2-docs-mcp-server
# Verify
claude mcp list
Cursor — create .cursor/mcp.json in your project root:
{
"mcpServers": {
"wso2-docs": {
"command": "wso2-docs-mcp-server",
"env": {
"DATABASE_URL": "postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs",
"EMBEDDING_PROVIDER": "ollama"
}
}
}
}
VS Code — create .vscode/mcp.json:
{
"servers": {
"wso2-docs": {
"type": "stdio",
"command": "wso2-docs-mcp-server",
"env": {
"DATABASE_URL": "postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs",
"EMBEDDING_PROVIDER": "ollama"
}
}
}
}
Using npx instead of a global install? Replace "command": "wso2-docs-mcp-server" with "command": "npx" and add "args": ["-y", "wso2-docs-mcp-server"].
Using a cloud embedding provider? Add the key to env, e.g. "EMBEDDING_PROVIDER": "openai", "OPENAI_API_KEY": "sk-...".
git clone https://github.com/iamvirul/wso2-docs-mcp-server.git
cd wso2-docs-mcp-server
npm install
Install Ollama and start it:
ollama serve
No Ollama? Skip this step. The server detects Ollama is not running and automatically falls back to HuggingFace ONNX inference — the model downloads on first use with no extra setup.
cp .env.example .env
# Defaults work out of the box with Ollama.
# Only edit if using a cloud provider (OpenAI / Gemini / Voyage).
docker compose up -d
# pgAdmin available at http://localhost:5050 (admin@wso2mcp.local / admin)
npm run db:migrate
Note: Run the migration again whenever you change EMBEDDING_DIMENSIONS (i.e. when you switch embedding provider). The script detects and handles dimension changes automatically.
# Index all products
# On first run the embedding model is downloaded automatically (Ollama or HuggingFace)
npm run crawl
# Index a single product (faster, great for testing)
npm run crawl -- --product ballerina --limit 20
# Force re-index even unchanged pages
npm run crawl -- --force
npm run build
npm start
For development (no build step):
npm run dev
Replace /ABSOLUTE/PATH/TO/wso2-docs-mcp-server with your actual clone path.
Claude Desktop — edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
"mcpServers": {
"wso2-docs": {
"command": "node",
"args": ["/ABSOLUTE/PATH/TO/wso2-docs-mcp-server/dist/src/index.js"],
"env": {
"DATABASE_URL": "postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs",
"EMBEDDING_PROVIDER": "ollama"
}
}
}
}
Claude Code:
claude mcp add wso2-docs \
--transport stdio \
-e DATABASE_URL="postgresql://wso2mcp:wso2mcp@localhost:5432/wso2docs" \
-e EMBEDDING_PROVIDER="ollama" \
-- node "/ABSOLUTE/PATH/TO/wso2-docs-mcp-server/dist/src/index.js"
# Verify
claude mcp list
See config-examples/claude_code.sh for a convenience script.
Cursor — create .cursor/mcp.json — see config-examples/cursor_mcp.json.
VS Code — create .vscode/mcp.json — see config-examples/vscode_mcp.json.
| Tool | Description |
|---|---|
| search_wso2_docs | Semantic search across all products. Optional product and limit filters. |
| get_wso2_guide | Search within a specific product (apim, mi, choreo, ballerina, bi, library). |
| explain_wso2_concept | Broad concept search across all products; returns the top 8 results. |
| list_wso2_products | Returns all supported products with IDs and base URLs. |
[
{
"title": "Deploying WSO2 API Manager",
"snippet": "WSO2 API Manager can be deployed in various topologies…",
"source_url": "https://apim.docs.wso2.com/en/latest/install-and-setup/...",
"product": "apim",
"section": "Deployment Patterns",
"score": 0.8712
}
]
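The sample response above can be modeled with a small interface for client-side handling; the SearchHit name and the helper below are ours, not part of the server:

```typescript
// Shape of one search hit, mirroring the sample JSON above.
interface SearchHit {
  title: string;
  snippet: string;
  source_url: string;
  product: string;
  section: string;
  score: number; // similarity score, higher is more relevant
}

// Results arrive ranked by score; re-ranking or truncating client-side
// is a simple sort-and-slice.
function topHits(hits: SearchHit[], limit: number): SearchHit[] {
  return [...hits].sort((a, b) => b.score - a.score).slice(0, limit);
}
```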
The default EMBEDDING_PROVIDER=ollama runs entirely on your machine with no API key. The startup sequence is:
Is Ollama running?
├── Yes → Is model present?
│ ├── Yes → Ready (instant)
│ └── No → Pull via Ollama (streamed, runs once)
└── No → Download ONNX model from HuggingFace Hub (~250 MB, cached after first run) and run inference in-process via @huggingface/transformers
Both paths use nomic-embed-text / Xenova/nomic-embed-text-v1 by default and produce identical 768-dim vectors, so you can switch between them without re-indexing.
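The startup decision tree above reduces to a small piece of pure logic. A hedged sketch, where the probe inputs (ollamaUp, modelPresent) are hypothetical stand-ins for the server's real checks:

```typescript
// The three outcomes of the startup sequence described above.
type EmbedBackend = "ollama-ready" | "ollama-pull" | "hf-onnx";

// Sketch only: real probing would hit the Ollama HTTP API; here the
// results are passed in as booleans.
function chooseBackend(ollamaUp: boolean, modelPresent: boolean): EmbedBackend {
  if (!ollamaUp) return "hf-onnx";      // download ONNX model, run in-process
  return modelPresent
    ? "ollama-ready"                    // instant
    : "ollama-pull";                    // streamed pull, runs once
}
```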
When Ollama is not available, the server auto-detects the best compute backend:
| Machine | Detection | ONNX dtype | Batch size | Throughput |
|---|---|---|---|---|
| Apple Silicon (M1/M2/M3/M4) | process.arch === 'arm64' | q8 INT8 | 32 | ~9 ms/chunk |
| NVIDIA GPU | nvidia-smi probe | fp32 | 64 | GPU-dependent |
| All others | fallback | q8 INT8 | 16 | ~10 ms/chunk |
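The detection table above amounts to a small mapping from machine characteristics to an ONNX profile. A sketch under the assumption that the nvidia-smi probe reduces to a boolean (the function and type names are ours):

```typescript
// ONNX execution profile chosen per machine, as in the table above.
interface OnnxProfile {
  dtype: "q8" | "fp32";
  batchSize: number;
}

// Illustrative mapping: arch comes from process.arch; hasNvidiaGpu stands
// in for the result of probing nvidia-smi.
function detectProfile(arch: string, hasNvidiaGpu: boolean): OnnxProfile {
  if (arch === "arm64") return { dtype: "q8", batchSize: 32 };   // Apple Silicon
  if (hasNvidiaGpu) return { dtype: "fp32", batchSize: 64 };     // NVIDIA GPU
  return { dtype: "q8", batchSize: 16 };                         // all others
}
```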
Why q8 on Apple Silicon instead of CoreML/Metal?
CoreML compiles Metal shaders on first use (~20 min cold-start). For the typical chunk sizes produced by this server (6–20 chunks per page), the CPU↔GPU transfer overhead eliminates any inference gain. INT8 quantized inference on ARM NEON SIMD is consistently ~100× faster than fp32 CPU with zero cold-start cost.
Benchmark (Apple M-chip, Xenova/nomic-embed-text-v1):
fp32 CPU (before): ~1,000 ms/chunk (68 chunks ≈ 68 s of embedding)
q8 ARM NEON: ~9 ms/chunk (68 chunks ≈ 0.6 s of embedding) ← ~100× speedup
Note: For small crawls (≤ 10 pages) total wall-clock time is dominated by network I/O (HTTPS fetches to docs sites), so the end-to-end improvement is modest. The embedding speedup becomes significant at scale, e.g. crawling 500+ pages where embedding previously accounted for hours of runtime. For best crawl performance, run Ollama (ollama serve), which parallelises inference natively and has no per-chunk overhead.
| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | — | PostgreSQL connection string (required) |
| EMBEDDING_PROVIDER | ollama | One of: ollama, openai, gemini, voyage |
| EMBEDDING_DIMENSIONS | 768 | Must match the model's output dimensions |
| CRAWL_CONCURRENCY | 5 | Concurrent HTTP requests during crawl |
| CHUNK_SIZE | 800 | Approximate tokens per chunk |
| CHUNK_OVERLAP | 100 | Overlap tokens between chunks |
| CACHE_TTL_SECONDS | 3600 | In-memory query cache TTL |
| TOP_K_RESULTS | 10 | Default search result count |
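To make the CHUNK_SIZE / CHUNK_OVERLAP semantics concrete, here is a minimal sliding-window chunker over whitespace tokens. This is an illustration only: the project's DocChunker splits semantically, not by a fixed window.

```typescript
// Sliding-window chunking: each chunk holds up to `size` tokens, and
// consecutive chunks share `overlap` tokens so context spans boundaries.
function chunkTokens(tokens: string[], size = 800, overlap = 100): string[][] {
  const chunks: string[][] = [];
  const step = size - overlap; // advance by size minus overlap each time
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + size));
    if (start + size >= tokens.length) break; // last window reached the end
  }
  return chunks;
}
```

With the defaults (800-token chunks, 100-token overlap), each chunk repeats the last 100 tokens of its predecessor, which keeps a sentence that straddles a boundary retrievable from either side.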
| Variable | Default | Description |
|---|---|---|
| OLLAMA_BASE_URL | http://localhost:11434 | Ollama server URL |
| OLLAMA_EMBEDDING_MODEL | nomic-embed-text | Model pulled and used via Ollama |
| HUGGINGFACE_EMBEDDING_MODEL | Xenova/nomic-embed-text-v1 | ONNX fallback when Ollama is not running |
| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | — | Required if EMBEDDING_PROVIDER=openai |
| OPENAI_EMBEDDING_MODEL | text-embedding-3-small | OpenAI model |
| GEMINI_API_KEY | — | Required if EMBEDDING_PROVIDER=gemini |
| GEMINI_EMBEDDING_MODEL | text-embedding-004 | Gemini model |
| VOYAGE_API_KEY | — | Required if EMBEDDING_PROVIDER=voyage |
| VOYAGE_EMBEDDING_MODEL | voyage-3 | Voyage model |
| Provider | Model | Dimensions |
|---|---|---|
| Ollama / HuggingFace | nomic-embed-text / Xenova/nomic-embed-text-v1 | 768 (default) |
| Ollama / HuggingFace | mxbai-embed-large / Xenova/mxbai-embed-large-v1 | 1024 |
| Ollama / HuggingFace | all-minilm / Xenova/all-MiniLM-L6-v2 | 384 |
| OpenAI | text-embedding-3-small | 1536 |
| OpenAI | text-embedding-3-large | 3072 |
| Gemini | text-embedding-004 | 768 |
| Voyage | voyage-3 | 1024 |
| Voyage | voyage-3-lite | 512 |
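Switching providers means EMBEDDING_DIMENSIONS must follow this table. For example, moving to OpenAI's text-embedding-3-small would look like the following .env fragment (values taken from the tables above), after which the migration must be re-run:

```
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=sk-...
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMENSIONS=1536
```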
# Run a one-off re-index (checks hashes, skips unchanged pages)
npm run reindex
# Or from the project directory using node-cron (runs daily at 2 AM)
DATABASE_URL=... node -e "
const { ReindexJob } = require('./dist/jobs/reindexDocs');
const job = new ReindexJob();
job.initialize().then(() => job.scheduleDaily());
"
src/
config/ env.ts · constants.ts
vectorstore/ pgvector.ts · schema.sql
ingestion/ crawler.ts · parser.ts · githubFetcher.ts · markdownParser.ts · chunker.ts · embedder.ts
server/ mcpServer.ts · toolRegistry.ts
jobs/ reindexDocs.ts
index.ts
scripts/
crawl.ts CLI ingestion pipeline
migrate.ts Dynamic schema migration
config-examples/ claude_desktop.json · claude_code.sh · cursor_mcp.json · vscode_mcp.json
docker-compose.yml
.env.example
# Type-check
npx tsc --noEmit
# Run crawl with tsx (no build needed)
npm run crawl -- --product ballerina --limit 5
# Run server in dev mode
npm run dev