Server data from the Official MCP Registry
Extract structured JSON from PDFs, images, DOCX, XLSX, CSV, EML attachments, and DSPy pipelines.
Valid MCP server (2 strong, 4 medium validity signals). No known CVEs in dependencies. Package registry verified. Imported from the Official MCP Registry.
6 files analyzed · 1 issue found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Set these up before or after installing:
Environment variable: CLICHEFACTORY_API_KEY
Environment variable: CLICHEFACTORY_API_URL
Environment variable: LLM_MODEL_NAME
Environment variable: LLM_API_KEY
Environment variable: OCR_MODEL_NAME
Environment variable: OCR_API_KEY
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-clichefactory-clichefactory-mcp": {
"env": {
"LLM_API_KEY": "your-llm-api-key-here",
"OCR_API_KEY": "your-ocr-api-key-here",
"LLM_MODEL_NAME": "your-llm-model-name-here",
"OCR_MODEL_NAME": "your-ocr-model-name-here",
"CLICHEFACTORY_API_KEY": "your-clichefactory-api-key-here",
"CLICHEFACTORY_API_URL": "your-clichefactory-api-url-here"
},
"args": [
"clichefactory-mcp"
],
"command": "uvx"
}
}
}
From the project's GitHub README.
MCP (Model Context Protocol) server for ClicheFactory — structured data extraction from documents.
This server exposes ClicheFactory's extraction and document conversion capabilities as MCP tools, allowing AI assistants in Cursor, Claude Desktop, OpenClaw, and other MCP-compatible clients to extract structured data from PDFs, images, DOCX, XLSX, CSV, EML, and more.
Service mode uses the ClicheFactory cloud for the best extraction quality. You only need one API key.
1. Sign up at clichefactory.com (free pages included, no credit card required).
2. Create an API key in Settings → API Keys (format: cliche-...).
3. Install the MCP server:
   pip install clichefactory-mcp
4. Configure: either paste the key into your MCP client (see below) or run once in a terminal:
   pip install clichefactory # if you don't have the CLI yet
   clichefactory configure
   The interactive wizard saves credentials to ~/.clichefactory/config.toml, which the MCP server reads automatically.
That's it — one env var (CLICHEFACTORY_API_KEY) or a config file, and you're on hosted extraction.
| Tool | Description |
|---|---|
| extract | Extract structured JSON from a document using a schema |
| to_markdown | Convert a document to markdown text |
| doctor | Check configuration, dependencies, and system binaries |
extract
The main tool. Pass a document file and a JSON schema, and get structured data back.
Supports all extraction modes:
| Mode | Description | Requires |
|---|---|---|
| (default) | OCR + LLM extraction | Service API key (recommended) |
| fast | Fastest pipeline | Service API key |
| trained | Trained pipeline artifact | Service + artifact_id |
| robust | Two-stage extract + verify | Service only |
| robust-trained | Trained extract + verification | Service + artifact_id |
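The Requires column can be turned into a small pre-flight check. The sketch below is illustrative only: `check_mode` is a hypothetical helper, not part of the ClicheFactory SDK or the MCP server, and it encodes nothing beyond the artifact_id requirement stated in the table.

```python
from typing import Optional

# Modes that the table lists as needing an artifact_id.
TRAINED_MODES = {"trained", "robust-trained"}


def check_mode(mode: Optional[str], artifact_id: Optional[str] = None) -> None:
    """Raise ValueError if a trained mode is requested without an artifact_id.

    `mode=None` stands for the default mode. This helper is hypothetical and
    only mirrors the table; the server performs its own validation.
    """
    if mode in TRAINED_MODES and not artifact_id:
        raise ValueError(f"mode {mode!r} requires an artifact_id")


check_mode(None)                          # default mode: no artifact needed
check_mode("robust")                      # service only: no artifact needed
check_mode("trained", "my-artifact-id")   # placeholder artifact id
```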
The schema can be provided as:
- A path to a .json schema file
- An inline JSON string (e.g. {"type": "object", "properties": {"invoice_number": {"type": "string"}, "total": {"type": "number"}}})
to_markdown
Converts any supported document to markdown. Useful for inspecting document contents or feeding them to the LLM for analysis before deciding on an extraction schema.
doctor
Runs diagnostics on the ClicheFactory setup: config file, API keys, Python dependencies, system binaries. Call this when things aren't working.
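As a rough illustration of the two schema forms accepted by extract, the following sketch builds the invoice schema from the inline example once and produces both an inline JSON string and a .json schema file. The file name `invoice.schema.json` and the temporary directory are arbitrary choices for the example, not conventions of the tool.

```python
import json
import tempfile
from pathlib import Path

# Same schema as the inline example: two fields of an invoice.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
}

# Form 1: an inline JSON string.
inline_schema = json.dumps(invoice_schema)

# Form 2: a .json schema file (written to a temporary directory here).
schema_path = Path(tempfile.mkdtemp()) / "invoice.schema.json"
schema_path.write_text(json.dumps(invoice_schema, indent=2), encoding="utf-8")

# Both forms describe the same schema.
assert json.loads(inline_schema) == json.loads(schema_path.read_text(encoding="utf-8"))
```

Keeping the schema in a file is usually easier to review and reuse; the inline form is convenient for short, one-off extractions.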
The server defaults to service mode (ClicheFactory cloud). Local mode is available for BYOK / air-gapped use.
service (recommended) — Uses the ClicheFactory cloud service. Requires a ClicheFactory API key. Supports all extraction modes including trained pipelines and robust verification. Best extraction quality out of the box.
local (advanced) — Runs extraction on your machine. You bring your own LLM key (BYOK). Requires pip install "clichefactory-mcp[local]" (~2 GB of parsing/OCR dependencies) plus system binaries (tesseract, LibreOffice). Quality depends on your local setup.
pip install clichefactory-mcp
For local-mode extraction (BYOK, runs on your machine), install with the local extras:
pip install "clichefactory-mcp[local]"
Set these in your MCP client configuration (see below) or in ~/.clichefactory/config.toml via clichefactory configure.
| Variable | Required | Description |
|---|---|---|
| CLICHEFACTORY_API_KEY | Yes (service mode) | ClicheFactory API key from Settings → API Keys (cliche-...) |
| CLICHEFACTORY_API_URL | No | Override the default service URL (https://api.clichefactory.com); useful for local development against a self-hosted ClicheFactory backend |
| LLM_MODEL_NAME | Local mode only | Model name, e.g. gemini/gemini-3-flash-preview |
| LLM_API_KEY | Local mode only | API key for the LLM provider |
| OCR_MODEL_NAME | No | Separate OCR/VLM model (defaults to main model) |
| OCR_API_KEY | No | API key for OCR model (defaults to main key) |
Environment variables take precedence over the config file at ~/.clichefactory/config.toml.
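The precedence rule can be pictured as a two-step lookup. This is a minimal sketch of the idea, not the server's actual code: `resolve_api_key` is a hypothetical helper, and the config key name `api_key` is an assumption, since the layout of config.toml is not documented here.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str], config: Mapping[str, object]) -> Optional[str]:
    """Return the API key, preferring the environment over the config file.

    `config` is the already-parsed content of ~/.clichefactory/config.toml.
    The "api_key" key is hypothetical; check your own config file for the
    real name.
    """
    value = env.get("CLICHEFACTORY_API_KEY") or config.get("api_key")
    return str(value) if value else None


# The environment wins when both sources are set.
both = resolve_api_key({"CLICHEFACTORY_API_KEY": "cliche-env"}, {"api_key": "cliche-file"})
# The config file is used only as a fallback.
fallback = resolve_api_key({}, {"api_key": "cliche-file"})
```

In a real process, `os.environ` would be passed as `env`.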
Add to .cursor/mcp.json in your project (or global Cursor settings):
{
"mcpServers": {
"clichefactory": {
"command": "uvx",
"args": ["clichefactory-mcp"],
"env": {
"CLICHEFACTORY_API_KEY": "cliche-your-key-here"
}
}
}
}
For local development from a git checkout, replace the command and args entries with:
"command": "uv",
"args": ["--directory", "/absolute/path/to/cliche-mcp", "run", "clichefactory-mcp"]
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"clichefactory": {
"command": "uvx",
"args": ["clichefactory-mcp"],
"env": {
"CLICHEFACTORY_API_KEY": "cliche-your-key-here"
}
}
}
}
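Cursor and Claude Desktop take the same server entry, so hand-editing either file risks overwriting servers that are already configured. The sketch below merges the entry into an existing config file instead. `add_clichefactory_server` is a hypothetical helper written for this example; only the entry itself (uvx, clichefactory-mcp, CLICHEFACTORY_API_KEY) comes from the configuration shown above.

```python
import json
from pathlib import Path


def add_clichefactory_server(config_path: Path, api_key: str) -> dict:
    """Add the clichefactory entry to an MCP client config, keeping other servers."""
    config = {}
    if config_path.is_file():
        config = json.loads(config_path.read_text(encoding="utf-8"))

    # Create the mcpServers section if missing, then set only our own entry.
    servers = config.setdefault("mcpServers", {})
    servers["clichefactory"] = {
        "command": "uvx",
        "args": ["clichefactory-mcp"],
        "env": {"CLICHEFACTORY_API_KEY": api_key},
    }

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_clichefactory_server(Path(".cursor/mcp.json"), "cliche-your-key-here")` would update a project-level Cursor config. Restart the client afterwards so it picks up the change.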
Register the MCP server with your OpenClaw agent:
openclaw mcp set clichefactory '{"command":"uvx","args":["clichefactory-mcp"],"env":{"CLICHEFACTORY_API_KEY":"cliche-your-key-here"}}'
Verify with openclaw mcp list. The agent can now use extract, to_markdown, and doctor tools in any conversation.
An OpenClaw skill with agent instructions is also available in integrations/openclaw/. To install it into your workspace:
cp -r /path/to/cliche-mcp/integrations/openclaw ~/.openclaw/skills/clichefactory
Or, once published to ClawHub:
openclaw skills install clichefactory
If you prefer BYOK extraction on your machine, install the local extras and set LLM credentials:
{
"mcpServers": {
"clichefactory": {
"command": "uvx",
"args": ["clichefactory-mcp"],
"env": {
"LLM_MODEL_NAME": "gemini/gemini-3-flash-preview",
"LLM_API_KEY": "your-gemini-api-key"
}
}
}
}
Pass mode="local" explicitly in tool calls, or run clichefactory configure --local to set local as the default in ~/.clichefactory/config.toml.
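For orientation, this is roughly what a tool call with an explicit mode looks like on the wire. The envelope follows the MCP `tools/call` JSON-RPC request shape; the argument names `file_path` and `schema` are assumptions made for this example, since only `mode` is named above. Check the tool's published input schema (for example via the tool listing in your MCP client) for the real parameter names.

```python
import json


def build_extract_call(request_id: int, file_path: str, schema: dict, mode: str = "local") -> str:
    """Build a JSON-RPC tools/call request for the extract tool."""
    arguments = {
        "file_path": file_path,  # hypothetical argument name
        "schema": schema,        # hypothetical argument name
        "mode": mode,            # parameter named in the text above
    }
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "extract", "arguments": arguments},
    }
    return json.dumps(request)


payload = build_extract_call(
    1,
    "/path/to/invoice.pdf",
    {"type": "object", "properties": {"total": {"type": "number"}}},
)
```

In practice the MCP client builds this request itself; the sketch only shows where the mode value travels.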
PDF, PNG, JPG, JPEG, WebP, GIF, BMP, DOCX, DOC, ODT, XLSX, CSV, EML, TXT, MD.
This MCP server covers the core extraction and conversion workflows. The following CLI features are not included in v1:
| Feature | Reason |
|---|---|
| Batch operations (extract-batch, to-markdown-batch) | MCP tools are typically called one-at-a-time by the LLM. For multiple documents, the LLM calls extract in sequence. Batch support may be added in a future version. |
| configure | Interactive prompts don't work in MCP. Use env vars or run clichefactory configure in a terminal. |
| --output / -o flag | MCP tools return results directly to the LLM rather than writing to files. |
| allow_partial | Not exposed as a tool parameter in v1. |
| OCR engine selection | Uses the SDK defaults (RapidOCR). Configure via ~/.clichefactory/config.toml or pass parsing options through the SDK if needed. |
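Without batch tools, processing several documents amounts to a loop of single extract calls. The sketch below shows that pattern from a script's point of view. `extract_in_sequence` and its `call_extract` parameter are hypothetical: `call_extract` stands in for whatever function your MCP client exposes to invoke the extract tool once.

```python
from typing import Any, Callable, Dict, Iterable, List


def extract_in_sequence(
    paths: Iterable[str],
    schema: Dict[str, Any],
    call_extract: Callable[[str, Dict[str, Any]], Any],
) -> List[Dict[str, Any]]:
    """Call extract once per document, recording failures instead of stopping."""
    results: List[Dict[str, Any]] = []
    for path in paths:
        try:
            results.append({"path": path, "result": call_extract(path, schema), "error": None})
        except Exception as exc:  # one bad document should not abort the rest
            results.append({"path": path, "result": None, "error": str(exc)})
    return results
```

Recording errors per document keeps one unreadable file from discarding the results already collected.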
# Install in development mode
uv sync
# Run the server directly (stdio transport, for testing with MCP clients)
uv run clichefactory-mcp
# Inspect available tools (requires mcp CLI)
uv run mcp dev cliche_mcp/server.py
MIT — Copyright (c) 2026 Urban Susnik s.p.