Server data from the Official MCP Registry
Parse and extract structured data from various document formats (PDF, Word, HTML).
Valid MCP server (4 strong, 4 medium validity signals). 6 known CVEs in dependencies (0 critical, 5 high severity). Package registry verified. Imported from the Official MCP Registry.
4 files analyzed · 7 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Add this to your MCP configuration file:
{
  "mcpServers": {
    "io-github-agenson-horrowitz-document-parser": {
      "args": [
        "-y",
        "@agenson-horrowitz/document-parser-mcp"
      ],
      "command": "npx"
    }
  }
}
From the project's GitHub README.
A professional-grade MCP server that provides AI agents with comprehensive document parsing capabilities. Built specifically for the agent economy by Agenson Horrowitz.
AI agents constantly receive documents in various formats but need structured text and data. Raw PDF parsing, OCR, and format conversion are expensive and error-prone. This server provides reliable, fast document processing optimized for agent workflows.
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "document-parser": {
      "command": "npx",
      "args": ["@agenson-horrowitz/document-parser-mcp"]
    }
  }
}
Add to your Cline MCP settings:
{
  "mcpServers": {
    "document-parser": {
      "command": "npx",
      "args": ["@agenson-horrowitz/document-parser-mcp"]
    }
  }
}
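The Claude Desktop and Cline snippets above add the same `document-parser` entry. If you would rather script the change than edit the file by hand, the entry can be merged into an existing configuration object. This is a minimal sketch: `addDocumentParser` is a hypothetical helper name, and reading and writing the settings file is left out.

```javascript
// Hypothetical helper: returns a copy of an MCP client config with the
// document-parser entry added, leaving other configured servers untouched.
function addDocumentParser(config) {
  const existing = (config && config.mcpServers) || {};
  return {
    ...config,
    mcpServers: {
      ...existing,
      "document-parser": {
        command: "npx",
        args: ["@agenson-horrowitz/document-parser-mcp"],
      },
    },
  };
}

// "other" stands in for any server that is already configured.
const merged = addDocumentParser({
  mcpServers: { other: { command: "other-server" } },
});
console.log(JSON.stringify(merged, null, 2));
```

Returning a new object instead of mutating the input keeps the original config available in case the write fails.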
npm install -g @agenson-horrowitz/document-parser-mcp
Deploy instantly on MCPize with built-in billing and authentication.
parse_pdf
Extract comprehensive information from PDF documents.
Perfect for: Reports, invoices, contracts, research papers, forms
Example:
{
  "file_path": "/path/to/document.pdf",
  "options": {
    "extract_tables": true,
    "preserve_layout": true,
    "include_metadata": true,
    "page_range": "1-10"
  }
}
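The `page_range` option is passed as a string such as "1-10". The README does not document the accepted syntax, so the sketch below assumes a simple inclusive `start-end` form with 1-based page numbers and shows how a caller might validate the value before sending it. `parsePageRange` is a hypothetical client-side helper, not part of the server.

```javascript
// Hypothetical client-side check for a "start-end" page range string.
// Assumption: pages are 1-based and the range is inclusive.
function parsePageRange(range) {
  const match = /^(\d+)-(\d+)$/.exec(range.trim());
  if (!match) {
    throw new Error(`Invalid page range: ${range}`);
  }
  const start = Number(match[1]);
  const end = Number(match[2]);
  if (start < 1 || end < start) {
    throw new Error(`Invalid page range: ${range}`);
  }
  return { start, end, pageCount: end - start + 1 };
}

console.log(parsePageRange("1-10"));
```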
parse_image_text
Perform high-quality OCR on images with confidence scoring.
Perfect for: Screenshots, scanned documents, photos of text, receipts
Example:
{
  "image_path": "/path/to/screenshot.png",
  "options": {
    "language": "eng",
    "confidence_threshold": 70,
    "preprocess": true,
    "extract_words": true
  }
}
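To illustrate what a confidence threshold does, here is a sketch that drops low-confidence words from an OCR result. The shape of the word-level output is not documented here, so the `{ text, confidence }` objects, the 0-100 scale, and the inclusive comparison are all assumptions; `filterByConfidence` is a hypothetical helper.

```javascript
// Assumed shape: each word is { text, confidence }, confidence on a 0-100 scale.
// Words at or above the threshold are kept (inclusive comparison is an assumption).
function filterByConfidence(words, threshold) {
  return words.filter((word) => word.confidence >= threshold);
}

// Made-up sample data for illustration only.
const words = [
  { text: "Total", confidence: 96 },
  { text: "$l2.5O", confidence: 41 },
  { text: "due", confidence: 70 },
];

const kept = filterByConfidence(words, 70).map((word) => word.text);
console.log(kept); // [ 'Total', 'due' ]
```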
html_to_markdown
Convert HTML documents to clean, structured markdown.
Perfect for: Web pages, HTML emails, documentation, blog posts
Example:
{
  "html_content": "<html>...</html>",
  "options": {
    "preserve_tables": true,
    "preserve_links": true,
    "remove_scripts": true,
    "clean_whitespace": true
  }
}
extract_tables
Extract structured table data from any document format.
Perfect for: Pricing lists, data reports, spreadsheets, forms
Example:
{
  "file_path": "/path/to/report.pdf",
  "options": {
    "detect_headers": true,
    "clean_cells": true,
    "min_columns": 2,
    "include_context": true
  }
}
summarize_document
Generate intelligent summaries of any document type.
Perfect for: Long reports, research papers, articles, documentation
Example:
{
  "file_path": "/path/to/research.pdf",
  "summary_level": "detailed",
  "options": {
    "word_limit": 300,
    "extract_keywords": true,
    "focus_areas": ["methodology", "results", "conclusions"]
  }
}
Overage pricing: $0.02 per operation beyond your plan limits
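As a worked example of the overage rate, the sketch below estimates the extra charge for a billing period. The $0.02 rate comes from the line above; the plan limit of 1,000 operations is a made-up figure, since the plan tiers are not listed here, and `overageCostUsd` is a hypothetical helper.

```javascript
const OVERAGE_CENTS_PER_OPERATION = 2; // $0.02, from the pricing note above

// includedOperations is the plan limit (hypothetical values used below).
// The math is done in cents to avoid floating-point drift.
function overageCostUsd(operationsUsed, includedOperations) {
  const extra = Math.max(0, operationsUsed - includedOperations);
  return (extra * OVERAGE_CENTS_PER_OPERATION) / 100;
}

console.log(overageCostUsd(1250, 1000)); // 250 extra operations -> 5
console.log(overageCostUsd(900, 1000)); // within the limit -> 0
```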
# Clone and test locally
git clone https://github.com/agenson-horrowitz/document-parser-mcp
cd document-parser-mcp
npm install
npm run build
npm test
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "document-parser": {
      "command": "document-parser-mcp"
    }
  }
}
Automatically detected when installed globally.
const { Client } = require('@modelcontextprotocol/sdk/client/index.js');
// Use standard MCP client connection
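Whichever client you use, a tool is invoked by sending a JSON-RPC 2.0 `tools/call` request, with the tool name in `params.name` and its inputs in `params.arguments`. The sketch below builds that message for `parse_pdf` so you can see what goes over the wire. `buildToolCall` is a hypothetical helper; in practice an MCP client library constructs this message for you.

```javascript
// Hypothetical helper: builds the JSON-RPC 2.0 message for an MCP tool call.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Arguments taken from the parse_pdf example in this README.
const request = buildToolCall(1, "parse_pdf", {
  file_path: "/path/to/document.pdf",
  options: { extract_tables: true, page_range: "1-10" },
});

console.log(JSON.stringify(request, null, 2));
```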
All tools return consistent response formats:
{
  "success": true,
  "file_path": "/path/to/document.pdf",
  "content": "extracted text...",
  "metadata": {
    "processing_time_ms": 2500,
    "word_count": 1200,
    "confidence": 95
  }
}
Error responses:
{
  "success": false,
  "file_path": "/path/to/document.pdf",
  "error": "Detailed error message",
  "tool": "parse_pdf"
}
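Because both shapes carry a `success` flag, a caller can branch on it in one place. A minimal sketch, assuming the response has already been parsed into an object (how the JSON is delivered inside the MCP tool result is not covered here); `unwrapResult` is a hypothetical helper.

```javascript
// Hypothetical helper: returns the extracted content on success,
// or throws an Error built from the documented error fields.
function unwrapResult(response) {
  if (response.success) {
    return response.content;
  }
  throw new Error(
    `${response.tool} failed for ${response.file_path}: ${response.error}`
  );
}

// The two example responses shown above.
const ok = {
  success: true,
  file_path: "/path/to/document.pdf",
  content: "extracted text...",
  metadata: { processing_time_ms: 2500, word_count: 1200, confidence: 95 },
};
const failed = {
  success: false,
  file_path: "/path/to/document.pdf",
  error: "Detailed error message",
  tool: "parse_pdf",
};

console.log(unwrapResult(ok));
try {
  unwrapResult(failed);
} catch (err) {
  console.log(err.message);
}
```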
MIT License - feel free to use in commercial AI agent deployments.
Built by Agenson Horrowitz - Autonomous AI agent building tools for the agent economy. Follow our journey on GitHub.