Server data from the Official MCP Registry
PDF parsing server with text extraction, metadata, search, images, and TOC via MCP
Valid MCP server (2 strong, 1 medium validity signals). 1 known CVE in dependencies. ⚠️ Package registry links to a different repository than the scanned source. Imported from the Official MCP Registry. 1 finding downgraded by scanner intelligence.
10 files analyzed · 2 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Add this to your MCP configuration file:
{
  "mcpServers": {
    "io-github-libres-coder-parseflow": {
      "args": [
        "-y",
        "parseflow-basic-usage-examples"
      ],
      "command": "npx"
    }
  }
}

From the project's GitHub README.
An AI-powered, all-in-one document parsing library
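ParseFlow is a comprehensive document parsing solution that supports PDF, Word, Excel, PowerPoint, and image OCR. It provides a standalone core library and an MCP server that AI assistants can use.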
# Core library
npm install parseflow-core
# MCP server (global install)
npm install -g parseflow-mcp-server
# or run it with npx
npx parseflow-mcp-server
// Each snippet below is standalone; they reuse the names `parser` and `results`.

// PDF: extract text and search
import { PDFParser } from 'parseflow-core';
const parser = new PDFParser();
const text = await parser.extractText('document.pdf');
const results = await parser.search('document.pdf', 'keyword');

// Word: extract text or HTML
import { WordParser } from 'parseflow-core';
const parser = new WordParser();
const result = await parser.extractText('report.docx');
const html = await parser.extractHTML('report.docx');

// Excel: extract data and search cells
import { ExcelParser } from 'parseflow-core';
const parser = new ExcelParser();
const data = await parser.extractData('spreadsheet.xlsx');
const results = await parser.searchText('data.xlsx', 'revenue');

// PowerPoint: extract text and search slides
import { PowerPointParser } from 'parseflow-core';
const parser = new PowerPointParser();
const result = await parser.extractText('presentation.pptx');
const results = await parser.searchText('slides.pptx', 'keyword');
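Since each parser class handles one file type, a caller working with mixed inputs needs some way to choose between them. The following is a minimal sketch, not part of ParseFlow: `parserNameFor` is a hypothetical helper, and the extension list is an assumption taken only from the file names used in the snippets above. It returns the class name rather than an instance, so it runs without `parseflow-core` installed.

```typescript
// Parser class names as they appear in the README snippets.
type ParserName = 'PDFParser' | 'WordParser' | 'ExcelParser' | 'PowerPointParser';

// Assumed mapping: only the extensions seen in the README examples.
const PARSER_BY_EXTENSION = new Map<string, ParserName>([
  ['pdf', 'PDFParser'],
  ['docx', 'WordParser'],
  ['xlsx', 'ExcelParser'],
  ['pptx', 'PowerPointParser'],
]);

// Hypothetical helper: pick a parser class name from the file extension.
// Returns undefined when the path has no extension or an unmapped one.
function parserNameFor(filePath: string): ParserName | undefined {
  const dot = filePath.lastIndexOf('.');
  if (dot < 0 || dot === filePath.length - 1) return undefined;
  const ext = filePath.slice(dot + 1).toLowerCase();
  return PARSER_BY_EXTENSION.get(ext);
}

console.log(parserNameFor('report.docx'));
console.log(parserNameFor('notes.txt'));
```

A caller could then switch on the returned name to construct the matching class from `parseflow-core`; which other extensions each parser accepts is not stated in the README.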
Add the following to claude_desktop_config.json:
{
  "mcpServers": {
    "parseflow": {
      "command": "npx",
      "args": ["-y", "parseflow-mcp-server"]
    }
  }
}
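If the config file already lists other servers, the `parseflow` entry has to be merged in rather than pasted over them. The sketch below shows one way to do that on an in-memory object. `withParseflow` and the `McpConfig` types are hypothetical names introduced for illustration, and the `other` server entry is a placeholder; reading and writing the actual file is left out.

```typescript
// Minimal assumed shape of an MCP client config such as claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: return a copy of `config` with the ParseFlow entry added,
// leaving other servers and top-level settings untouched.
function withParseflow(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      parseflow: { command: 'npx', args: ['-y', 'parseflow-mcp-server'] },
    },
  };
}

// Placeholder existing config with one unrelated server.
const existing: McpConfig = {
  mcpServers: { other: { command: 'node', args: ['other-server.js'] } },
};

const merged = withParseflow(existing);
console.log(JSON.stringify(merged, null, 2));
```

Returning a new object instead of mutating the input keeps the original config available in case the write back to disk fails.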
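| Category | Tool | Description |
|---|---|---|
| PDF | extract_text | Extract text (supports encrypted PDFs) |
| | get_metadata | Get metadata |
| | search_pdf | Full-text search |
| | extract_images | Extract images |
| | get_toc | Get the table of contents |
| | merge_pdf | Merge multiple PDFs |
| | split_pdf | Split into single pages |
| | extract_pdf_pages | Extract specified pages |
| | add_watermark | Add a text watermark |
| | add_image_watermark | Add an image watermark |
| | remove_watermark | Remove a watermark (by covering it) |
| Word | extract_word | Extract text/HTML |
| | search_word | Text search |
| Excel | extract_excel | Extract data |
| | search_excel | Cell search |
| PPT | extract_powerpoint | Extract slides |
| | search_powerpoint | Slide search |
| OCR | extract_ocr | Recognize text in images |
| | search_ocr | Search OCR text |
| AI | semantic_index | Vector indexing of documents |
| | semantic_search | Semantic similarity search |
| Batch | batch_extract | Extract from multiple files in one call |
| | batch_search | Search multiple files in one call |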
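An MCP client invokes any of these tools by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request for `search_pdf`. `buildToolCall` is a hypothetical helper, and the argument names `path` and `query` are assumptions: the README lists the tools but not their input schemas, which a client would normally discover through `tools/list`.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wrap a tool name and its arguments in a tools/call request.
function buildToolCall(name: string, args: Record<string, unknown>): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// `path` and `query` are assumed argument names, not confirmed by the README.
const request = buildToolCall('search_pdf', { path: 'document.pdf', query: 'keyword' });
console.log(JSON.stringify(request));
```

In practice an MCP client library handles this framing; the sketch only makes the wire format behind the tool names visible.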
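| Version | Feature |
|---|---|
| v1.8.0 | 💧 PDF watermarks (text and image) |
| v1.7.0 | 📦 Batch processing (multiple files in parallel) |
| v1.6.0 | 🧠 Semantic search (AI vector embeddings) |
| v1.5.0 | 📄 PDF merge/split/page extraction |
| v1.4.0 | 🔐 Encrypted PDF support |
| v1.3.0 | 🔍 OCR text recognition for images |
| v1.2.0 | 🎯 PowerPoint support |
| v1.1.0 | 📝 Word + 📊 Excel support |
| v1.0.0 | 📄 Basic PDF parsing |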
MIT License - see LICENSE for details
Made with ❤️ by Libres-coder