Scholarly-sources MCP: papers, patents, books, standards — search and cross-reference prior art.
Scholar-MCP is a well-structured educational research server with appropriate security controls for its stated purpose. Authentication via optional API keys is properly handled through environment variables. The codebase demonstrates good practices with comprehensive test coverage, input validation, and proper error handling. Minor code quality observations around broad exception handling and logging do not materially impact security posture given the server's legitimate permissions for network access and file operations. Supply chain analysis found 2 known vulnerabilities in dependencies (1 critical, 1 high severity). Package verification found 1 issue.
3 files analyzed · 8 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Set these up before or after installing:
Environment variable: SCHOLAR_MCP_READ_ONLY
Environment variable: SCHOLAR_MCP_BEARER_TOKEN
Environment variable: SCHOLAR_MCP_BASE_URL
Environment variable: SCHOLAR_MCP_OIDC_CONFIG_URL
Environment variable: SCHOLAR_MCP_OIDC_CLIENT_ID
Environment variable: SCHOLAR_MCP_OIDC_CLIENT_SECRET
Environment variable: SCHOLAR_MCP_OIDC_JWT_SIGNING_KEY
Add this to your MCP configuration file:
{
  "mcpServers": {
    "io-github-pvliesdonk-scholar-mcp": {
      "env": {
        "SCHOLAR_MCP_BASE_URL": "your-scholar-mcp-base-url-here",
        "SCHOLAR_MCP_READ_ONLY": "your-scholar-mcp-read-only-here",
        "SCHOLAR_MCP_BEARER_TOKEN": "your-scholar-mcp-bearer-token-here",
        "SCHOLAR_MCP_OIDC_CLIENT_ID": "your-scholar-mcp-oidc-client-id-here",
        "SCHOLAR_MCP_OIDC_CONFIG_URL": "your-scholar-mcp-oidc-config-url-here",
        "SCHOLAR_MCP_OIDC_CLIENT_SECRET": "your-scholar-mcp-oidc-client-secret-here",
        "SCHOLAR_MCP_OIDC_JWT_SIGNING_KEY": "your-scholar-mcp-oidc-jwt-signing-key-here"
      },
      "args": [
        "pvliesdonk-scholar-mcp"
      ],
      "command": "uvx"
    }
  }
}

From the project's GitHub README.
A FastMCP server for the scholarly citation landscape -- papers, patents, books, and standards -- giving LLMs a unified way to search, cross-reference, and retrieve prior art across all four source types via Semantic Scholar, EPO Open Patent Services, Open Library, and standards bodies (NIST, IETF, W3C, ETSI), with OpenAlex enrichment and optional docling-serve PDF/full-text conversion.
Papers with an ISBN in their externalIds are automatically enriched with publisher, edition, cover URL, and subject data from Open Library.

Tier-2 standards bodies are populated locally via scholar-mcp sync-standards. ISO, IEC, and IEEE have a live-fetch fallback for unsynced identifiers; CC and CEN have no live API and require a sync first. Citations matching standards patterns (RFC, ISO, NIST SP, IEEE, EN, CC) are automatically enriched with structured standard_metadata, including identifier, title, body, status, and full-text URL when available (see docs/guides/standards.md).

.deb and .rpm packages with systemd service and security hardening.

Per-domain depth is uneven. Papers currently have the richest tool surface (citation graph, recommendations, cross-referencing to all three other domains); standards are the leanest. That reflects public data availability, not a value hierarchy: writing a paper typically needs all four source types for citations and prior art. Parity work is tracked in GitHub issues and milestones; the roadmap shows intent, not a completeness commitment.
/plugin marketplace add pvliesdonk/claude-plugins
/plugin install scholar-mcp@pvliesdonk
Download scholar-mcp-<VERSION>.mcpb from the latest release and open it in Claude Desktop, or install via the Claude Desktop MCP gallery.
uvx (recommended)
uvx --from pvliesdonk-scholar-mcp scholar-mcp serve
pip
pip install 'pvliesdonk-scholar-mcp[mcp]'
scholar-mcp serve
docker run -v scholar-mcp-data:/data/scholar-mcp \
ghcr.io/pvliesdonk/scholar-mcp:latest
Download .deb or .rpm from the latest release:
# Debian/Ubuntu
sudo dpkg -i scholar-mcp_*.deb
# RHEL/Fedora
sudo rpm -i scholar-mcp-*.rpm
Note: The PyPI package is pvliesdonk-scholar-mcp. The CLI command installed is scholar-mcp.
uvx --from pvliesdonk-scholar-mcp scholar-mcp serve
API key optional but recommended: The server works without a Semantic Scholar API key, but unauthenticated requests are limited to ~1 req/s and will hit 429 throttles quickly during multi-step operations like citation graph traversal. Request a free key to get ~10 req/s.
Claude Desktop configuration (claude_desktop_config.json):
{
  "mcpServers": {
    "scholar": {
      "command": "uvx",
      "args": ["--from", "pvliesdonk-scholar-mcp", "scholar-mcp", "serve"],
      "env": {
        "SCHOLAR_MCP_S2_API_KEY": "your-key"
      }
    }
  }
}
uvx --from pvliesdonk-scholar-mcp scholar-mcp serve --transport http --port 8000
Tier 2 bodies (ISO, IEC, IEEE, CC, CEN) are populated from community-curated bulk dumps rather than live-scraped at MCP-server runtime. Run the sync on first install and periodically thereafter:
scholar-mcp sync-standards # all registered bodies
scholar-mcp sync-standards --body ISO # only ISO
scholar-mcp sync-standards --body IEEE # only IEEE
scholar-mcp sync-standards --body CC # only Common Criteria
scholar-mcp sync-standards --body CEN # only CEN/CENELEC
scholar-mcp sync-standards --force # re-sync even if upstream SHA is unchanged
Schedule via cron / launchd / systemd timer — weekly is sufficient; standards change slowly. First sync can take several minutes; subsequent runs that find no upstream changes exit within seconds.
All settings are controlled via environment variables with the SCHOLAR_MCP_ prefix.
| Variable | Default | Description |
|---|---|---|
| `SCHOLAR_MCP_S2_API_KEY` | -- | Semantic Scholar API key (request one); optional but recommended for higher rate limits |
| `SCHOLAR_MCP_READ_ONLY` | true | If true, write-tagged tools (fetch_paper_pdf, convert_pdf_to_markdown, fetch_and_convert, fetch_pdf_by_url, fetch_patent_pdf) are hidden |
| `SCHOLAR_MCP_CACHE_DIR` | /data/scholar-mcp | Directory for the SQLite cache database and downloaded PDFs |
| `SCHOLAR_MCP_CONTACT_EMAIL` | -- | Included in the OpenAlex User-Agent for polite pool access (faster rate limits); also enables Unpaywall PDF lookups |
| `FASTMCP_LOG_LEVEL` | INFO | Logging level (DEBUG, INFO, WARNING, ERROR). Controls all output (app + middleware). The -v CLI flag sets this to DEBUG. |
| `FASTMCP_ENABLE_RICH_LOGGING` | true | Set false for plain/JSON-structured log output (e.g. for log aggregators) |
| Variable | Default | Description |
|---|---|---|
| `SCHOLAR_MCP_DOCLING_URL` | -- | Base URL of a running docling-serve instance (e.g. http://localhost:5001) |
| `SCHOLAR_MCP_VLM_API_URL` | -- | OpenAI-compatible VLM endpoint for formula/figure-enriched PDF conversion |
| `SCHOLAR_MCP_VLM_API_KEY` | -- | API key for the VLM endpoint |
| `SCHOLAR_MCP_VLM_MODEL` | gpt-4o | Model name for VLM-enriched conversion |
| Variable | Default | Description |
|---|---|---|
| `SCHOLAR_MCP_EPO_CONSUMER_KEY` | -- | EPO OPS consumer key (register at developers.epo.org); both key and secret must be set for patent tools to appear |
| `SCHOLAR_MCP_EPO_CONSUMER_SECRET` | -- | EPO OPS consumer secret |
| Variable | Default | Description |
|---|---|---|
| `SCHOLAR_MCP_GOOGLE_BOOKS_API_KEY` | -- | Google Books API key for higher rate limits (1000 req/day without key) |
| Variable | Default | Description |
|---|---|---|
| `SCHOLAR_GITHUB_TOKEN` | -- | GitHub personal access token for Relaton sync; lifts unauthenticated GitHub rate limit from 60/hr to 5,000/hr (no scopes required for public-repo reads). Useful for repeated --force testing; daily cron is fine unauthenticated. |
| Variable | Default | Description |
|---|---|---|
| `SCHOLAR_MCP_BEARER_TOKEN` | -- | Static bearer token for HTTP transport authentication |
| `SCHOLAR_MCP_BASE_URL` | -- | Public base URL, required for OIDC (e.g. https://mcp.example.com) |
| `SCHOLAR_MCP_OIDC_CONFIG_URL` | -- | OIDC discovery endpoint URL |
| `SCHOLAR_MCP_OIDC_CLIENT_ID` | -- | OIDC client ID |
| `SCHOLAR_MCP_OIDC_CLIENT_SECRET` | -- | OIDC client secret |
| `SCHOLAR_MCP_OIDC_JWT_SIGNING_KEY` | -- | JWT signing key; required on Linux/Docker to survive restarts (openssl rand -hex 32) |
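The `openssl rand -hex 32` hint in the table produces 32 random bytes printed as 64 hexadecimal characters. Where OpenSSL is not at hand, Python's standard library can produce a value of the same shape. This is a minimal sketch; the helper name `make_signing_key` is hypothetical and not part of scholar-mcp.

```python
import secrets


def make_signing_key(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically strong randomness as a hex string.

    Hypothetical helper: mirrors the shape of `openssl rand -hex 32`.
    """
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Print a value that could be assigned to SCHOLAR_MCP_OIDC_JWT_SIGNING_KEY.
    print(make_signing_key())
```

Generate the key once and store it with the rest of the deployment's secrets; regenerating it on every start would defeat the purpose of surviving restarts.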
28 tools, organised by scholarly source type.
| Tool | Description |
|---|---|
| `search_papers` | Full-text search with year, venue, field-of-study, and citation-count filters. Returns up to 100 results with pagination. |
| `get_paper` | Fetch full metadata for a single paper by DOI, S2 ID, arXiv ID, ACM ID, or PubMed ID. |
| `get_author` | Fetch author profile with publications, or search by name. |
| Tool | Description |
|---|---|
| `get_citations` | Forward citations (papers that cite a given paper) with optional filters. |
| `get_references` | Backward references (papers cited by a given paper). |
| `get_citation_graph` | BFS traversal from seed papers, returning nodes + edges up to configurable depth. |
| `find_bridge_papers` | Shortest citation path between two papers. |
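The two graph tools in this table rest on breadth-first search. The sketch below illustrates the idea on an in-memory mapping and is not the server's implementation: `refs`, `citation_graph`, and `bridge_path` are hypothetical names, and it assumes edges are followed in the reference direction only (the real tools may treat direction differently).

```python
from collections import deque


def citation_graph(refs, seeds, max_depth):
    """Breadth-first traversal from seeds; returns (nodes, edges) within max_depth hops.

    refs maps a paper ID to the list of paper IDs it references (hypothetical data).
    """
    nodes = set(seeds)
    edges = []
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        paper, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand papers sitting at the depth limit
        for cited in refs.get(paper, []):
            edges.append((paper, cited))
            if cited not in nodes:
                nodes.add(cited)
                queue.append((cited, depth + 1))
    return nodes, edges


def bridge_path(refs, start, goal):
    """Shortest path from start to goal along reference edges, or None if unreachable."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        paper = queue.popleft()
        if paper == goal:
            path = []
            while paper is not None:  # walk parent links back to start
                path.append(paper)
                paper = parents[paper]
            return path[::-1]
        for cited in refs.get(paper, []):
            if cited not in parents:
                parents[cited] = paper
                queue.append(cited)
    return None


if __name__ == "__main__":
    refs = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
    print(citation_graph(refs, ["A"], 1))
    print(bridge_path(refs, "A", "E"))
```

Because breadth-first search visits papers in order of hop count, the first time the goal is dequeued the recorded parent chain is a shortest path.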
| Tool | Description |
|---|---|
| `recommend_papers` | Paper recommendations from 1--5 positive examples and optional negative examples. |
| `generate_citations` | Generate BibTeX, CSL-JSON, or RIS citations for up to 100 papers, with automatic entry type inference and optional OpenAlex venue enrichment. |
| `enrich_paper` | Augment Semantic Scholar metadata with OpenAlex fields (affiliations, funders, OA status, concepts). |
| Tool | Description |
|---|---|
| `search_patents` | Search patents across 100+ patent offices via EPO OPS with CPC / applicant / inventor / jurisdiction / date filters. |
| `get_patent` | Fetch bibliographic / claims / description / family / legal / citations sections for a single patent by publication number. Citations include NPL-to-paper resolution via Semantic Scholar. |
| `get_citing_patents` | Find patents that cite a given academic paper (best-effort; EPO OPS citation search coverage is incomplete). |
| `fetch_patent_pdf` | Download a patent PDF via authenticated EPO OPS and optionally convert to Markdown. |
Patent tools are hidden unless both SCHOLAR_MCP_EPO_CONSUMER_KEY and SCHOLAR_MCP_EPO_CONSUMER_SECRET are set. fetch_patent_pdf is also write-tagged and hidden when SCHOLAR_MCP_READ_ONLY=true.
| Tool | Description |
|---|---|
| `search_books` | Search for books by title, author, ISBN, or keywords via Open Library. Returns up to 50 results. |
| `get_book` | Fetch book metadata by ISBN-10, ISBN-13, Open Library work ID, or edition ID. Optionally download and cache the cover image locally. |
| `get_book_excerpt` | Fetch a book excerpt and description from Google Books by ISBN. Shows preview availability and link. |
| `recommend_books` | Recommend books for a subject via Open Library, sorted by popularity. |
Papers with an ISBN in their externalIds are automatically enriched with book_metadata (publisher, edition, cover URL, subjects, and more) from Open Library when fetched via get_paper, get_citations, get_references, or get_citation_graph. Book records also include worldcat_url (when ISBN-13 is present), google_books_url, and snippet from Google Books enrichment. Cover images can be downloaded and cached locally via get_book.
| Tool | Description |
|---|---|
| `resolve_standard_identifier` | Normalise a messy citation string (e.g. "rfc9000", "nist 800-53") to canonical form and body. |
| `search_standards` | Search standards by identifier, title, or free text, optionally filtered to one body (NIST, IETF, W3C, ETSI). |
| `get_standard` | Retrieve a standard by canonical or fuzzy identifier, optionally fetching and converting the full text via docling. |
Tier-1 bodies (NIST, IETF, W3C, ETSI) are supported with full metadata and optional full-text conversion. Tier-2 bodies (ISO, IEC, IEEE, CC, CEN/CENELEC) are populated locally via scholar-mcp sync-standards.
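To make the idea of normalising a messy citation string concrete, here is a small sketch covering only the two example inputs from the table. Everything in it is an assumption for illustration: the regular expressions, the function name `normalise_standard`, and the output spellings `RFC 9000` and `NIST SP 800-53` are not taken from the server, whose canonical forms may differ.

```python
import re

# Hypothetical patterns: (regex, issuing body, canonical template).
_PATTERNS = [
    (re.compile(r"^rfc[\s\-]*(\d+)$", re.IGNORECASE), "IETF", "RFC {0}"),
    (
        re.compile(r"^nist(?:[\s\-]*sp)?[\s\-]*(\d+-\d+[a-z]?)$", re.IGNORECASE),
        "NIST",
        "NIST SP {0}",
    ),
]


def normalise_standard(raw):
    """Return (canonical_identifier, body), or None when no pattern matches.

    The canonical spellings are assumed for illustration only.
    """
    text = " ".join(raw.split())  # trim and collapse runs of whitespace
    for pattern, body, template in _PATTERNS:
        match = pattern.match(text)
        if match:
            return template.format(match.group(1).upper()), body
    return None


if __name__ == "__main__":
    for sample in ("rfc9000", "nist 800-53", "ISO 9001"):
        print(sample, "->", normalise_standard(sample))
```

A table of patterns keeps each body's quirks in one place, so supporting another body in this sketch would mean adding a row rather than another branch.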
| Tool | Description |
|---|---|
| `batch_resolve` | Resolve up to 100 mixed identifiers (paper DOIs, patent numbers, ISBNs) to full metadata in one call, routing each to the right backend with OpenAlex fallback. |
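Routing mixed identifiers starts with deciding what kind each one is. The sketch below shows one plausible way to do that with simple shape checks. The regular expressions and the names `classify` and `route` are hypothetical; the server's actual detection rules are not documented here and may be stricter (for example, by validating ISBN check digits).

```python
import re

# Assumed shapes, for illustration only.
_DOI = re.compile(r"^10\.\d{4,9}/\S+$")
_PATENT = re.compile(r"^[A-Z]{2}\d+(?:[A-Z]\d?)?$")  # e.g. EP1234567B1


def _is_isbn(text):
    """True for 10- or 13-character ISBN shapes (check digit not verified)."""
    digits = text.replace("-", "").replace(" ", "")
    if len(digits) == 13:
        return digits.isdigit()
    if len(digits) == 10:
        return digits[:9].isdigit() and (digits[9].isdigit() or digits[9] in "Xx")
    return False


def classify(identifier):
    """Label an identifier as 'paper', 'book', 'patent', or 'unknown'."""
    text = identifier.strip()
    if _DOI.match(text):
        return "paper"
    if _is_isbn(text):
        return "book"
    if _PATENT.match(text.upper().replace(" ", "")):
        return "patent"
    return "unknown"


def route(identifiers):
    """Group identifiers by detected kind, preserving input order within each group."""
    buckets = {}
    for identifier in identifiers:
        buckets.setdefault(classify(identifier), []).append(identifier)
    return buckets


if __name__ == "__main__":
    print(route(["10.1145/3292500.3330919", "978-0-13-468599-1", "EP1234567B1"]))
```

Checking the DOI shape first avoids misreading a DOI suffix as something else, and anything left in the "unknown" bucket is a natural candidate for a fallback lookup.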
| Tool | Description |
|---|---|
| `fetch_paper_pdf` | Download PDF for a paper (S2 open-access, then ArXiv/PMC/Unpaywall fallback). |
| `convert_pdf_to_markdown` | Convert a local PDF to Markdown via docling-serve. |
| `fetch_and_convert` | Full pipeline: fetch PDF (with fallback), convert to Markdown, return both. |
| `fetch_pdf_by_url` | Download a PDF from any URL and optionally convert to Markdown. |
PDF tools are write-tagged and hidden when SCHOLAR_MCP_READ_ONLY=true (the default). fetch_patent_pdf (above) and the get_standard full-text mode cover the patent and standards equivalents.
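The visibility rules stated in the notes on patent and PDF tools can be summarised as a small predicate. This is a sketch of the documented behaviour and not the server's code: `is_visible` is a hypothetical name, and the way the read-only flag is parsed (anything other than "false", "0", or "no" counts as read-only) is an assumption.

```python
# Tool names are taken from the tables in this README.
WRITE_TOOLS = {
    "fetch_paper_pdf",
    "convert_pdf_to_markdown",
    "fetch_and_convert",
    "fetch_pdf_by_url",
    "fetch_patent_pdf",
}
PATENT_TOOLS = {"search_patents", "get_patent", "get_citing_patents", "fetch_patent_pdf"}


def is_visible(tool, env):
    """Return True if a tool would be exposed under the given environment mapping."""
    # Assumed parsing: read-only is the default unless explicitly switched off.
    flag = env.get("SCHOLAR_MCP_READ_ONLY", "true").strip().lower()
    read_only = flag not in {"false", "0", "no"}
    # Patent tools need both EPO credentials.
    has_epo = bool(env.get("SCHOLAR_MCP_EPO_CONSUMER_KEY")) and bool(
        env.get("SCHOLAR_MCP_EPO_CONSUMER_SECRET")
    )
    if tool in PATENT_TOOLS and not has_epo:
        return False
    if tool in WRITE_TOOLS and read_only:
        return False
    return True


if __name__ == "__main__":
    import os

    for name in sorted(WRITE_TOOLS | PATENT_TOOLS):
        print(name, is_visible(name, os.environ))
```

Note that fetch_patent_pdf sits in both sets, so it only appears when the EPO credentials are present and read-only mode is off.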
| Tool | Description |
|---|---|
| `get_task_result` | Poll for the result of a background task by ID. |
| `list_tasks` | List all active background tasks. |
Long-running operations (PDF download/conversion) and rate-limited backend requests return {"queued": true, "task_id": "..."} immediately. Use get_task_result to poll for the result.
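On the client side, the queued-task pattern amounts to a polling loop. The sketch below shows one way to write it under several assumptions: `call_tool(name, arguments)` is a hypothetical synchronous callable standing in for whatever MCP client is in use, the argument name `task_id` for get_task_result is assumed, and a still-pending poll is assumed to return the same queued marker.

```python
import time


def _is_queued(response):
    """True if the response looks like the queued marker described above."""
    return isinstance(response, dict) and bool(response.get("queued"))


def wait_for_result(call_tool, first_response, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll get_task_result until the task finishes or the timeout elapses.

    call_tool is a hypothetical callable: call_tool(tool_name, arguments) -> response.
    """
    if not _is_queued(first_response):
        return first_response  # the tool answered directly, nothing to poll
    task_id = first_response["task_id"]
    deadline = time.monotonic() + timeout
    while True:
        sleep(interval)
        response = call_tool("get_task_result", {"task_id": task_id})
        if not _is_queued(response):
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still queued after {timeout} seconds")


if __name__ == "__main__":
    # Demonstration with a stand-in client that finishes on the second poll.
    replies = iter([{"queued": True, "task_id": "demo"}, {"status": "done"}])
    print(wait_for_result(lambda name, args: next(replies),
                          {"queued": True, "task_id": "demo"}, interval=0.01))
```

Passing `sleep` as a parameter keeps the loop easy to test, and a bounded timeout avoids waiting forever on a task that never completes.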
services:
  scholar-mcp:
    image: ghcr.io/pvliesdonk/scholar-mcp:latest
    restart: unless-stopped
    environment:
      SCHOLAR_MCP_S2_API_KEY: "${SCHOLAR_MCP_S2_API_KEY}"
      SCHOLAR_MCP_DOCLING_URL: "http://docling-serve:5001"
      SCHOLAR_MCP_VLM_API_URL: "${VLM_API_URL:-}"
      SCHOLAR_MCP_VLM_API_KEY: "${VLM_API_KEY:-}"
      SCHOLAR_MCP_CACHE_DIR: "/data/scholar-mcp"
      SCHOLAR_MCP_READ_ONLY: "false"
    volumes:
      - scholar-mcp-data:/data/scholar-mcp
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.scholar-mcp.rule=Host(`scholar-mcp.yourdomain.com`)"
  docling-serve:
    image: ghcr.io/ds4sd/docling-serve:latest
    restart: unless-stopped
volumes:
  scholar-mcp-data:
# Show cache statistics (row counts, database size)
scholar-mcp cache stats
# Clear all cached data (preserves identifier aliases)
scholar-mcp cache clear
# Remove entries older than 30 days
scholar-mcp cache clear --older-than 30
# Override cache directory
scholar-mcp cache stats --cache-dir /path/to/cache
# Install with dev and MCP dependencies
uv sync --extra dev --extra mcp
# Run tests
uv run pytest
# Lint and format
uv run ruff check src/ tests/
uv run ruff format src/ tests/
# Type check
uv run mypy src/
# Build docs locally
uv sync --extra docs
uv run mkdocs serve
MIT