MCP server for academic papers, citations, authors, ECOS dossiers, and Federal Register text.
Add this to your MCP configuration file:
```json
{
  "mcpServers": {
    "io-github-joshuasundance-swca-paper-chaser-mcp": {
      "args": [
        "paper-chaser-mcp"
      ],
      "command": "uvx"
    }
  }
}
```
Release status: The repository, CLI, Docker image metadata, and public MCP identity are now aligned on
paper-chaser-mcp. GHCR images and GitHub Release assets are the primary public distribution channels; PyPI remains intentionally gated until account recovery and trusted-publisher setup are complete.
An MCP server for academic research — search papers, chase citations, look up authors, repair broken references, explore species dossiers, and retrieve regulatory text, all from one FastMCP server that AI assistants can call directly.
Providers: Semantic Scholar · arXiv · OpenAlex · CORE · SerpApi Google Scholar (opt-in, paid) · Crossref · Unpaywall · ECOS · FederalRegister.gov · GovInfo
Paper Chaser MCP is now guided-first: the default public surface is designed to be hard to misuse and explicit about trust.
- research handles discovery, known-item recovery, citation repair, and regulatory routing in one trust-graded path, with a server-owned quality-first policy for guided use.
- follow_up_research answers against one saved searchSessionId; if you omit it, the server only infers a session when the choice is unique. Saved-session follow-up can classify mixed source sets into on-topic evidence, weaker context, and off-target leads when the stored metadata is already sufficient.
- Guided responses carry executionProvenance, and ambiguous follow-up or source-inspection flows return structured sessionResolution / sourceResolution payloads instead of opaque errors.
- resolve_reference handles DOI/arXiv/URL, citation fragments, and regulatory-style references; exact DOI/arXiv/paper-URL inputs resolve as exact anchors rather than falling through to fuzzy repair. Ambiguous title-only or conflicting metadata matches can now return multiple_candidates or needs_disambiguation; treat those as candidate anchors, not citation-ready resolutions.
- research leads with a short recommendation-first summary, while keeping the structured evidence, leads, and provenance fields available below it.
- inspect_source exposes one sourceId with provenance, trust state, weak-match rationale, and quality-aware direct-read next steps; an omitted searchSessionId is only accepted when one compatible saved session exists.
- get_runtime_status surfaces the active profile/transport and provider-state warnings without requiring low-level diagnostics. configuredSmartProvider is the configured smart bundle; activeSmartProvider is the latest effective execution path. Cold-start snapshots emit an explicit provisional warning instead of claiming deterministic fallback before the first smart call settles, and the top-level provider sets now split disabledProviderSet, suppressedProviderSet, degradedProviderSet, and quotaLimitedProviderSet instead of collapsing them.

Use PAPER_CHASER_TOOL_PROFILE to choose the advertised surface:
| Profile | Default | Exposed surface | Intended user |
|---|---|---|---|
| guided | yes | research, follow_up_research, resolve_reference, inspect_source, get_runtime_status | Low-context users and agents |
| expert | no | Guided tools plus raw/provider-specific families (search_papers*, smart graph tools, regulatory direct tools, full diagnostics), subject to enabled features and disabled-tool visibility | Power users and operator workflows |
Practical default: PAPER_CHASER_TOOL_PROFILE=guided with
PAPER_CHASER_HIDE_DISABLED_TOOLS=true.
If you want the fastest local path, install from source and add the server to your MCP client in stdio mode:
```shell
pip install -e .
```
```json
{
  "mcpServers": {
    "paper-chaser": {
      "command": "python",
      "args": ["-m", "paper_chaser_mcp"],
      "env": {
        "PAPER_CHASER_TOOL_PROFILE": "guided",
        "PAPER_CHASER_HIDE_DISABLED_TOOLS": "true",
        "PAPER_CHASER_ENABLE_SEMANTIC_SCHOLAR": "true",
        "PAPER_CHASER_ENABLE_ARXIV": "true",
        "PAPER_CHASER_ENABLE_CORE": "false"
      }
    }
  }
}
```
Then start with one of these prompts in your MCP client:
- "Research retrieval-augmented generation for coding agents and return only trustworthy findings."
- "Use my last searchSessionId to answer one grounded follow-up question."
- "Resolve this citation fragment: Vaswani et al. 2017 Attention Is All You Need."
- "Research the regulatory history of California condor under 50 CFR 17.95."

If you want a local env template for shell runs or Docker Compose, copy .env.example to .env and fill in only the providers you use.
| Goal | Start here |
|---|---|
| Discovery, literature review, or regulatory history | research |
| Grounded follow-up over saved results | follow_up_research |
| Citation/DOI/arXiv/URL/reference cleanup | resolve_reference |
| Audit one returned source before relying on it | inspect_source |
| Explain environment/runtime differences | get_runtime_status |
| Need direct provider control or specialized pagination | switch to expert profile and use raw/provider-specific tools |
research(query="retrieval-augmented generation for coding agents", limit=5)
→ inspect resultStatus, answerability, summary, evidence, leads, routingSummary
→ if resultStatus=needs_disambiguation with clarification.reason=underspecified_reference_fragment:
tighten the anchor or pivot to resolve_reference instead of forcing retrieval
→ save searchSessionId for follow-up or source inspection
follow_up_research(searchSessionId="...", question="What evaluation tradeoffs show up here?")
→ inspect answerStatus
→ if answered: use answer + evidence (compact default: sources are identified by selectedEvidenceIds)
→ if abstained/insufficient_evidence: use nextActions and inspect_source
→ mixed saved sessions can still answer relevance-triage questions such as which items are on-topic vs off-target
→ uniquely anchored recommendation asks can also return a safe start-here answer plus topRecommendation
→ if you omit searchSessionId and multiple saved sessions exist: provide it explicitly
→ for full source records pass responseMode="standard"; for diagnostics responseMode="debug"
→ for selection asks ("where should I start?", "most recent?"), read topRecommendation
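The answerStatus branching above reduces to a small client-side gate. A minimal sketch, assuming the response is a plain dict with the fields documented in this README; handle_follow_up is an illustrative helper, not part of the package:

```python
# Client-side gating sketch for follow_up_research output (assumed dict shape).
def handle_follow_up(result: dict) -> str:
    status = result.get("answerStatus")
    if status == "answered":
        # In compact mode, sources are identified by selectedEvidenceIds.
        ids = result.get("selectedEvidenceIds", [])
        return f"answered with {len(ids)} evidence id(s)"
    if status in ("abstained", "insufficient_evidence"):
        # Safety signal: follow nextActions instead of inventing synthesis.
        actions = result.get("nextActions", [])
        return f"no grounded answer; {len(actions)} suggested next action(s)"
    return f"unhandled answerStatus: {status}"
```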
resolve_reference(reference="10.1038/nrn3241")
→ exact DOI/arXiv/paper URL should resolve directly when supported
resolve_reference(reference="Rockstrom et al planetary boundaries 2009 Nature 461 472")
→ inspect status and bestMatch/alternatives
→ only treat bestMatch as citation-ready when status=resolved
→ if status=multiple_candidates or needs_disambiguation: pick a candidate or add a stronger author/year/venue clue before citing it
→ if resolved: run research with the resolved anchor
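The status gate above can be sketched in a few lines. The response shape (status, bestMatch) is assumed from the fields named here, and pick_anchor is an illustrative helper rather than the package's client API:

```python
# Only a resolved status makes bestMatch citation-ready (assumed dict shape).
def pick_anchor(result: dict):
    if result.get("status") == "resolved":
        return result.get("bestMatch")
    # multiple_candidates / needs_disambiguation yield candidate anchors only:
    # pick a candidate or add author/year/venue clues before citing anything.
    return None
```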
inspect_source(searchSessionId="...", evidenceId="...")
→ inspect verificationStatus, topicalRelevance, whyClassifiedAsWeakMatch, confidenceSignals, canonicalUrl, directReadRecommendations
→ if searchSessionId is omitted and inference is ambiguous, rerun with an explicit saved session id
When research.resultStatus is abstained or needs_disambiguation, do not invent synthesis. Narrow with a concrete anchor: DOI, exact title, species name, agency, year, or venue.

When research returns needs_disambiguation with clarification.reason=underspecified_reference_fragment, the server is intentionally stopping before speculative retrieval on a vague citation/reference fragment. Tighten the clue set or switch to resolve_reference.

When follow_up_research.answerStatus is abstained or insufficient_evidence, treat it as a safety signal. Use inspect_source and rerun research with tighter scope.

With PAPER_CHASER_TOOL_PROFILE=expert:
→ search_papers_smart / ask_result_set / map_research_landscape / expand_research_graph
→ search_papers / search_papers_bulk and provider-specific families
→ search_federal_register / get_federal_register_document / get_cfr_text for direct regulatory primary-source control
For expert smart tools, deep is the default quality-first mode. Use
balanced only when lower latency matters enough to justify a narrower pass,
and reserve fast for smoke tests or debugging.
Guided research no longer accepts a public latencyProfile knob. The server
owns that policy and currently applies a deep-backed quality-first path with
one bounded review escalation when the first pass is too weak.
Treat these as the main guided contracts:
| Field or pattern | Where it appears | What to do with it |
|---|---|---|
| resultStatus | research | succeeded, partial, needs_disambiguation, abstained, failed |
| answerability | research, follow_up_research | grounded, limited, insufficient |
| evidence | research, follow_up_research | Canonical grounded source records for inspection and citation |
| leads | research, follow_up_research, expert smart tools | Review weak, filtered, or off-topic leads without promoting them into grounded evidence |
| evidenceGaps | research, follow_up_research | Treat as explicit limits on the current answer, not hidden caveats |
| routingSummary | research, follow_up_research | Check intent, anchor, provider plan, regulatory subtype or entity card when present, and why the result is partial |
| coverageSummary | research, follow_up_research | Check provider coverage and completeness before relying on synthesis |
| executionProvenance | guided tools | Inspect which server policy, latency defaults, and fallback path produced the result |
| confidenceSignals | research, follow_up_research, inspect_source | Inspect additive trust cues such as evidence quality, synthesis mode, and source-scope labels without replacing answerability |
| evidenceUsePlan | follow_up_research | For synthesis-style follow-ups, inspect answer subtype, directly responsive evidence ids, unsupported parts, and retrieval sufficiency before trusting the answer |
| sessionResolution | follow_up_research, inspect_source | Use when a session was inferred, repaired, missing, or ambiguous |
| sourceResolution | inspect_source | Use when the requested source id was matched, unresolved, or needs a retry with available ids |
| abstentionDetails | guided tools on weak evidence | Treat as the actionable reason and recovery hint for abstention or insufficient evidence |
| nextActions | guided tools | Treat as the server-preferred recovery path on weak evidence |
| clarification | research | Ask the user only when a bounded clarification request is provided |
| answerStatus | follow_up_research | answered, abstained, insufficient_evidence. Grounded answered requires an on-topic verified source + qa-readable text + non-deterministic provider + medium+ confidence; otherwise expect insufficient_evidence. |
| topRecommendation | follow_up_research (comparative/selection asks) | Structured pick with sourceId, recommendationReason, comparativeAxis (e.g. beginner_friendly, recency, authority). Uniquely anchored "where should I start?" asks can safely answer through this path even when broader synthesis would stay limited. |
| responseMode | follow_up_research input | compact (default, hides full sources and legacy fields), standard, debug |
| includeLegacyFields | follow_up_research input | Set true to restore legacy verifiedFindings/unverifiedLeads in compact mode |
| fullTextUrlFound / bodyTextEmbedded / qaReadableText | inspect_source | Distinguish URL discovery, embedded body text, and text actually available to QA synthesis. fullTextObserved may still appear as a compatibility alias, but the split fields are the durable contract. |
| evidenceId | evidence[*] | Pass to inspect_source for per-source provenance checks |
| runtimeSummary | get_runtime_status and expert diagnostics | Confirm effective profile, smart provider state, and warnings |
For broad agency-guidance discovery, guided routing stays on the
regulatory primary-source path. Off-topic authority documents may still appear
as leads, but they should not displace more relevant query-anchored guidance
or policy documents from the top-level recommendation.
For source-level audits, treat whyClassifiedAsWeakMatch and
confidenceSignals.sourceScopeLabel / confidenceSignals.sourceScopeReason as
the primary explanation of why an authoritative record was retained as a weak
match or off-topic lead.
Additional trust and grounding signals landed in the llm-guidance phase-4
wave. Guided responses can expose confidenceSignals.evidenceQualityProfile,
confidenceSignals.synthesisMode, confidenceSignals.evidenceProfileDetail,
confidenceSignals.synthesisPath, confidenceSignals.trustRevisionNarrative,
and a trustSummary.authoritativeButWeak bucket for primary-source records
that are authoritative but not topically responsive. searchStrategy may
surface regulatoryIntent, intentFamily, a subjectCard for species and
regulatory grounding, and subjectChainGaps describing missing subject-chain
evidence. inspect_source pairs each direct-read suggestion with a
directReadRecommendationDetails entry shaped as
{trustLevel, whyRecommended, cautions} so agents can prioritize direct reads
by quality. See Paper Chaser Golden Paths and
Guided And Smart Robustness Notes for how
to read and act on these signals.
Session export is intentionally deferred in this wave. The planned future shape is
export_search_session(searchSessionId, format) with format in ris,
bibtex, or csv, using the guided-v2 source/citation schema so export can land
without another public-contract rewrite.
If you previously used the smart/raw-first surface directly:
- Use research instead of search_papers_smart or search_papers.
- Use follow_up_research instead of ask_result_set for default grounded QA.
- Use resolve_reference instead of resolve_citation/search_papers_match as your first known-item recovery step.
- Raw and provider-specific tools now require PAPER_CHASER_TOOL_PROFILE=expert.
- Stop passing latencyProfile to guided research; the server now owns that policy internally.
- For expert smart tools, deep is now the default. Choose balanced explicitly when you want the lower-latency fallback.
- Read the new structured fields: executionProvenance, sessionResolution, sourceResolution, and abstentionDetails.

For the detailed breaking-change note, see Guided Reset Migration Note.
Current distribution options:
v* tags build wheel and sdist artifacts and attach them to a draft GitHub Release for review.

For local source installs:

```shell
pip install -e .
```
Optional extras for the additive AI layer:
```shell
pip install -e ".[ai]"
pip install -e ".[ai,openai]"
pip install -e ".[ai,huggingface]"
pip install -e ".[ai,nvidia]"
pip install -e ".[ai,anthropic]"
pip install -e ".[ai,google]"
pip install -e ".[ai,mistral]"
pip install -e ".[eval-foundry]"
pip install -e ".[eval-huggingface]"
pip install -e ".[eval]"
```

Add ai-faiss to any of the commands above if you want the optional FAISS backend. Azure OpenAI uses the same openai extra.
Hugging Face uses a dedicated huggingface extra that installs the OpenAI-compatible SDK plus the LangChain OpenAI adapter; this repo documents it as a chat-only smart-provider path with embeddings disabled.
The eval publishing helpers use separate extras on purpose: eval-foundry is for Azure AI Foundry dataset upload support, and eval-huggingface is for Hugging Face dataset-repo or bucket publishing support. Those extras are independent from the smart-provider chat runtime.
The full local environment-variable contract lives in .env.example. That file mirrors the public local knobs supported by docker-compose.yaml. Azure-specific identifiers, secrets, and Bicep parameters are intentionally documented separately in docs/azure-deployment.md.
Use stdio transport for desktop MCP clients unless you specifically need HTTP. See the Quick start JSON example above for the server definition.
Claude Desktop config locations:
- macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- Windows: %APPDATA%\Claude\claude_desktop_config.json

Guided mode starts from research. The brokered and provider-specific controls below are expert-path controls.
| Area | Default | Main variables | Notes |
|---|---|---|---|
| Tool profile | guided | PAPER_CHASER_TOOL_PROFILE | guided exposes the 5 low-context tools; expert exposes the broader raw/provider-specific surface, subject to enabled features and PAPER_CHASER_HIDE_DISABLED_TOOLS |
| Guided policy | quality-first | PAPER_CHASER_GUIDED_RESEARCH_LATENCY_PROFILE, PAPER_CHASER_GUIDED_FOLLOW_UP_LATENCY_PROFILE, PAPER_CHASER_GUIDED_ALLOW_PAID_PROVIDERS, PAPER_CHASER_GUIDED_ESCALATION_ENABLED, PAPER_CHASER_GUIDED_ESCALATION_MAX_PASSES, PAPER_CHASER_GUIDED_ESCALATION_ALLOW_PAID_PROVIDERS | Guided research / follow_up_research use these server-owned defaults instead of honoring client latencyProfile knobs |
| Search broker | semantic_scholar,arxiv,core,serpapi_google_scholar | PAPER_CHASER_ENABLE_SEMANTIC_SCHOLAR, PAPER_CHASER_ENABLE_ARXIV, PAPER_CHASER_ENABLE_CORE, PAPER_CHASER_ENABLE_SERPAPI, PAPER_CHASER_PROVIDER_ORDER | SerpApi is opt-in and paid; CORE is off by default |
| OpenAlex tool family | enabled | PAPER_CHASER_ENABLE_OPENALEX, OPENALEX_API_KEY, OPENALEX_MAILTO | Explicit tool family, not a default broker hop |
| ScholarAPI tool family | disabled | PAPER_CHASER_ENABLE_SCHOLARAPI, SCHOLARAPI_API_KEY | Explicit discovery, monitoring, full-text, and PDF family; also available as an opt-in broker target via preferredProvider or providerOrder. ScholarAPI-sourced paper results now include a separate contentAccess block for access/full-text metadata. |
| Enrichment | enabled | PAPER_CHASER_ENABLE_CROSSREF, CROSSREF_MAILTO, CROSSREF_TIMEOUT_SECONDS, PAPER_CHASER_ENABLE_UNPAYWALL, UNPAYWALL_EMAIL, UNPAYWALL_TIMEOUT_SECONDS, PAPER_CHASER_ENABLE_OPENALEX | Used after you already have a paper or DOI |
| ECOS | enabled | PAPER_CHASER_ENABLE_ECOS, ECOS_BASE_URL, ECOS_TIMEOUT_SECONDS, document timeout and size vars, TLS vars | Species and document workflows |
| Federal Register / GovInfo | enabled | PAPER_CHASER_ENABLE_FEDERAL_REGISTER, PAPER_CHASER_ENABLE_GOVINFO_CFR, GOVINFO_API_KEY, GovInfo timeout and size vars | Federal Register search is keyless; authoritative CFR retrieval uses GovInfo |
| Smart layer | disabled | OPENAI_API_KEY, OPENROUTER_API_KEY, OPENROUTER_BASE_URL, OPENROUTER_HTTP_REFERER, OPENROUTER_TITLE, HUGGINGFACE_API_KEY, HUGGINGFACE_BASE_URL, NVIDIA_API_KEY, NVIDIA_NIM_BASE_URL, AZURE_OPENAI_API_KEY, AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_VERSION, ANTHROPIC_API_KEY, GOOGLE_API_KEY, MISTRAL_API_KEY, PAPER_CHASER_ENABLE_AGENTIC, model and index vars | Additive only; supports openai, azure-openai, anthropic, nvidia, google, mistral, huggingface, openrouter, and deterministic. OpenAI ships with checked-in model defaults, Anthropic, NVIDIA, Google, and Mistral auto-swap to provider defaults when those OpenAI defaults are left untouched, and Azure OpenAI can override both roles with deployment names. Hugging Face and OpenRouter are documented as OpenAI-compatible chat routers configured with HUGGINGFACE_BASE_URL and OPENROUTER_BASE_URL; both remain chat-only in this repo and do not enable embeddings. OpenRouter preserves explicit planner and synthesis model names such as provider-prefixed model IDs. NVIDIA_NIM_BASE_URL is optional for self-hosted NIMs; leave it empty for hosted NVIDIA API Catalog access. Embeddings remain disabled by default because they have been unreliable in this codebase, and improving them is out of scope for the current release. When ScholarAPI is enabled, smart discovery can also route through it and cap it via providerBudget.maxScholarApiCalls. |
| Hide disabled tools | guided default true, expert default false | PAPER_CHASER_HIDE_DISABLED_TOOLS | Guided mode keeps this on to reduce dead-end tool picks; expert mode usually leaves it off for operator visibility |
These are the effective planner/synthesis defaults when you enable PAPER_CHASER_ENABLE_AGENTIC=true and do not intentionally override the model vars.
| PAPER_CHASER_AGENTIC_PROVIDER | Default planner | Default synthesis | Resolution rule |
|---|---|---|---|
| openai | gpt-5.4-mini | gpt-5.4 | Uses the checked-in PAPER_CHASER_PLANNER_MODEL and PAPER_CHASER_SYNTHESIS_MODEL defaults directly |
| azure-openai | gpt-5.4-mini | gpt-5.4 | Uses the same model vars unless AZURE_OPENAI_PLANNER_DEPLOYMENT or AZURE_OPENAI_SYNTHESIS_DEPLOYMENT is set; when present, those deployment names win |
| anthropic | claude-haiku-4-5 | claude-sonnet-4-6 | Runtime swaps to these provider defaults only when planner/synthesis are still set to the checked-in OpenAI defaults |
| nvidia | nvidia/nemotron-3-nano-30b-a3b | nvidia/nemotron-3-super-120b-a12b | Runtime swaps to these provider defaults only when planner/synthesis are still set to the checked-in OpenAI defaults |
| google | gemini-2.5-flash | gemini-2.5-pro | Runtime swaps to these provider defaults only when planner/synthesis are still set to the checked-in OpenAI defaults |
| mistral | mistral-medium-latest | mistral-large-latest | Runtime swaps to these provider defaults only when planner/synthesis are still set to the checked-in OpenAI defaults |
| huggingface | moonshotai/Kimi-K2.5 | moonshotai/Kimi-K2.5 | Runtime swaps to these provider defaults only when planner/synthesis are still set to the checked-in OpenAI defaults; requests are sent to HUGGINGFACE_BASE_URL and the path remains chat-only |
| openrouter | none | none | Runtime preserves explicit planner/synthesis model values and sends requests to OPENROUTER_BASE_URL; the first-pass path remains chat-only |
| deterministic | n/a | n/a | No external LLM calls; model selection metadata is reported as deterministic instead |
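The resolution rule in the table reduces to one comparison: swap in provider defaults only when both model vars are still the checked-in OpenAI defaults. A minimal sketch under that assumption (constant and function names are illustrative, not the package's internals, and only a subset of providers is shown):

```python
# Illustrative model-resolution sketch; mirrors the table's rule, not real code.
OPENAI_DEFAULTS = {"planner": "gpt-5.4-mini", "synthesis": "gpt-5.4"}
PROVIDER_DEFAULTS = {
    "anthropic": {"planner": "claude-haiku-4-5", "synthesis": "claude-sonnet-4-6"},
    "google": {"planner": "gemini-2.5-flash", "synthesis": "gemini-2.5-pro"},
    "mistral": {"planner": "mistral-medium-latest", "synthesis": "mistral-large-latest"},
}

def resolve_models(provider: str, planner: str, synthesis: str) -> dict:
    untouched = (planner == OPENAI_DEFAULTS["planner"]
                 and synthesis == OPENAI_DEFAULTS["synthesis"])
    if untouched and provider in PROVIDER_DEFAULTS:
        # Both vars left at the checked-in OpenAI defaults: swap to provider defaults.
        return PROVIDER_DEFAULTS[provider]
    # Any explicit override wins and is preserved as-is.
    return {"planner": planner, "synthesis": synthesis}
```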
PAPER_CHASER_EMBEDDING_MODEL defaults to text-embedding-3-large, but embeddings stay off until you set PAPER_CHASER_DISABLE_EMBEDDINGS=false. They remain off by default because embeddings have been unreliable in this codebase and improving them is outside the scope of the current guided-policy release. In the current provider surface, embeddings are only used by providers that explicitly support them, which means the documented Hugging Face path remains chat-only even though it uses an OpenAI-compatible router.
Recommended baseline: enable Semantic Scholar, OpenAlex, Crossref, and Unpaywall for general scholarly workflows; enable ScholarAPI when you want explicit full-text or PDF retrieval; keep SerpApi opt-in because it is a paid recall-recovery path.
Broker rules that matter most:
- preferredProvider, providerOrder, and PAPER_CHASER_PROVIDER_ORDER accept core, semantic_scholar, arxiv, scholarapi, and serpapi or serpapi_google_scholar.
- publicationDateOrYear, fieldsOfStudy, publicationTypes, openAccessPdf, and minCitationCount can force the broker to skip incompatible providers.
- Responses surface brokerMetadata.providerUsed, brokerMetadata.attemptedProviders, and brokerMetadata.recommendedPaginationTool so agents can follow the right next step.

| Mode | Default | Main variables | Use when |
|---|---|---|---|
| Desktop stdio | stdio | none required | Claude Desktop, Cursor, local MCP subprocess launches |
| Direct HTTP run | opt in | PAPER_CHASER_TRANSPORT, PAPER_CHASER_HTTP_HOST, PAPER_CHASER_HTTP_PORT, PAPER_CHASER_HTTP_PATH | Local integration testing without the deployment wrapper |
| HTTP wrapper | opt in | PAPER_CHASER_HTTP_AUTH_TOKEN, PAPER_CHASER_HTTP_AUTH_HEADER, PAPER_CHASER_ALLOWED_ORIGINS | Local parity with hosted HTTP deployments |
| Docker Compose publish settings | localhost defaults | PAPER_CHASER_PUBLISHED_HOST, PAPER_CHASER_PUBLISHED_PORT | Control the host-side HTTP port mapping only |
Key distinctions:
- PAPER_CHASER_HTTP_HOST and PAPER_CHASER_HTTP_PORT control the direct shell and hosted deployments. Docker Compose keeps the container bind at 0.0.0.0:8080 and uses PAPER_CHASER_PUBLISHED_HOST / PAPER_CHASER_PUBLISHED_PORT for the host-side mapping.
- paper-chaser-mcp deployment-http runs the deployment wrapper used by Compose and Azure. It adds /healthz plus optional auth and Origin enforcement in front of the MCP endpoint.

Example direct local HTTP run:
```shell
PAPER_CHASER_TRANSPORT=streamable-http \
PAPER_CHASER_HTTP_HOST=127.0.0.1 \
PAPER_CHASER_HTTP_PORT=8000 \
python -m paper_chaser_mcp
```
If you need the full Azure deployment story, including the bootstrap and full workflow modes, read docs/azure-deployment.md, docs/azure-architecture.md, and docs/azure-security-model.md.
For local MCP clients that launch servers as subprocesses, use the image in
stdio mode. For unpublished local iteration, build and run paper-chaser-mcp:local.
For the reusable public package, use the published GHCR tag:
```shell
docker run --rm -i ghcr.io/joshuasundance-swca/paper-chaser-mcp:latest
```
For a locally built image:
```shell
docker run --rm -i paper-chaser-mcp:local
```
A Docker-backed MCP client entry typically looks like:
```json
{
  "mcpServers": {
    "paper-chaser": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "ghcr.io/joshuasundance-swca/paper-chaser-mcp:latest"]
    }
  }
}
```
This mode is ideal for local desktop MCP usage because the host launches and owns the server process lifecycle.
The repo also ships server.json so the public OCI image and MCP package
metadata stay aligned for registry/discovery tooling. The public-package
workflow is tag-driven for GHCR: a v* tag publishes the reusable container
image to ghcr.io/joshuasundance-swca/paper-chaser-mcp. MCP Registry
publication is intentionally decoupled into a separate manual workflow so GHCR
shipping does not depend on registry availability.
Python package publishing is prepared separately in
.github/workflows/publish-pypi.yml: pull requests build and twine check the
distribution, and the actual publish jobs stay dormant until the repository
variable ENABLE_PYPI_PUBLISHING is set to true. After PyPI/TestPyPI access
is restored and the trusted publishers are registered, manual dispatch can
publish to TestPyPI and a v* tag can publish to PyPI.
GitHub Release assets are handled separately in
.github/workflows/publish-github-release.yml: a v* tag or manual dispatch
builds wheel and sdist artifacts, verifies them with twine check, generates
SHA256SUMS, and uploads them to a draft GitHub Release page so Python
artifacts can be reviewed before broader public promotion.
For local HTTP testing, MCP Inspector, or bridge-style integrations, this repo
ships docker-compose.yaml with localhost-only defaults. Compose
explicitly starts the deployment-http subcommand, so HTTP wrapper behavior
does not depend on the image's default transport.
Compose keeps the container bind host and internal port fixed at
0.0.0.0:8080 and overrides the app default transport to streamable-http,
so browser tools and bridge-style clients can connect over
http://127.0.0.1:8000/mcp without extra shell flags. The compose file
exposes the user-facing knobs: transport, MCP path, provider keys, provider
toggles, auth, and the published host port mapping.
Copy .env.example to .env, then run:

```shell
docker compose -f docker-compose.yaml up --build
```
The service listens on http://127.0.0.1:8000 by default, serves
/healthz for probes, and exposes MCP over http://127.0.0.1:8000/mcp.
```shell
curl http://127.0.0.1:8000/healthz
```
If you set PAPER_CHASER_HTTP_AUTH_TOKEN and leave
PAPER_CHASER_HTTP_AUTH_HEADER=authorization, the deployment wrapper expects
Authorization: Bearer <token> on /mcp. The checked-in Azure scaffold
overrides the header name to x-backend-auth and has API Management inject
that header for backend-only traffic. The published host defaults to
127.0.0.1; only change PAPER_CHASER_PUBLISHED_HOST when you intentionally
want the container reachable beyond the local machine.
If you leave the provider key fields blank, local clients still work. The server falls back to the free/default provider paths where supported, and SerpApi stays disabled by default.
For browser-based debugging without installing Node locally, use the dedicated Inspector stack:
```shell
docker compose -f compose.inspector.yaml up --build
```
This stack keeps Inspector separate from the MCP server image and binds the UI and proxy to localhost only:
- Inspector UI: http://127.0.0.1:6274
- Inspector proxy: http://127.0.0.1:6277

Inspector proxy authentication remains enabled by default. Use
docker compose -f compose.inspector.yaml logs mcp-inspector to read the
session token that Inspector prints on startup.
Inside Inspector, connect using Streamable HTTP and set:
- URL: http://paper-chaser-mcp:8080/mcp
- Transport: streamable-http

compose.inspector.yaml accepts IMAGE overrides, so you can test a specific
tag without editing files:

```shell
IMAGE=ghcr.io/joshuasundance-swca/paper-chaser-mcp:latest docker compose -f compose.inspector.yaml up
```
Full tool reference. See the Quick tool decision guide above for where to start.
| Tool | Description |
|---|---|
research | Default trust-graded entrypoint for discovery, known-item recovery, citation repair, and regulatory routing. |
follow_up_research | Grounded follow-up over a saved searchSessionId; returns explicit abstention/insufficient-evidence states when needed. |
resolve_reference | Resolve citation-like input (citation, DOI, arXiv, URL, title fragment, regulatory reference) into the safest next anchor. |
inspect_source | Inspect one sourceId from a guided result set for provenance, trust state, and direct-read follow-through. |
get_runtime_status | Guided runtime summary for active profile, transport, smart-provider state, and warnings. |
These tools are expert profile paths for deeper orchestration and provider control.
| Tool | Description |
|---|---|
search_papers_smart | Concept-level discovery with query expansion, multi-provider fusion, reranking, reusable searchSessionId, and an evidence-first expert contract (resultStatus, answerability, routingSummary, evidence, leads, evidenceGaps, structuredSources, coverageSummary, failureSummary). Legacy trust fields remain available as compatibility views. In auto mode it can also route clearly regulatory asks into a primary-source timeline. latencyProfile defaults to deep for highest-quality expert work; use balanced for lower latency and reserve fast for smoke tests. Optional providerBudget remains available for advanced clients. |
ask_result_set | Grounded QA, claim checks, and comparisons over a saved searchSessionId. |
map_research_landscape | Cluster a saved result set into themes, gaps, disagreements, and next-search suggestions. |
expand_research_graph | Expand paper anchors or a saved session into a citation/reference/author graph with frontier ranking. |
| Tool | Description |
|---|---|
search_papers | Brokered single-page search (default: Semantic Scholar → arXiv → CORE → SerpApi). Read brokerMetadata.nextStepHint; ScholarAPI is also available as an explicit opt-in broker target. |
search_papers_bulk | Paginated bulk search (Semantic Scholar) up to 1,000 papers/call with boolean query syntax. |
search_papers_semantic_scholar | Single-page Semantic Scholar-only search with full filter support. |
search_papers_arxiv | Single-page arXiv-only search. |
search_papers_core | Single-page CORE-only search. |
search_papers_serpapi | Single-page SerpApi Google Scholar search. Requires SerpApi. |
search_papers_scholarapi | Single-page ScholarAPI relevance-ranked search. Requires ScholarAPI. |
search_papers_openalex | Single-page OpenAlex-only search. |
search_papers_openalex_bulk | Cursor-paginated OpenAlex search. |
list_papers_scholarapi | Cursor-paginated ScholarAPI monitoring/list flow sorted by indexed_at. |
search_papers_openalex_by_entity | OpenAlex works constrained to one source, institution, or topic entity ID. |
| Tool | Description |
|---|---|
resolve_citation | Citation-repair workflow for incomplete or malformed references. Abstains on regulatory references. |
search_papers_match | Known-item lookup for messy or partial titles with cross-provider confirmation. |
get_paper_details | Lookup by DOI, arXiv ID, Semantic Scholar ID, or URL. Optional includeEnrichment. |
get_paper_details_openalex | OpenAlex work lookup by W-id, URL, or DOI with abstract reconstruction. |
paper_autocomplete | Paper title typeahead completions. |
paper_autocomplete_openalex | OpenAlex work typeahead for known-item disambiguation. |
| Tool | Description |
|---|---|
get_paper_citations | Papers that cite the given paper (Semantic Scholar). Cursor-paginated. |
get_paper_citations_openalex | OpenAlex cited-by expansion. Cursor-paginated. |
get_paper_references | References behind the given paper (Semantic Scholar). Cursor-paginated. |
get_paper_references_openalex | OpenAlex backward-reference expansion. Cursor-paginated. |
get_paper_authors | Authors of the given paper (Semantic Scholar). |
search_authors | Search authors by name (Semantic Scholar). |
search_authors_openalex | Search OpenAlex authors by name. |
get_author_info | Author profile by Semantic Scholar author ID. |
get_author_info_openalex | OpenAlex author profile by A-id or URL. |
get_author_papers | Papers by Semantic Scholar author. Cursor-paginated. |
get_author_papers_openalex | Papers by OpenAlex author with year filter and cursor pagination. |
batch_get_papers | Details for up to 500 paper IDs in one call. |
batch_get_authors | Details for up to 1,000 author IDs in one call. |
get_paper_recommendations | Similar papers by single seed (GET). |
get_paper_recommendations_post | Similar papers from positive/negative seed sets (POST). |
| Tool | Description |
|---|---|
enrich_paper | Combined Crossref + Unpaywall + OpenAlex enrichment for one known paper or DOI. Query-only calls without an anchor abstain instead of resolving a paper. |
get_paper_metadata_crossref | Explicit Crossref enrichment for a known paper or DOI. |
get_paper_open_access_unpaywall | Unpaywall OA status, PDF URL, and license lookup by DOI. Requires UNPAYWALL_EMAIL. |
| Tool | Description |
|---|---|
get_paper_text_scholarapi | Fetch one ScholarAPI plain-text full document by ScholarAPI paper id. |
get_paper_texts_scholarapi | Batch full-text retrieval for up to 100 ScholarAPI paper ids. Preserves order and null placeholders. |
get_paper_pdf_scholarapi | Fetch one ScholarAPI PDF as structured metadata plus base64-encoded content. |
| Tool | Description |
|---|---|
search_entities_openalex | Search OpenAlex source, institution, or topic entities for pivot workflows. |