Server data from the Official MCP Registry
Intelligent web content fetcher MCP server that converts HTML to clean, AI-readable JSONL format
Valid MCP server (2 strong, 1 medium validity signals). No known CVEs in dependencies. ⚠️ Package registry links to a different repository than scanned source. Imported from the Official MCP Registry. Trust signals: trusted author (16/16 approved); 3 highly-trusted packages. 1 finding(s) downgraded by scanner intelligence.
3 files analyzed · 1 issue found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-j0hanz-superfetch": {
"args": [
"-y",
"@j0hanz/fetch-url-mcp"
],
"command": "npx"
}
}
}
From the project's GitHub README.
An MCP server that fetches web pages and converts them to clean, readable Markdown.
This server takes a URL, fetches the page, and strips away everything you don't need (navigation, sidebars, banners, scripts), leaving just the main content as Markdown. The result is suited to LLM input: the page's main content without the surrounding noise. It also recognizes GitHub, GitLab, Bitbucket, and Gist URLs and rewrites them to fetch the raw content directly.
By default it runs over stdio. Pass --http if you need a proper HTTP endpoint with auth, rate limiting, TLS, and session support.
- … title, url, contentSize, and truncated.
- … notifications/progress.
- … internal://instructions resource and a get-help prompt so clients know how to use it.

A browser-based client is available if you want to use the server without any MCP setup.
Add this to your MCP client config:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
Add to .vscode/mcp.json:
{
"servers": {
"fetch-url-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
Or install via CLI:
code --add-mcp '{"name":"fetch-url-mcp","command":"npx","args":["-y","@j0hanz/fetch-url-mcp@latest"]}'
For more info, see VS Code MCP docs.
Add to .vscode/mcp.json:
{
"servers": {
"fetch-url-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
Or install via CLI:
code-insiders --add-mcp '{"name":"fetch-url-mcp","command":"npx","args":["-y","@j0hanz/fetch-url-mcp@latest"]}'
For more info, see VS Code Insiders MCP docs.
Add to ~/.cursor/mcp.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see Cursor MCP docs.
For solution-scoped setup, add this to .mcp.json at the solution root:
{
"servers": {
"fetch-url-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see Visual Studio MCP docs.
Add to ~/.config/goose/config.yaml on macOS/Linux or %APPDATA%\Block\goose\config\config.yaml on Windows:
extensions:
  fetch-url-mcp:
    name: fetch-url-mcp
    cmd: npx
    args: ['-y', '@j0hanz/fetch-url-mcp@latest']
    enabled: true
    type: stdio
    timeout: 300
For more info, see Goose extension docs.
Add to ~/.lmstudio/mcp.json on macOS/Linux or %USERPROFILE%/.lmstudio/mcp.json on Windows:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see LM Studio MCP docs.
Add to claude_desktop_config.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see Claude Desktop MCP docs.
Use the CLI:
claude mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
For project-scoped config, Claude Code writes .mcp.json with:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"],
"env": {}
}
}
}
For more info, see Claude Code MCP docs.
Add to ~/.codeium/windsurf/mcp_config.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see Windsurf MCP docs.
Add to ~/.config/amp/settings.json on macOS/Linux, %USERPROFILE%\.config\amp\settings.json on Windows, or .amp/settings.json for workspace-scoped config:
{
"amp.mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
Or install via CLI:
amp mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
For more info, see Amp docs.
Open the MCP Servers panel, choose Configure MCP Servers, and add this to cline_mcp_settings.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see Cline MCP docs.
Use the CLI:
codex mcp add fetch-url-mcp -- npx -y @j0hanz/fetch-url-mcp@latest
Or add this to ~/.codex/config.toml or project-scoped .codex/config.toml:
[mcp_servers.fetch-url-mcp]
command = "npx"
args = ["-y", "@j0hanz/fetch-url-mcp@latest"]
For more info, see Codex MCP docs.
Add to .vscode/mcp.json:
{
"servers": {
"fetch-url-mcp": {
"type": "stdio",
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see GitHub Copilot MCP docs.
Open Personal > MCP Servers in Warp, choose + Add, and either add a CLI server with:
command: npx
args: ["-y", "@j0hanz/fetch-url-mcp@latest"]
Or paste this JSON snippet when using Warp's multi-server import flow:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see Warp MCP docs.
Use Kiro's MCP Servers panel or the Add to Kiro install flow. Kiro stores workspace-scoped MCP config in .kiro/settings/mcp.json and user-scoped config in ~/.kiro/settings/mcp.json.
For this server, use:
command: npx
args: ["-y", "@j0hanz/fetch-url-mcp@latest"]
For more info, see Kiro MCP docs.
Add to ~/.gemini/settings.json:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see Gemini CLI MCP docs.
Add to ~/.config/zed/settings.json:
{
"context_servers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"],
"env": {}
}
}
}
For more info, see Zed MCP docs.
Use the Augment Settings panel and either add the server manually or choose Import from JSON:
{
"mcpServers": {
"fetch-url-mcp": {
"command": "npx",
"args": ["-y", "@j0hanz/fetch-url-mcp@latest"]
}
}
}
For more info, see Augment MCP docs.
Use Roo Code's MCP Servers UI or marketplace flow.
For this server, use:
command: npx
args: ["-y", "@j0hanz/fetch-url-mcp@latest"]
For more info, see Roo Code docs.
Use Kilo Code's MCP Servers UI or marketplace flow.
For this server, use:
command: npx
args: ["-y", "@j0hanz/fetch-url-mcp@latest"]
For more info, see Kilo Code docs.
tasks/get exposes the latest task summary fields such as statusMessage, createdAt, lastUpdatedAt, ttl, and pollInterval.
[MCP Client]
├─ stdio -> `src/index.ts` -> `startStdioServer()` -> `createMcpServer()`
└─ HTTP (`--http`) -> `src/index.ts` -> `startHttpServer()` -> HTTP dispatcher
   ├─ `GET /health`
   ├─ `GET /.well-known/oauth-protected-resource`
   ├─ `GET /.well-known/oauth-protected-resource/mcp`
   └─ `POST|GET|DELETE /mcp`
`createMcpServer()`
├─ registers tool: `fetch-url`
├─ registers prompt: `get-help`
├─ registers resources:
│ - `internal://instructions`
├─ enables capabilities: logging, resources, prompts, tasks
└─ installs task handlers, log-level handling, and shutdown cleanup
`fetch-url` execution
├─ validate input with `fetchUrlInputSchema`
├─ normalize URL and block local/private targets unless allowed
├─ rewrite supported code-host URLs to raw endpoints when possible
├─ fetch content via the shared pipeline
├─ transform HTML into Markdown in the transform worker path
└─ validate `structuredContent` with `fetchUrlOutputSchema`
[Client] -- initialize {protocolVersion, capabilities} --> [Server]
[Server] -- {protocolVersion, capabilities, serverInfo} --> [Client]
[Client] -- notifications/initialized --> [Server]
[Client] -- tools/call {name, arguments} --> [Server]
[Server] -- {content: [{type, text}], structuredContent?, isError?} --> [Client]
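The message exchange above can be sketched as plain JSON-RPC 2.0 objects. This is only an illustration of the wire shapes, not code from the project: the builder function names are hypothetical, the `clientInfo` values are placeholders, and the protocol version string is just an example of what a client might propose.

```typescript
// Minimal JSON-RPC 2.0 shapes for the handshake and tool call shown above.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: Record<string, unknown>;
}

// The client proposes a protocol version and announces its capabilities.
function buildInitialize(id: number, protocolVersion: string): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion,
      capabilities: {},
      clientInfo: { name: "example-client", version: "0.0.0" }, // placeholder values
    },
  };
}

// Sent after the server's initialize result arrives; notifications carry no id.
function buildInitialized(): JsonRpcNotification {
  return { jsonrpc: "2.0", method: "notifications/initialized" };
}

// Call the fetch-url tool with its single required argument.
function buildFetchUrlCall(id: number, url: string): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch-url", arguments: { url } },
  };
}

const messages = [
  buildInitialize(1, "2025-06-18"), // example version; use what your client negotiates
  buildInitialized(),
  buildFetchUrlCall(2, "https://example.com/docs"),
];
console.log(JSON.stringify(messages, null, 2));
```

In practice an MCP client SDK builds these messages for you; writing them out by hand is mainly useful when debugging traffic.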
fetch-url
Takes a URL and returns Markdown. Read-only, with no JavaScript execution. Supports running as a background MCP task for large or slow pages. In task mode, tasks/get and tasks/list expose task summaries including status, statusMessage, createdAt, lastUpdatedAt, ttl, and pollInterval; numeric progress remains out-of-band via notifications/progress.
| Parameter | Type | Required | Description |
|---|---|---|---|
| url | string | yes | Target URL. Max 2048 chars. |
You get text content back by default. If output validation passes, the response also includes structuredContent with typed fields: url, resolvedUrl, finalUrl, title, metadata, markdown, fetchedAt, contentSize, and truncated. A true value for truncated means the content hit a server-side size limit.
To opt into progress updates, include _meta.progressToken in the tool call. The token may be a string or number, and the server may emit monotonic notifications/progress updates while the fetch runs.
To run the tool in task mode, include params.task = { ttl?: <ms> }. tasks/result waits until the task reaches a terminal status and then returns the stored output or a terminal error payload for cancelled or failed tasks. Task summaries and final results include _meta["io.modelcontextprotocol/related-task"] = { "taskId": "<server-task-id>" }.
{
"method": "tools/call",
"params": {
"name": "fetch-url",
"arguments": {
"url": "https://example.com/docs"
},
"task": {
"ttl": 300000,
"pollInterval": 1000
},
"_meta": {
"progressToken": 7
}
}
}
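For clients that poll tasks/get instead of blocking on tasks/result, a loop along these lines may help. It is a sketch under assumptions: the set of status strings is assumed rather than taken from this README, the clamp bounds are arbitrary, and `getTask` is a hypothetical callback standing in for whatever MCP client sends the tasks/get request.

```typescript
// Status values are an assumption; the README only names "cancelled" and
// "failed" explicitly as terminal outcomes.
type TaskStatus = "working" | "input_required" | "completed" | "failed" | "cancelled";

interface TaskSummary {
  taskId: string;
  status: TaskStatus;
  statusMessage?: string;
  pollInterval?: number; // milliseconds, as suggested by the server
}

// A task is finished once it can no longer change state.
function isTerminal(status: TaskStatus): boolean {
  return status === "completed" || status === "failed" || status === "cancelled";
}

// Prefer the server-suggested interval, clamped to an arbitrary 100 ms..30 s range.
function nextDelayMs(task: TaskSummary, fallbackMs = 1000): number {
  const suggested = task.pollInterval ?? fallbackMs;
  return Math.min(Math.max(suggested, 100), 30_000);
}

// Poll until the task is terminal. `getTask` is a hypothetical callback that
// issues tasks/get and resolves with the returned summary.
async function waitForTask(
  taskId: string,
  getTask: (taskId: string) => Promise<TaskSummary>,
): Promise<TaskSummary> {
  for (;;) {
    const task = await getTask(taskId);
    if (isTerminal(task.status)) return task;
    await new Promise((resolve) => setTimeout(resolve, nextDelayMs(task)));
  }
}
```

Once `waitForTask` resolves, tasks/result returns the stored output or the terminal error payload.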
1. [Client] -- tools/call {name: "fetch-url", arguments} --> [Server]
2. [Server] -- dispatch("fetch-url") --> [src/tools/fetch-url.ts]
3. [Handler] -- validate(fetchUrlInputSchema) --> normalize / fetch / transform
4. [Handler] -- validate(fetchUrlOutputSchema) --> assemble content + structuredContent
5. [Server] -- result or tool error --> [Client]
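The normalize step in this flow includes rewriting code-host URLs to raw endpoints. The sketch below shows the idea for GitHub "blob" URLs only. It is not the server's actual implementation: `toRawGitHubUrl` is a hypothetical name, and the logic assumes the ref contains no slash (a branch such as `feature/x` would need extra handling).

```typescript
// Rewrite a GitHub "blob" page URL to its raw-content equivalent.
// Returns the input unchanged when the URL is not a recognizable blob URL.
function toRawGitHubUrl(input: string): string {
  const url = new URL(input);
  if (url.hostname !== "github.com") return input;

  // Expected path shape: /<owner>/<repo>/blob/<ref>/<path...>
  const [owner, repo, kind, ref, ...rest] = url.pathname.split("/").filter(Boolean);
  if (!owner || !repo || kind !== "blob" || !ref || rest.length === 0) return input;

  return `https://raw.githubusercontent.com/${owner}/${repo}/${ref}/${rest.join("/")}`;
}

console.log(toRawGitHubUrl("https://github.com/owner/repo/blob/main/docs/README.md"));
// prints "https://raw.githubusercontent.com/owner/repo/main/docs/README.md"
```

Fetching the raw endpoint skips the HTML page chrome entirely, so the transform step has less noise to remove.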
| Resource | URI | MIME Type | Description |
|---|---|---|---|
| fetch-url-mcp-instructions | internal://instructions | text/markdown | Guidance for using the Fetch URL MCP server. |
| Prompt | Arguments | Description |
|---|---|---|
| get-help | topic? | Return Fetch URL server instructions: workflows, task mode, and error handling. Optional values: capabilities, workflows, constraints, errors. |
| Capability | Status | Notes |
|---|---|---|
| completions | confirmed | Advertised in createServerCapabilities(). |
| logging | confirmed | Advertised in createServerCapabilities(). |
| resources | confirmed | Static instruction resource is registered during server startup. Subscription and list-changed support are not advertised. |
| prompts | confirmed | get-help is registered during server startup. |
| tasks | confirmed | Advertised in createServerCapabilities() and backed by registered task handlers plus optional tool task support. |
| progress notifications | confirmed | Opt-in via _meta.progressToken. Tool execution reports monotonic notifications/progress updates during fetch and transform stages. |
| task status updates | confirmed | notifications/tasks/status is emitted when task status changes and TASKS_STATUS_NOTIFICATIONS=true. |
| Annotation | Value |
|---|---|
| readOnlyHint | true |
| destructiveHint | false |
| idempotentHint | true |
| openWorldHint | true |
The tool declares an outputSchema and includes structuredContent in the response when validation passes. Clients that support structured output get typed data directly; the rest use the text fallback.
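A client might handle the structured and text paths roughly as follows. This is a sketch, not the project's code: the `ToolResult` shape is trimmed to the fields used here, `extractMarkdown` and `firstText` are hypothetical helpers, and it assumes the text fallback carries the Markdown itself.

```typescript
// Trimmed-down tool result; only the fields this sketch reads.
interface ToolResult {
  content: Array<{ type: string; text?: string }>;
  structuredContent?: Record<string, unknown>;
  isError?: boolean;
}

// Return the text of the first text block, or an empty string if there is none.
function firstText(result: ToolResult): string {
  const block = result.content.find(
    (item) => item.type === "text" && typeof item.text === "string",
  );
  return block?.text ?? "";
}

// Prefer typed structuredContent; fall back to the plain text block.
function extractMarkdown(result: ToolResult): { markdown: string; truncated: boolean } {
  if (result.isError) {
    throw new Error(firstText(result) || "fetch-url reported an error");
  }
  const structured = result.structuredContent;
  if (structured && typeof structured.markdown === "string") {
    return { markdown: structured.markdown, truncated: structured.truncated === true };
  }
  // The text fallback has no truncation flag, so report false here.
  return { markdown: firstText(result), truncated: false };
}
```

Checking `truncated` lets a caller decide whether to warn the user or raise MAX_INLINE_CONTENT_CHARS on the server.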
All configuration is through environment variables. For basic stdio usage, nothing needs to be set.
| Variable | Default | Notes |
|---|---|---|
| HOST | 127.0.0.1 | Bind address. Non-loopback bindings also require ALLOW_REMOTE=true. |
| PORT | 3000 | Listening port for --http. |
| ALLOW_REMOTE | false | Must be enabled to bind to a non-loopback interface. |
| ALLOWED_HOSTS | empty | Additional allowed Host and Origin values. |
| SERVER_MAX_CONNECTIONS | 0 | Optional connection cap. |
| SERVER_TRUST_PROXY | false | Trust X-Forwarded-For / Forwarded for client IP resolution. |
| SERVER_HEADERS_TIMEOUT_MS | unset | Optional Node server tuning. |
| SERVER_REQUEST_TIMEOUT_MS | unset | Optional Node server tuning. |
| SERVER_KEEP_ALIVE_TIMEOUT_MS | unset | Optional keep-alive tuning. |
| SERVER_KEEP_ALIVE_TIMEOUT_BUFFER_MS | unset | Optional keep-alive tuning buffer. |
| SERVER_MAX_HEADERS_COUNT | unset | Optional header count limit. |
| SERVER_BLOCK_PRIVATE_CONNECTIONS | false | Enables inbound private-network protections when not trusting a proxy. |
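The interplay between HOST and ALLOW_REMOTE can be sketched as a small validation step. The function names, the loopback test, and the treatment of the literal string `true` as the only enabling value are all assumptions for illustration; the server's real parsing may differ.

```typescript
interface HttpBindConfig {
  host: string;
  port: number;
  allowRemote: boolean;
}

// Loopback check for common literal forms only (assumption: no DNS lookup).
function isLoopbackHost(host: string): boolean {
  return host === "localhost" || host === "::1" || /^127\.\d+\.\d+\.\d+$/.test(host);
}

// Read the bind settings with the documented defaults and enforce the
// "non-loopback requires ALLOW_REMOTE=true" rule from the table above.
function readHttpBindConfig(env: Record<string, string | undefined>): HttpBindConfig {
  const host = env.HOST ?? "127.0.0.1";
  const port = Number(env.PORT ?? "3000");
  const allowRemote = env.ALLOW_REMOTE === "true";

  if (!Number.isInteger(port) || port < 0 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  if (!isLoopbackHost(host) && !allowRemote) {
    throw new Error(`HOST=${host} is not loopback; set ALLOW_REMOTE=true to bind it`);
  }
  return { host, port, allowRemote };
}

console.log(readHttpBindConfig({})); // defaults: 127.0.0.1, 3000, allowRemote false
```

Failing fast at startup keeps a misconfigured server from silently listening on a public interface.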
| Variable | Default | Notes |
|---|---|---|
| ACCESS_TOKENS | unset | Comma- or space-separated static bearer tokens. |
| API_KEY | unset | Alternate static token source for header auth. |
| OAUTH_ISSUER_URL | unset | Enables OAuth mode when combined with the other OAuth URLs. |
| OAUTH_AUTHORIZATION_URL | unset | Optional explicit authorization endpoint. |
| OAUTH_TOKEN_URL | unset | Optional explicit token endpoint. |
| OAUTH_REVOCATION_URL | unset | Optional OAuth revocation endpoint. |
| OAUTH_REGISTRATION_URL | unset | Optional OAuth dynamic client registration endpoint. |
| OAUTH_INTROSPECTION_URL | unset | Required for OAuth token introspection. |
| OAUTH_REQUIRED_SCOPES | empty | Required scopes enforced after auth. |
| OAUTH_CLIENT_ID | unset | Optional introspection client ID. |
| OAUTH_CLIENT_SECRET | unset | Optional introspection client secret. |
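To make the "comma- or space-separated" format of ACCESS_TOKENS concrete, here is one way such a value could be parsed and checked against an Authorization header. The helper names are hypothetical and this is not the server's code; a production check would also want a constant-time comparison.

```typescript
// Split on any run of commas and/or whitespace, dropping empty entries.
function parseAccessTokens(raw: string | undefined): string[] {
  return (raw ?? "").split(/[\s,]+/).filter((token) => token.length > 0);
}

// Pull the token out of an "Authorization: Bearer <token>" header value.
function extractBearer(header: string | undefined): string | null {
  const match = /^Bearer\s+(\S+)$/i.exec((header ?? "").trim());
  return match?.[1] ?? null;
}

// True when the header carries one of the configured static tokens.
function isAuthorized(header: string | undefined, rawTokens: string | undefined): boolean {
  const token = extractBearer(header);
  return token !== null && parseAccessTokens(rawTokens).includes(token);
}

console.log(parseAccessTokens("alpha, beta  gamma")); // [ 'alpha', 'beta', 'gamma' ]
```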
| Variable | Default | Notes |
|---|---|---|
| SERVER_TLS_KEY_FILE | unset | Enable HTTPS when set together with SERVER_TLS_CERT_FILE. |
| SERVER_TLS_CERT_FILE | unset | TLS certificate path. |
| SERVER_TLS_CA_FILE | unset | Optional custom CA bundle. |
| Variable | Default | Notes |
|---|---|---|
| ALLOW_LOCAL_FETCH | false | Allows loopback and private-network fetch targets. |
| FETCH_TIMEOUT_MS | 15000 | Network fetch timeout in milliseconds. |
| USER_AGENT | fetch-url-mcp/<version> | Override the outbound user agent string. |
| Variable | Default | Notes |
|---|---|---|
| MAX_INLINE_CONTENT_CHARS | 0 | 0 means no explicit inline truncation limit. |
| Variable | Default | Notes |
|---|---|---|
| TASKS_MAX_TOTAL | 5000 | Total retained task capacity, including completed/cancelled tasks until they expire. |
| TASKS_MAX_PER_OWNER | 1000 | Per-owner retained task cap, clamped to the total cap. |
| TASKS_STATUS_NOTIFICATIONS | false | Enables status notifications for tasks. |
| TASKS_REQUIRE_INTERCEPTION | true | Requires interception for task-capable tool execution. |
| Variable | Default | Notes |
|---|---|---|
| TRANSFORM_CANCEL_ACK_TIMEOUT_MS | 200 | Cancellation acknowledgement timeout. |
| TRANSFORM_WORKER_MODE | threads | Worker execution mode. |
| TRANSFORM_WORKER_MAX_OLD_GENERATION_MB | unset | Optional worker memory limit. |
| TRANSFORM_WORKER_MAX_YOUNG_GENERATION_MB | unset | Optional worker memory limit. |
| TRANSFORM_WORKER_CODE_RANGE_MB | unset | Optional worker memory limit. |
| TRANSFORM_WORKER_STACK_MB | unset | Optional worker stack size. |
| Variable | Default | Notes |
|---|---|---|
| FETCH_URL_MCP_EXTRA_NOISE_TOKENS | empty | Extra noise-removal tokens. |
| FETCH_URL_MCP_EXTRA_NOISE_SELECTORS | empty | Extra DOM selectors for noise removal. |
| FETCH_URL_MCP_LOCALE | system default | Locale override for extraction heuristics. |
| MARKDOWN_HEADING_KEYWORDS | built-in list | Override heading keywords used by cleanup. |
| Variable | Default | Notes |
|---|---|---|
| LOG_LEVEL | info | debug, info, warn, or error. |
| LOG_FORMAT | text | Set to json for structured logs. |
| Method | Path | Auth | Purpose |
|---|---|---|---|
| GET | /health | no, unless ?verbose=1 on a remote server | Basic health response, with optional diagnostics. |
| GET | /.well-known/oauth-protected-resource | no | OAuth protected-resource metadata. |
| GET | /.well-known/oauth-protected-resource/mcp | no | OAuth protected-resource metadata for the MCP endpoint. |
| POST | /mcp | yes | Session initialization and JSON-RPC requests. |
| GET | /mcp | yes | Session-bound server-to-client stream handling. |
| DELETE | /mcp | yes | Session shutdown. |
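A request to POST /mcp could be assembled roughly like this. The helper name is hypothetical. The `Accept` value and the `Mcp-Session-Id` header follow the MCP Streamable HTTP transport as I understand it, not anything this README states, so treat them as assumptions; `MCP-Protocol-Version` is the header the server is described as validating.

```typescript
interface McpPostOptions {
  token: string;            // static bearer token or OAuth access token
  protocolVersion?: string; // negotiated version, sent after initialization
  sessionId?: string;       // session identifier returned by the server, if any
}

interface McpPostInit {
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the fetch() init object for a JSON-RPC message sent to POST /mcp.
function buildMcpPost(message: unknown, options: McpPostOptions): McpPostInit {
  const headers: Record<string, string> = {
    "Content-Type": "application/json",
    Accept: "application/json, text/event-stream",
    Authorization: `Bearer ${options.token}`,
  };
  if (options.protocolVersion) headers["MCP-Protocol-Version"] = options.protocolVersion;
  if (options.sessionId) headers["Mcp-Session-Id"] = options.sessionId;
  return { method: "POST", headers, body: JSON.stringify(message) };
}

// Usage (not executed here):
//   await fetch("http://127.0.0.1:3000/mcp", buildMcpPost(request, { token }));
const init = buildMcpPost({ jsonrpc: "2.0", id: 1, method: "ping" }, { token: "example-token" });
console.log(init.headers);
```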
| Control | Status | Notes |
|---|---|---|
| Host and origin validation | implemented | HTTP requests are rejected unless Host and Origin match the allowlist built from loopback, the configured host, and ALLOWED_HOSTS. |
| Authentication | implemented | HTTP mode supports static bearer tokens locally or OAuth token introspection; remote bindings require OAuth. |
| Protocol version checks | implemented | Session-bound MCP HTTP requests validate MCP-Protocol-Version and pin it to the negotiated session version. |
| Rate limiting | implemented | Requests pass through the HTTP rate limiter before route dispatch. Enable SERVER_TRUST_PROXY=true behind a trusted reverse proxy so limits apply to the forwarded client IP instead of the proxy hop. |
| Outbound SSRF protections | implemented | Local/private IPs, metadata endpoints, and .local/.internal hosts are blocked unless ALLOW_LOCAL_FETCH=true. |
| TLS | optional | HTTPS is enabled when both TLS key and certificate files are configured. |
| Stdio logging safety | implemented | Server logs are written to stderr, not stdout, so stdio MCP traffic stays clean. |
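The outbound SSRF rule in the table can be illustrated with a hostname check like the one below. It is a deliberately simplified sketch with hypothetical names: it inspects only literal hostnames and IPv4 addresses, whereas a real guard also needs to validate the addresses a hostname resolves to and cover IPv6 ranges.

```typescript
// Parse a dotted-quad IPv4 literal; returns null for anything else.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((part) => (/^\d{1,3}$/.test(part) ? Number(part) : NaN));
  return octets.every((octet) => octet >= 0 && octet <= 255) ? octets : null;
}

// True when the hostname looks local, private, or like a metadata endpoint.
function isBlockedTarget(hostname: string): boolean {
  const host = hostname.toLowerCase().replace(/\.$/, "");
  if (host === "localhost" || host.endsWith(".localhost")) return true;
  if (host.endsWith(".local") || host.endsWith(".internal")) return true;
  if (host === "::1" || host === "[::1]") return true;

  const ip = parseIPv4(host);
  if (!ip) return false;
  const a = ip[0] ?? -1;
  const b = ip[1] ?? -1;
  return (
    a === 0 ||                          // "this network"
    a === 10 ||                         // 10.0.0.0/8 private
    a === 127 ||                        // loopback
    (a === 169 && b === 254) ||         // link-local, incl. 169.254.169.254 metadata
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 private
    (a === 192 && b === 168)            // 192.168.0.0/16 private
  );
}

// Combine the check with the ALLOW_LOCAL_FETCH override described above.
function mayFetch(hostname: string, allowLocalFetch: boolean): boolean {
  return allowLocalFetch || !isBlockedTarget(hostname);
}
```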
| Command | Description |
|---|---|
| npm run build | Clean, compile TypeScript, copy assets. |
| npm run dev | Watch mode TypeScript compilation. |
| npm run dev:run | Run the server with --watch and .env support. |
| npm start | Start the compiled server. |
| npm test | Run the full test suite. |
| npm run lint | Lint with ESLint. |
| npm run lint:fix | Auto-fix lint issues. |
| npm run type-check | Type-check source and tests. |
| npm run format | Format with Prettier. |
| npm run inspector | Build and launch MCP Inspector. |
| Script | Command |
|---|---|
| clean | node scripts/tasks.mjs clean |
| build | node scripts/tasks.mjs build |
| copy:assets | node scripts/tasks.mjs copy:assets |
| prepare | npm run build |
| dev | tsc --watch --preserveWatchOutput |
| dev:run | node --env-file=.env --watch dist/index.js |
| start | node dist/index.js |
| format | prettier --write . |
| type-check | node scripts/tasks.mjs type-check |
| type-check:src | node node_modules/typescript/bin/tsc -p tsconfig.json --noEmit |
| type-check:tests | node node_modules/typescript/bin/tsc -p tsconfig.test.json --noEmit |
| type-check:diagnostics | tsc --noEmit --extendedDiagnostics |
| type-check:trace | node -e "require('fs').rmSync('.ts-trace',{recursive:true,force:true})" && tsc --noEmit --generateTrace .ts-trace |
| lint | eslint . |
| lint:tests | eslint src/__tests__ |
| lint:fix | eslint . --fix |
| test | node scripts/tasks.mjs test |
| test:fast | node --test --import tsx/esm src/__tests__/**/*.test.ts node-tests/**/*.test.ts |
| test:coverage | node scripts/tasks.mjs test --coverage |
| knip | knip |
| knip:fix | knip --fix |
| inspector | npm run build && npx -y @modelcontextprotocol/inspector node dist/index.js --stdio |
| prepublishOnly | npm run lint && npm run type-check && npm run build |
- npm run prepublishOnly runs lint, type-check, and build as a single release gate.
- … .github/workflows/.
- Dockerfile and docker-compose.yml are included for containerized runs.
- … @j0hanz/fetch-url-mcp.

| Symptom | Likely Cause | Fix |
|---|---|---|
| Server output mixes with MCP traffic on stdio | Logs going to stdout | Ensure all logging writes to stderr; the server does this by default. |
| HTTP mode returns 403 | Host/origin mismatch | Add the domain to ALLOWED_HOSTS or verify loopback bindings. |
| HTTP mode rate limits every request from your proxy | SERVER_TRUST_PROXY not enabled | Enable SERVER_TRUST_PROXY=true when the server is behind a trusted reverse proxy. |
| HTTP mode returns 401 | Missing or invalid token | Set ACCESS_TOKENS or configure OAuth env vars for remote bindings. |
| Fetch returns private-IP error | SSRF protections blocked the target | Set ALLOW_LOCAL_FETCH=true if the target is intentionally local. |
| truncated: true in response | Content exceeded inline limits | Increase MAX_INLINE_CONTENT_CHARS or accept truncated output. |
| Transform timeout or worker crash | Large or complex HTML | Tune TRANSFORM_WORKER_MAX_OLD_GENERATION_MB or increase FETCH_TIMEOUT_MS. |
| Client config not working | Wrong config format for the client | Check the matching client configuration above; config keys vary by client. |
| Dependency | Registry |
|---|---|
| @modelcontextprotocol/server | npm |
| @modelcontextprotocol/node | npm |
| @mozilla/readability | npm |
| linkedom | npm |
| node-html-markdown | npm |
| undici | npm |
| zod | npm |
Pull requests welcome. Please make sure these pass before submitting:
- npm run lint and npm run type-check
- npm test
- npm run format

MIT License. See LICENSE for details.