Server data from the Official MCP Registry
Spatial DOM maps for AI agent browser navigation with anti-bot stealth.
Valid MCP server (3 strong, 5 medium validity signals). 3 known CVEs in dependencies (0 critical, 3 high severity). Package registry verified. Imported from the Official MCP Registry.
4 files analyzed · 4 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Add this to your MCP configuration file:
```json
{
  "mcpServers": {
    "io-github-matthewalexong-neo-vision": {
      "args": [
        "-y",
        "neo-vision"
      ],
      "command": "npx"
    }
  }
}
```
From the project's GitHub README.
See the web the way Neo sees the Matrix.
Give your AI agent a pixel-precise JSON map of every element on a page — coordinates, ARIA roles, accessible labels, and actionability flags — without screenshots, without brittle CSS selectors, without getting blocked by anti-bot systems.
Version 0.6.0 · MIT License · GitHub
AI agents navigating the web today are stuck between two bad options:
Meanwhile, anti-bot systems block headless browsers on sight. So even if you solve the navigation problem, you can't get past the front door of sites like Yelp, LinkedIn, or Zillow.
NeoVision asks the browser's own layout engine where everything is — because it already knows. Like Neo seeing through the green code to perceive the real world, NeoVision reads the raw DOM but gives your agent a spatial map with ground-truth pixel coordinates, straight from the rendering engine.
```json
{
  "tag": "button",
  "role": "button",
  "label": "Sign in",
  "bounds": { "x": 305, "y": 510, "width": 74, "height": 36 },
  "click_center": { "x": 342, "y": 528 },
  "actionable": true
}
```
No guessing. No hallucination. No selector that breaks tomorrow.
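In the example element above, `click_center` is the midpoint of `bounds`. The arithmetic can be sketched as follows; the helper name `centerOf` and the choice to round to whole pixels are assumptions made for illustration, not taken from NeoVision's source.

```typescript
// Hypothetical helper, not part of the NeoVision API.
interface Bounds { x: number; y: number; width: number; height: number; }
interface Point { x: number; y: number; }

// Midpoint of a bounding box, rounded to whole pixels (rounding is an assumption).
function centerOf(bounds: Bounds): Point {
  return {
    x: Math.round(bounds.x + bounds.width / 2),
    y: Math.round(bounds.y + bounds.height / 2),
  };
}

// Bounds copied from the "Sign in" button example.
const signInBounds: Bounds = { x: 305, y: 510, width: 74, height: 36 };
const signInCenter = centerOf(signInBounds);
console.log(signInCenter); // { x: 342, y: 528 }
```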
NeoVision drives the user's real Chrome via a Chrome extension — with real cookies, real fingerprint, real browsing history. Anti-bot systems see a real user because it is a real browser.
We tested this against the five most notoriously anti-bot sites on the web — Ticketmaster, Nike, LinkedIn, Instagram, and Amazon — plus Discord (Cloudflare). All six returned full page content with zero CAPTCHAs, zero bot walls, and zero detection signals. Full test report →
Load the Chrome extension at chrome://extensions from the extension/ folder in this repo, then add the server to your MCP configuration:
```json
{
  "mcpServers": {
    "neo-vision": {
      "command": "npx",
      "args": ["neo-vision"]
    }
  }
}
```
The MCP server starts automatically. The Chrome extension connects to it via WebSocket. Once connected, all 15 spatial tools are available.
NeoVision exposes 15 tools through the MCP server, organized by function:
| Tool | Description |
|---|---|
| `spatial_snapshot` | Navigate to a URL and return a spatial DOM map with element coordinates, ARIA roles, and actionability flags. Supports compact and agent output formats. |
| `spatial_click` | Click at specified pixel coordinates. Returns updated spatial map reflecting page state after the click. |
| `spatial_type` | Type text into an element. Supports `clear_first` to replace existing text and `press_enter` to submit. |
| `spatial_scroll` | Scroll the page or a specific scrollable container. Returns updated spatial map. |
| `spatial_query` | Filter the cached spatial map by ARIA role, HTML tag, label text, bounding box region, or actionability, without reloading the page. |
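
The filters that `spatial_query` lists (role, tag, label text, bounding box region, actionability) can be pictured as a predicate applied to the cached elements. The sketch below runs that idea locally; the `QueryFilter` field names and the "fully inside the region" rule are assumptions for illustration, not the tool's actual parameters or behavior.

```typescript
// Illustrative only: field names are assumptions, not spatial_query's parameters.
interface Box { x: number; y: number; width: number; height: number; }
interface MapElement {
  tag: string;
  role: string | null;
  label: string | null;
  bounds: Box;
  actionable: boolean;
}
interface QueryFilter {
  role?: string;
  tag?: string;
  labelContains?: string; // case-insensitive substring match
  region?: Box;           // keep elements fully inside this box
  actionable?: boolean;
}

function isInside(inner: Box, outer: Box): boolean {
  return inner.x >= outer.x &&
    inner.y >= outer.y &&
    inner.x + inner.width <= outer.x + outer.width &&
    inner.y + inner.height <= outer.y + outer.height;
}

// An element is kept only if it passes every filter that was supplied.
function queryMap(elements: MapElement[], f: QueryFilter): MapElement[] {
  return elements.filter((el) => {
    if (f.role !== undefined && el.role !== f.role) return false;
    if (f.tag !== undefined && el.tag !== f.tag) return false;
    if (f.labelContains !== undefined) {
      const label = (el.label ?? "").toLowerCase();
      if (!label.includes(f.labelContains.toLowerCase())) return false;
    }
    if (f.region !== undefined && !isInside(el.bounds, f.region)) return false;
    if (f.actionable !== undefined && el.actionable !== f.actionable) return false;
    return true;
  });
}

const cachedMap: MapElement[] = [
  { tag: "button", role: "button", label: "Sign in",
    bounds: { x: 305, y: 510, width: 74, height: 36 }, actionable: true },
  { tag: "h1", role: "heading", label: "Welcome",
    bounds: { x: 40, y: 20, width: 400, height: 48 }, actionable: false },
];
const signInButtons = queryMap(cachedMap, { role: "button", labelContains: "sign" });
```

Because the filter works on an already-captured map, it costs no page load, which is the point the table row makes.
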
| Tool | Description |
|---|---|
| `spatial_screenshot` | Capture a PNG screenshot of the current page as base64-encoded data. |
| `spatial_navigate` | Load a URL without capturing a snapshot; lighter-weight than `spatial_snapshot` when you just need to navigate. |
| `spatial_wait` | Pause execution for a specified duration. Use for pacing between page navigations. |
| `spatial_execute_js` | Run arbitrary JavaScript in the page context and return the result. Full access to DOM, window, and page variables. |
| Tool | Description |
|---|---|
| `spatial_get_injectable` | Get the NeoVision snapshot logic as injectable JavaScript for external browser contexts. Returns a self-invoking script, an installer, or raw source. |
| `spatial_pace` | Human-like pacing manager for multi-page scraping sessions. Manages randomized delays, periodic breaks, CAPTCHA detection, and automatic slowdown. Prevents anti-bot pattern detection across long scraping runs. |
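
To make the pacing behaviors in the `spatial_pace` row concrete, here is a rough sketch of randomized delays, periodic breaks, and slowdown after a CAPTCHA. The `Pacer` class, its option names, and the numbers used are all hypothetical; the real tool's parameters and algorithm are not documented here.

```typescript
// Hypothetical pacing sketch; names and defaults are assumptions.
interface PaceOptions {
  minDelayMs: number;     // shortest pause between pages
  maxDelayMs: number;     // longest pause between pages
  breakEvery: number;     // add a long break after every N pages
  breakMs: number;        // length of that long break
  slowdownFactor: number; // multiplier applied each time a CAPTCHA is reported
}

class Pacer {
  private pages = 0;
  private multiplier = 1;
  private readonly opts: PaceOptions;
  private readonly random: () => number;

  // `random` is injectable so the schedule can be made deterministic in tests.
  constructor(opts: PaceOptions, random: () => number = Math.random) {
    this.opts = opts;
    this.random = random;
  }

  // Call when a CAPTCHA is detected; all later delays are stretched.
  reportCaptcha(): void {
    this.multiplier *= this.opts.slowdownFactor;
  }

  // Delay to wait before loading the next page.
  nextDelayMs(): number {
    this.pages += 1;
    const { minDelayMs, maxDelayMs, breakEvery, breakMs } = this.opts;
    const jitter = minDelayMs + this.random() * (maxDelayMs - minDelayMs);
    const withBreak = this.pages % breakEvery === 0 ? jitter + breakMs : jitter;
    return Math.round(withBreak * this.multiplier);
  }
}

const pacer = new Pacer({
  minDelayMs: 2000, maxDelayMs: 6000, breakEvery: 20, breakMs: 60000, slowdownFactor: 2,
});
const firstDelay = pacer.nextDelayMs(); // somewhere between 2000 and 6000 ms
```
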
| Tool | Description |
|---|---|
| `spatial_connect_cdp` | Connect to the user's real Chrome browser via Chrome DevTools Protocol. NeoVision automatically relaunches Chrome with CDP enabled if needed. All spatial tools then operate on the real browser session. |
| `spatial_disconnect_cdp` | Release the CDP connection. The user's Chrome stays open. |
| `spatial_import_cookies` | Import cookies into the browser session. Use to warm up sessions with cookies from the user's real browser for anti-bot bypass. |
| `spatial_export_cookies` | Export cookies from the current browser session. Optionally filter by domain. |
NeoVision uses an extension-only architecture:
AI Agent ←→ MCP Server ←→ WebSocket Bridge ←→ Chrome Extension ←→ Real Chrome
npx neo-vision)
Why this matters: Anti-bot systems like DataDome and Cloudflare don't just check browser fingerprints; they build trust scores over time based on browsing history, cookie age, and behavioral patterns. A freshly launched headless browser starts with zero trust. NeoVision inherits the user's full trust score because it drives their actual Chrome session.
If the extension disconnects (e.g. Chrome restarts), the bridge automatically waits 10 seconds and attempts to relaunch Chrome with the extension. This retries up to 3 times before giving up, so brief disruptions are handled transparently.
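The recovery policy described above (wait 10 seconds, try to relaunch, give up after 3 attempts) can be sketched as a small retry loop. This is not the bridge's actual code: `reconnectWithRetry`, `relaunch`, and `sleep` are hypothetical names, and `relaunch` stands in for whatever relaunches Chrome and reports whether the extension reconnected.

```typescript
// Sketch only; all names here are hypothetical.
async function reconnectWithRetry(
  relaunch: () => Promise<boolean>, // resolves true once the extension is connected again
  sleep: (ms: number) => Promise<void> =
    (ms) => new Promise<void>((resolve) => setTimeout(resolve, ms)),
  maxAttempts: number = 3,
  waitMs: number = 10_000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await sleep(waitMs);                 // wait before each relaunch attempt
    if (await relaunch()) return true;   // reconnected, stop retrying
  }
  return false;                          // every attempt failed
}
```

Passing `sleep` as a parameter keeps the loop testable without real 10-second waits; a caller would normally rely on the default.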
For always-on usage, install the launchd plist to run the daemon at login:
```bash
mkdir -p ~/.neo-vision/logs
cp ai.neovision.daemon.plist ~/Library/LaunchAgents/
launchctl load -w ~/Library/LaunchAgents/ai.neovision.daemon.plist
```
This starts dist/daemon.js at login, keeps it alive, and logs to ~/.neo-vision/logs/.
spatial_snapshot supports two output formats:
Compact: Returns every visible element with pixel coordinates, ARIA roles, accessible labels, and actionability flags. Use for general browsing, UI interaction, and element targeting.
Agent: Optimized for AI agent context windows on text-dense pages (Wikipedia, Amazon, long articles). Returns:
- `spatial_scroll` delta to advance to the next viewport
Use agent mode on any page that returns 1000+ elements in compact mode, or when you need readable text content for research and summarization.
SpatialElement
Each element in the map includes:
```typescript
{
  idx: number;                      // index in the flat array
  tag: string;                      // HTML tag
  role: string | null;              // ARIA role (explicit or implicit)
  label: string | null;             // accessible name
  text: string | null;              // visible text content
  bounds: { x, y, width, height };  // absolute pixel coordinates
  click_center: { x, y } | null;    // where to click (center of bounds)
  actionable: boolean;              // can this element be interacted with?
  input_type: string | null;        // for <input>: text, email, password, etc.
  focusable: boolean;               // can this element receive focus?
  selector: string;                 // CSS selector hint
  parent_idx: number | null;        // parent element index
  computed: {                       // CSS layout info
    position, z_index, display, overflow, opacity
  }
}
```
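Because the map is a flat array linked by `parent_idx`, an agent can recover an element's ancestry without re-reading the DOM. The sketch below declares only the fields it uses; `ancestorChain` is an illustrative helper, not something the package is known to export.

```typescript
// Only the fields needed here; the full element shape is listed above.
interface TreeElement {
  idx: number;
  tag: string;
  parent_idx: number | null;
}

// Returns the element followed by its ancestors, nearest first.
// Stops at the root, at a missing parent, or if a cycle is detected.
function ancestorChain(elements: TreeElement[], idx: number): TreeElement[] {
  const byIdx = new Map<number, TreeElement>();
  for (const el of elements) byIdx.set(el.idx, el);

  const chain: TreeElement[] = [];
  const seen = new Set<number>();
  let current = byIdx.get(idx);
  while (current !== undefined && !seen.has(current.idx)) {
    chain.push(current);
    seen.add(current.idx);
    current = current.parent_idx === null ? undefined : byIdx.get(current.parent_idx);
  }
  return chain;
}

const flatMap: TreeElement[] = [
  { idx: 0, tag: "html", parent_idx: null },
  { idx: 1, tag: "body", parent_idx: 0 },
  { idx: 2, tag: "button", parent_idx: 1 },
];
const path = ancestorChain(flatMap, 2).map((el) => el.tag);
console.log(path.join(" < ")); // button < body < html
```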
Given the same HTML, CSS, viewport size, and zoom level, a browser produces identical pixel coordinates for every element. Layout is deterministic: the engine computes these coordinates from the CSS layout rules before it paints the screen.
NeoVision locks the viewport, device scale factor, locale, timezone, and scroll position before taking a snapshot. Two independent snapshots of the same page produce byte-identical JSON (excluding the timestamp field).
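One way to check this claim for a given page is to take two snapshots and compare them with the timestamp removed. The sketch below assumes the snapshot is a JSON object with a top-level field named `timestamp`; the text only says a timestamp field exists, so its exact location is an assumption, as are the function names.

```typescript
// Assumption: the timestamp lives at the top level of the snapshot object.
function withoutTimestamp(snapshot: Record<string, unknown>): Record<string, unknown> {
  const copy: Record<string, unknown> = { ...snapshot };
  delete copy["timestamp"];
  return copy;
}

// Compares serialized forms, so key order matters,
// which matches the "byte-identical JSON" wording.
function sameSnapshot(a: Record<string, unknown>, b: Record<string, unknown>): boolean {
  return JSON.stringify(withoutTimestamp(a)) === JSON.stringify(withoutTimestamp(b));
}

const first = { timestamp: 1700000000000, elements: [{ tag: "button", label: "Sign in" }] };
const second = { timestamp: 1700000005000, elements: [{ tag: "button", label: "Sign in" }] };
console.log(sameSnapshot(first, second)); // true
```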
spatial_pace for human-like session management across hundreds of pages
Works with any agent framework that supports MCP:
Claude Code / Cowork / Cursor / Windsurf / OpenClaw / AntiGravity
→ Use the MCP server — it just works
For custom integrations, use the injectable script directly:
```typescript
import { getInjectableScript } from 'neo-vision/injectable';
const script = getInjectableScript({ verbosity: 'actionable' });
// Inject into any browser context via CDP, extension, or DevTools console
```
MIT