Server data from the Official MCP Registry
Query Spark SQL clusters via Thrift/HiveServer2. Works with Spark, EMR, Hive, Impala.
Valid MCP server (1 strong, 5 medium validity signals). 1 code issue detected. 3 known CVEs in dependencies (0 critical, 3 high severity). Package registry verified. Imported from the Official MCP Registry. 1 finding downgraded by scanner intelligence.
12 files analyzed · 5 issues found
Security scores are indicators to help you make informed decisions, not guarantees. Always review permissions before connecting any MCP server.
This plugin requests these system permissions. Most are normal for its category.
Set these up before or after installing:
Environment variables: SPARK_HOST, SPARK_PORT, SPARK_DATABASE, SPARK_AUTH, SPARK_USERNAME, SPARK_PASSWORD, SPARK_KERBEROS_SERVICE_NAME
Add this to your MCP configuration file:
{
"mcpServers": {
"io-github-aidancorrell-spark-sql-mcp-server": {
"env": {
"SPARK_AUTH": "your-spark-auth-here",
"SPARK_HOST": "your-spark-host-here",
"SPARK_PORT": "your-spark-port-here",
"SPARK_DATABASE": "your-spark-database-here",
"SPARK_PASSWORD": "your-spark-password-here",
"SPARK_USERNAME": "your-spark-username-here",
"SPARK_KERBEROS_SERVICE_NAME": "your-spark-kerberos-service-name-here"
},
"args": [
"spark-sql-mcp-server"
],
"command": "uvx"
}
}
}

From the project's GitHub README.
An MCP server that enables AI assistants to query Spark SQL clusters via the Thrift/HiveServer2 protocol.
Works with any HiveServer2-compatible system: Apache Spark, AWS EMR, Hive, Impala, Presto.
pip install spark-sql-mcp-server
Or run directly with uvx:
uvx spark-sql-mcp-server
export SPARK_HOST="your-emr-master-node.amazonaws.com"
export SPARK_PORT="10000" # default
export SPARK_DATABASE="default" # default
export SPARK_AUTH="NONE" # NONE | LDAP | KERBEROS | CUSTOM | NOSASL
Global (all projects) — add to ~/.claude.json under your project's mcpServers:
{
"mcpServers": {
"spark-sql": {
"command": "uvx",
"args": ["spark-sql-mcp-server"],
"env": {
"SPARK_HOST": "your-emr-master-node.amazonaws.com",
"SPARK_PORT": "10000",
"SPARK_AUTH": "NONE"
}
}
}
}
Project-level — add to .claude/mcp.json in your repo:
{
"mcpServers": {
"spark-sql": {
"command": "uvx",
"args": ["spark-sql-mcp-server"],
"env": {
"SPARK_HOST": "your-emr-master-node.amazonaws.com",
"SPARK_PORT": "10000",
"SPARK_AUTH": "NONE"
}
}
}
}
Add to your claude_desktop_config.json:
{
"mcpServers": {
"spark-sql": {
"command": "uvx",
"args": ["spark-sql-mcp-server"],
"env": {
"SPARK_HOST": "your-emr-master-node.amazonaws.com",
"SPARK_PORT": "10000"
}
}
}
}
Ask Claude things like:

"… sales.transactions table"

| Tool | Description |
|---|---|
| list_databases | List all available databases |
| list_tables | List tables in a database |
| describe_table | Get table schema (columns, types) |
| execute_query | Run read-only SQL queries with formatted results |
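To exercise these tools outside of Claude, the official MCP Python SDK can drive the server over stdio. A sketch (the env values are placeholders; list_databases takes no arguments, which avoids guessing the other tools' parameter names):

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

params = StdioServerParameters(
    command="uvx",
    args=["spark-sql-mcp-server"],
    env={"SPARK_HOST": "localhost", "SPARK_PORT": "10000", "SPARK_AUTH": "NONE"},
)

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])  # expect the four tools above
            result = await session.call_tool("list_databases", {})
            print(result.content)

asyncio.run(main())
```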
export SPARK_AUTH="NONE"
export SPARK_AUTH="LDAP"
export SPARK_USERNAME="your-username"
export SPARK_PASSWORD="your-password"
export SPARK_AUTH="KERBEROS"
export SPARK_KERBEROS_SERVICE_NAME="hive" # default
# Ensure you have a valid Kerberos ticket (kinit)
To reach an EMR master node that isn't directly accessible, tunnel the Thrift port over SSH:

ssh -i your-key.pem -L 10000:localhost:10000 hadoop@your-emr-master
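A quick way to confirm the tunnel is forwarding before pointing anything at it (a trivial standalone check, not part of the project):

```python
import socket

# Succeeds only if the local end of the SSH tunnel is accepting connections.
socket.create_connection(("localhost", 10000), timeout=5).close()
print("Thrift port reachable through the tunnel")
```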
Then set SPARK_HOST=localhost.

For development, clone the repo and install it in editable mode:

git clone https://github.com/aidancorrell/spark-sql-mcp-server.git
cd spark-sql-mcp-server
pip install -e ".[dev]"
pytest
ruff check .
A Docker Compose setup provides a local Spark Thrift Server with sample data for integration testing.
# Start the Spark Thrift Server
cd docker && docker compose up -d
# Wait for it to be ready (takes ~30s on first start)
docker logs -f spark-thrift-server # look for "Sample data loaded."
# Run integration tests
pytest -m integration -v
# Tear down
cd docker && docker compose down -v
The local server comes with sample tables: default.employees, default.orders, and test_db.metrics.
Unit tests run by default with pytest (integration tests are skipped unless -m integration is specified).
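One common way to get that default-skip behavior is a conftest.py hook like the sketch below; this illustrates the pattern, not necessarily how this repo implements it:

```python
# conftest.py
import pytest

def pytest_collection_modifyitems(config, items):
    # If the user passed an explicit -m expression (e.g. `-m integration`),
    # let pytest's own marker filtering decide what runs.
    if config.getoption("markexpr"):
        return
    skip = pytest.mark.skip(reason="integration test: run with `pytest -m integration`")
    for item in items:
        if "integration" in item.keywords:
            item.add_marker(skip)
```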
With the Docker Spark server running, add it to your MCP config to test the server interactively.
Global — add to ~/.claude.json under your project's mcpServers:
{
  "mcpServers": {
    "spark-sql": {
      "command": "uvx",
      "args": ["spark-sql-mcp-server"],
      "env": {
        "SPARK_HOST": "localhost",
        "SPARK_PORT": "10000",
        "SPARK_AUTH": "NONE"
      }
    }
  }
}
Project-level — add to .claude/mcp.json:
{
"mcpServers": {
"spark-sql": {
"command": "uvx",
"args": ["spark-sql-mcp-server"],
"env": {
"SPARK_HOST": "localhost",
"SPARK_PORT": "10000",
"SPARK_AUTH": "NONE"
}
}
}
}
Then start a new Claude Code session and ask it to query the sample data.
The execute_query tool only allows read-only SQL statements. Queries must start with one of: SELECT, SHOW, DESCRIBE, DESC, EXPLAIN, or WITH. All other statement types (DROP, INSERT, DELETE, CREATE, ALTER, SET, ADD JAR, etc.) are rejected before reaching the Spark cluster.
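The check described above amounts to allowlisting the first SQL keyword. A minimal sketch of the idea (illustrative; the server's actual validation may be stricter):

```python
READ_ONLY_PREFIXES = ("SELECT", "SHOW", "DESCRIBE", "DESC", "EXPLAIN", "WITH")

def is_read_only(sql: str) -> bool:
    # Compare the first whitespace-delimited token, case-insensitively.
    tokens = sql.strip().upper().split(None, 1)
    return bool(tokens) and tokens[0] in READ_ONLY_PREFIXES

assert is_read_only("select * from default.employees")
assert not is_read_only("DROP TABLE default.employees")
```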
Database errors are sanitized before being returned to the MCP client. Internal details such as server hostnames, file paths, and stack traces are not exposed. Connection failures report only the target host/port and error type.
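As an illustration of that contract (the names here are hypothetical, not the project's API):

```python
def sanitize_connection_error(exc: Exception, host: str, port: int) -> str:
    # Expose only the target endpoint and the exception class; raw driver
    # messages can leak hostnames, file paths, and stack traces.
    return f"Failed to connect to {host}:{port} ({type(exc).__name__})"
```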
- The SparkConfig object masks passwords in its string representation
- SPARK_PASSWORD is marked as a secret in the MCP registry schema
- Set SPARK_AUTH to LDAP or KERBEROS for authenticated environments

MIT