The DuckDB MCP server revolutionizes data analysis by bridging the gap between DuckDB, a high-performance in-process SQL database, and AI-driven tools like Claude, Cursor, or other IDEs supporting the Model Context Protocol (MCP). This server enables seamless interaction between your database and AI assistants, allowing you to execute SQL queries, inspect schemas, and build data pipelines with minimal manual intervention. Whether you're analyzing local datasets or leveraging cloud-based MotherDuck instances, the DuckDB MCP server empowers data engineers, analysts, and developers to accelerate workflows and derive insights faster.
Understanding the DuckDB MCP Server
DuckDB is an open-source, in-process SQL database designed for analytical workloads. It supports a wide range of data formats, including CSV, Parquet, and JSON, and can query both local files and remote sources like AWS S3. The DuckDB MCP server, developed by the MotherDuck team, extends DuckDB's capabilities by integrating it with the Model Context Protocol (MCP). This protocol enables AI tools to interact directly with DuckDB databases, supporting operations like querying, schema exploration, and table management through a standardized interface.

The server acts as a bridge between DuckDB and AI-driven environments, such as Cursor or Claude, allowing these tools to execute SQL commands and return results in real time. It supports both local DuckDB instances and cloud-based MotherDuck databases, offering flexibility for diverse use cases. Additionally, the DuckDB MCP server includes security features like read-only mode and read-scaling tokens, ensuring safe integration with third-party tools.

Key Benefits of the DuckDB MCP Server
Here’s why the DuckDB MCP server is a game-changer for data professionals:
- AI-Driven Workflows: Integrates with AI assistants to execute natural language queries or generate SQL code, reducing manual effort.
- High Performance: Leverages DuckDB’s columnar-vectorized query engine for fast processing of large datasets.
- Flexible Data Access: Queries local files, in-memory databases, or cloud storage like S3, with seamless MotherDuck integration.
- Enhanced Security: Supports read-only mode and read-scaling tokens to protect data integrity.
- Streamlined Development: Simplifies data pipeline creation with tools like dbt, Cursor, or VS Code extensions.
These features make the DuckDB MCP server an essential tool for modern data engineering and analysis.
Setting Up the DuckDB MCP Server
To harness the power of the DuckDB MCP server, you need to install and configure it correctly. This section walks you through the process step-by-step.
Prerequisites
Before you begin, ensure you have the following:
- IDE Support: A compatible IDE like Cursor or VS Code with MCP extensions installed.
- Python Environment: Python 3.8 or higher with pip or uv for package management.
- MotherDuck Account (Optional): Required for cloud-based database access. Sign up at MotherDuck.

- C++11 Compiler: Necessary if building DuckDB from source (not required for binary installations).
- DuckDB CLI or Library: Ensure DuckDB is installed for local database operations.
Installing DuckDB
DuckDB is the foundation of the DuckDB MCP server. Install it using one of the following methods:
Using Homebrew (macOS):
brew install duckdb
Using pip (Python):
pip install duckdb
Download Binary: Visit DuckDB’s installation page for pre-built binaries for Windows, Linux, or macOS.
Verify the installation by running:
duckdb --version
This should display the DuckDB version, confirming a successful setup.
Installing the MCP Server
The DuckDB MCP server is distributed as the mcp-server-motherduck package. Install it using uv (preferred for Python CLI tools) or pip:
Install uv (Optional):
pip install uv
Install the MCP Server:
uv pip install mcp-server-motherduck
Or, with pip:
pip install mcp-server-motherduck
Verify Installation:
uvx mcp-server-motherduck --version
This command should return the server version, indicating a successful installation.
Configuring the MCP Server
To integrate the DuckDB MCP server with your IDE or AI tool, configure it using a JSON file. Below is an example for Cursor:
Access Cursor Settings: Navigate to Settings > Cursor Settings > MCP > Add a new Global MCP Server.
Create Configuration: Add the following JSON configuration:
{
  "mcpServers": {
    "mcp-server-motherduck": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path",
        "md:",
        "--motherduck-token",
        "<YOUR_MOTHERDUCK_TOKEN_HERE>"
      ]
    }
  }
}
Replace <YOUR_MOTHERDUCK_TOKEN_HERE> with your MotherDuck token for cloud access. For local databases, use:
{
  "mcpServers": {
    "mcp-server-motherduck": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path",
        "/path/to/your/local.db"
      ]
    }
  }
}
For in-memory databases, use --db-path :memory:.
Enable Read-Only Mode (Optional): To prevent data modifications, add the --read-only flag:
{
  "mcpServers": {
    "mcp-server-motherduck": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path",
        "/path/to/your/local.db",
        "--read-only"
      ]
    }
  }
}
Test the Configuration: Restart your IDE and run a test query:
SELECT 1 AS test;
If successful, the server returns the result, confirming the setup.
Practical Use Cases for the DuckDB MCP Server
The DuckDB MCP server shines in real-world data workflows. Below are detailed examples of how to leverage it for analysis, pipeline development, and schema exploration.
Querying Data with AI Assistance
The DuckDB MCP server enables AI tools to execute SQL queries and process results dynamically. For example, suppose you have a Parquet file stored on AWS S3 containing sales data. Here’s how to query it using Cursor:
Configure S3 Access: Ensure your MCP server configuration includes S3 credentials or uses MotherDuck’s cloud integration (--db-path md:).
Prompt the AI: In Cursor, enter a prompt like, “Show the top 10 sales records from my Parquet file at s3://my-bucket/sales.parquet, ordered by revenue.”
Execute the Query: The AI generates and runs:
SELECT * FROM read_parquet('s3://my-bucket/sales.parquet')
ORDER BY revenue DESC
LIMIT 10;
The DuckDB MCP server processes the query and returns results to the AI, which can summarize them or suggest further analysis, such as grouping by region.
Iterate on Results: Prompt the AI to refine the query, e.g., “Group the sales data by region and calculate total revenue.” The AI generates:
SELECT region, SUM(revenue) AS total_revenue
FROM read_parquet('s3://my-bucket/sales.parquet')
GROUP BY region;
This workflow reduces manual query writing and accelerates data exploration.
Building Data Pipelines with dbt
The DuckDB MCP server integrates seamlessly with dbt, a popular tool for data transformation. Here’s how to build and test dbt models using the server:
Initialize a dbt Project: Create a new dbt project and configure the profiles.yml file to use DuckDB or MotherDuck:
my_project:
  target: dev
  outputs:
    dev:
      type: duckdb
      path: /path/to/your/local.db
      # Or for MotherDuck:
      # path: md:
      # token: <YOUR_MOTHERDUCK_TOKEN>
Create Staging Models: Prompt the AI to generate a staging model based on your data. For example:
-- models/staging/stg_sales.sql
SELECT
    order_id,
    customer_id,
    sale_date,
    revenue
FROM read_parquet('s3://my-bucket/sales.parquet')
WHERE sale_date >= '2025-01-01';
Add Tests: Define tests in the model’s .yml file to ensure data quality:
version: 2
models:
  - name: stg_sales
    columns:
      - name: order_id
        tests:
          - not_null
          - unique
      - name: revenue
        tests:
          - positive_values  # custom test; not built into dbt
Run and Test: Execute dbt run to build the model and dbt test to validate it. The DuckDB MCP server ensures efficient query execution, and the AI can suggest optimizations based on test results.
Iterate with AI: Use the AI to refine models, e.g., “Add a column for year-over-year revenue growth.” The AI generates the necessary SQL, which you can incorporate into your dbt project.
Schema Inspection and Data Exploration
The DuckDB MCP server supports schema inspection, enabling AI tools to understand your database structure. For example:
Prompt the AI: Ask, “Describe the schema of my table in sales.db.”
Generated Query: The AI runs:
ATTACH 'sales.db' AS sales;
DESCRIBE sales.sales_table;
The server returns column names, data types, and constraints, which the AI uses to suggest transformations, joins, or aggregations.
Explore Relationships: Prompt the AI to identify relationships, e.g., “Find tables in my database with a customer_id column.” The server executes a series of DESCRIBE queries across tables, and the AI compiles the results.
This capability is invaluable for understanding complex datasets and planning transformations.
Advanced Features of the DuckDB MCP Server
The DuckDB MCP server offers advanced features to enhance performance, security, and flexibility.
Read-Only Mode and Security
To protect sensitive data, run the DuckDB MCP server in read-only mode using the --read-only flag. This prevents AI tools from executing INSERT, UPDATE, or DELETE queries, ensuring data integrity. For MotherDuck users, read-scaling tokens allow up to four concurrent read replicas, improving performance for multiple users. Configure this in your JSON file:
{
  "mcpServers": {
    "mcp-server-motherduck": {
      "command": "uvx",
      "args": [
        "mcp-server-motherduck",
        "--db-path",
        "md:",
        "--motherduck-token",
        "<YOUR_READ_SCALING_TOKEN_HERE>",
        "--saas-mode"
      ]
    }
  }
}
SaaS mode restricts access to local files and extensions, making it ideal for secure cloud deployments.
Cloud Storage Integration
The DuckDB MCP server supports querying data from cloud storage platforms like AWS S3, Google Cloud Storage, or Cloudflare R2. For example, to query an S3 file:
SELECT
    customer_id,
    AVG(revenue) AS avg_revenue
FROM read_parquet('s3://my-bucket/sales.parquet')
GROUP BY customer_id;
The server handles authentication and data retrieval, allowing AI tools to process results without manual setup.
Short-Lived Connections
For workflows involving multiple tools (e.g., dbt, Cursor, and Jupyter notebooks), the DuckDB MCP server supports short-lived connections in read-only mode. Each query creates a temporary connection, executes, and closes, preventing lock conflicts and improving concurrency. This is particularly useful for iterative exploration or multi-user environments.
Custom Extensions
DuckDB supports extensions for advanced functionality, such as full-text search or geospatial queries. The DuckDB MCP server can load these extensions, provided they are compatible with your database. For example, to enable the httpfs extension for S3 access:
INSTALL httpfs;
LOAD httpfs;
Configure the server to include extensions via the --extensions flag if needed.
Optimizing Performance with the DuckDB MCP Server
To ensure the DuckDB MCP server performs efficiently, consider these optimizations:
- Adjust Thread Count: Use the --threads flag to allocate more CPU resources for large datasets, e.g., --threads 8.
- Memory Management: Set memory limits with --max-memory to prevent crashes on resource-constrained systems.
- Indexing: Create indexes on frequently queried columns to speed up filters and joins.
- Partition Data: For cloud storage, partition Parquet files by common filter columns (e.g., date or region) to reduce I/O.
- Use MotherDuck for Scalability: Offload large datasets to MotherDuck for better performance and collaboration.
Troubleshooting Common Issues
If you encounter problems with the DuckDB MCP server, try these solutions:
- Server Fails to Start: Verify that uvx or the server binary is in your PATH. Ensure Python 3.8+ is installed.
- Connection Issues: Check the database path or MotherDuck token. For local databases, confirm file permissions.
- Query Errors: Validate SQL syntax and ensure it aligns with DuckDB’s dialect. Use the MCP Inspector (npx @modelcontextprotocol/inspector) for debugging.
- Performance Bottlenecks: Increase thread count or memory limits. For large datasets, consider MotherDuck for cloud-based processing.
- AI Misinterpretation: If the AI generates incorrect queries, refine your prompt with more context, e.g., “Use DuckDB syntax for grouping by date.”
Conclusion
The DuckDB MCP server is a powerful tool for integrating AI-driven workflows with DuckDB’s high-performance database. By enabling seamless communication between AI tools and databases, it simplifies data analysis, pipeline development, and schema exploration. This guide has covered installation, configuration, practical use cases, advanced features, and best practices to help you leverage the server effectively.
To enhance your experience with API-driven integrations, download Apidog for free. Apidog streamlines testing and managing APIs, ensuring your DuckDB MCP server setup runs smoothly. Start exploring the server today to transform your data workflows and unlock new possibilities for AI-assisted analysis.
