How to Use SiteMCP to Turn Any Website into an MCP Server

INEZA FELIN-MICHEL

25 June 2025

Feeding external knowledge to large language models (LLMs) has become increasingly important. Whether you're a developer, content creator, or AI enthusiast, giving your models access to specific information can markedly improve their responses. SiteMCP is a tool that turns almost any website into a Model Context Protocol (MCP) server, so AI assistants like Claude can access and reference its content directly.

What is SiteMCP?

SiteMCP is a utility that fetches an entire website and serves it as an MCP server. Developed by ryoppippi, it makes web content accessible to AI models through the Model Context Protocol, so LLMs can read and reference websites that don't natively support MCP or offer any other integration method.

Credit: SiteMCP was created by ryoppippi. I encourage you to check out his GitHub project at https://github.com/ryoppippi/sitemcp to support his work and stay up to date with the latest features and developments.

What is Model Context Protocol (MCP)?

Before diving deeper into SiteMCP, let's understand what MCP actually is. MCP stands for "Model Context Protocol," a system that allows AI assistants to access external data sources. In simple terms, it's a protocol that enables you to tell an AI, "Please read this website" or "Check this file," and have the AI actually retrieve and process that information.

MCP serves as a bridge between AI models and external knowledge sources, making interactions more informed and contextually relevant. Without MCP, AI assistants would be limited to information they were trained on, potentially missing the latest developments or specific content you want them to reference.

Why SiteMCP Matters

SiteMCP solves several critical challenges:

  1. Access to Sites Without MCP Support: Many websites don't offer an MCP server or any other integration with AI tools. SiteMCP works around this limitation.
  2. Reduced Token Consumption: Rather than feeding entire websites into a prompt (which consumes precious tokens), SiteMCP allows AIs to access only what they need when they need it.
  3. Up-to-date Information: Access the most current documentation, especially for fast-changing technologies and libraries.
  4. Customized Knowledge Base: Make your personal websites, documentation, or knowledge bases accessible to AI assistants.

Getting Started with SiteMCP

Installation Options

SiteMCP offers flexible installation options depending on your preferences:

For One-off Usage:

# Choose one of the following:
bunx sitemcp
npx sitemcp
pnpx sitemcp

For Global Installation:

# Choose one of the following:
bun i -g sitemcp
npm i -g sitemcp
pnpm i -g sitemcp

Basic Usage

Using SiteMCP is remarkably straightforward. The simplest command follows this pattern:

sitemcp https://example.com

This will fetch the entire website at example.com and create an MCP server for it. For better performance with larger sites, you can adjust concurrency:

sitemcp https://example.com --concurrency 10

Advanced Configuration Options

SiteMCP offers several customization options to fine-tune how websites are processed and served:

Tool Name Strategy

The tool name strategy determines how the names of the generated MCP tools are derived from the site's URL. Set it with the -t or --tool-name-strategy flag:

# Use domain as the tool name
sitemcp https://vite.dev -t domain

# Use subdomain as the tool name
sitemcp https://react-tweet.vercel.app/ -t subdomain

# Use pathname as the tool name (default)
sitemcp https://ryoppippi.github.io/vite-plugin-favicons/ -t pathname
# Results in: indexOfVitePluginFavicons / getDocumentOfVitePluginFavicons
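
To make the naming concrete, here is a minimal Python sketch of how a tool-name suffix could be derived from a URL under each strategy. It is an illustration, not SiteMCP's actual implementation: tool_names and pascal_case are hypothetical helpers, and only the pathname result is confirmed by the example above. The domain and subdomain rules are assumptions.

```python
import re
from urllib.parse import urlparse


def pascal_case(text):
    """Join the alphanumeric runs in text, capitalising the first letter of each."""
    parts = re.findall(r"[A-Za-z0-9]+", text)
    return "".join(part[:1].upper() + part[1:] for part in parts)


def tool_names(url, strategy):
    """Return hypothetical (index, document) tool names for a URL."""
    parsed = urlparse(url)
    labels = parsed.hostname.split(".")
    if strategy == "pathname":
        base = parsed.path
    elif strategy == "subdomain":
        base = labels[0]  # assumption: the leftmost host label
    elif strategy == "domain":
        # assumption: the label just before the top-level domain
        base = labels[-2] if len(labels) > 1 else labels[0]
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    suffix = pascal_case(base)
    return f"indexOf{suffix}", f"getDocumentOf{suffix}"


print(tool_names("https://ryoppippi.github.io/vite-plugin-favicons/", "pathname"))
# ('indexOfVitePluginFavicons', 'getDocumentOfVitePluginFavicons')
```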

Matching Specific Pages

For large websites, you might want to limit which pages are fetched. The -m or --match flag lets you specify patterns:

sitemcp https://vite.dev -m "/guide/**" "/blog/**"

This fetches only the pages that match the specified patterns, saving processing time and resources. Matching is handled by micromatch, so glob patterns such as ** are supported.
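
As a rough illustration of how such patterns narrow a crawl, the Python sketch below filters URL paths with a deliberately simplified glob matcher. It is not micromatch and does not reproduce its full semantics: the hypothetical helpers glob_to_regex and is_selected understand only * (within one path segment) and ** (across segments).

```python
import re


def glob_to_regex(pattern):
    """Translate a simplified glob into a compiled regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            out.append(".*")      # ** may span any number of path segments
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")   # * stays inside a single path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


def is_selected(path, patterns):
    """Return True when the path fully matches at least one pattern."""
    return any(glob_to_regex(p).fullmatch(path) for p in patterns)


patterns = ["/guide/**", "/blog/**"]
paths = ["/guide/features", "/blog/announcing-vite6", "/config/"]
print([p for p in paths if is_selected(p, patterns)])
# ['/guide/features', '/blog/announcing-vite6']
```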

Content Selector

SiteMCP uses Mozilla's Readability library to extract the meaningful content from web pages. This automatic extraction sometimes misses the content you want. In such cases, you can specify a CSS selector:

sitemcp https://vite.dev --content-selector ".content"

Caching Mechanism

SiteMCP caches fetched pages in ~/.cache/sitemcp by default, which speeds up subsequent runs. If you need fresh content each time, you can disable caching:

sitemcp https://example.com --no-cache
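
The idea behind the cache can be sketched as a read-through lookup. The Python example below is a generic illustration under stated assumptions: the source does not describe SiteMCP's on-disk layout, so the hashed file names, the cached_fetch helper, and its use_cache switch (standing in for --no-cache) are all hypothetical.

```python
import hashlib
from pathlib import Path


def cached_fetch(url, fetch, cache_dir, use_cache=True):
    """Return the page text for url, reusing a cached copy when allowed.

    fetch is any callable that takes a URL and returns its text.
    """
    # Assumption: one file per URL, named by a hash of the URL.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    entry = Path(cache_dir) / f"{key}.txt"
    if use_cache and entry.exists():
        return entry.read_text(encoding="utf-8")
    text = fetch(url)
    if use_cache:
        entry.parent.mkdir(parents=True, exist_ok=True)
        entry.write_text(text, encoding="utf-8")
    return text
```

With use_cache left on, a second call for the same URL never reaches fetch. With it off, every call fetches again and nothing is written to disk.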

Integrating SiteMCP with MCP Clients

SiteMCP is most useful when integrated with an MCP-compatible AI client. Let's explore how to set this up with Claude Desktop, a popular AI assistant:

Claude Desktop Configuration

To configure Claude Desktop to use your SiteMCP server, add the following to your configuration file (claude_desktop_config.json):

{
  "mcpServers": {
    "daisy-ui": {
      "command": "npx",
      "args": [
        "-y",
        "sitemcp",
        "https://daisyui.com",
        "-m",
        "/components/**"
      ]
    }
  }
}

This configuration tells Claude Desktop to set up an MCP server named "daisy-ui" that provides access to the DaisyUI components documentation. When you restart Claude Desktop, it will automatically launch the SiteMCP server when needed.
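
If you maintain several such entries, it can help to generate them rather than hand-edit JSON. The Python sketch below builds the same structure shown above. The helpers sitemcp_entry and add_server are hypothetical names for illustration, and the script only prints the result, leaving it to you to copy it into your configuration file.

```python
import json


def sitemcp_entry(url, match=None):
    """Build one mcpServers entry that launches SiteMCP through npx."""
    args = ["-y", "sitemcp", url]
    if match:
        args += ["-m", *match]
    return {"command": "npx", "args": args}


def add_server(config, name, url, match=None):
    """Return a copy of config with a SiteMCP server registered under name."""
    servers = dict(config.get("mcpServers", {}))
    servers[name] = sitemcp_entry(url, match)
    return {**config, "mcpServers": servers}


config = add_server({}, "daisy-ui", "https://daisyui.com", ["/components/**"])
print(json.dumps(config, indent=2))
```

Because add_server returns a new dictionary, any servers already present in the configuration you pass in are preserved.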

Practical Use Cases

Library Documentation Access

One of the most powerful uses of SiteMCP is providing AI assistants with access to library documentation:

{
  "mcpServers": {
    "svelte": {
      "command": "npx",
      "args": [
        "-y",
        "sitemcp@latest",
        "https://svelte.dev",
        "-m",
        "/docs/**"
      ]
    }
  }
}

This configuration enables your AI to reference the latest Svelte documentation, ensuring that code suggestions and explanations reflect current best practices rather than outdated information the AI might have learned during training.

Personal Website Integration

You can also make your personal website available to AIs:

{
  "mcpServers": {
    "my-blog": {
      "command": "npx",
      "args": [
        "-y",
        "sitemcp@latest",
        "https://yourblog.com"
      ]
    }
  }
}

This allows AIs to reference your writing style, past articles, or personal documentation, making their responses more tailored to your specific context.

Understanding How SiteMCP Works

SiteMCP exposes each site through two MCP tools:

  1. Index tool: Provides a list of available pages with their titles and URLs.
  2. Document tool: Retrieves the actual content of a specific page when requested.

This approach allows the AI to first understand what information is available and then selectively retrieve only what it needs, significantly reducing token usage compared to providing all information at once.

When a page is particularly long, SiteMCP implements pagination to ensure reliable access, as some AI models might struggle with extremely large documents.
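
The index-then-fetch flow and the pagination described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not SiteMCP's code: the in-memory SITE dictionary, the function names, and the character-based page_size are all hypothetical.

```python
import math

# Hypothetical stand-in for a crawled site: url -> (title, text).
SITE = {
    "https://example.com/guide": ("Guide", "g" * 250),
    "https://example.com/api": ("API", "a" * 40),
}


def index_of_site(site):
    """List every page's title and URL without returning any page content."""
    return [{"title": title, "url": url} for url, (title, _text) in site.items()]


def get_document(site, url, page=1, page_size=100):
    """Return one page-sized slice of a document plus the total page count."""
    title, text = site[url]
    total_pages = max(1, math.ceil(len(text) / page_size))
    if not 1 <= page <= total_pages:
        raise ValueError(f"page must be between 1 and {total_pages}")
    start = (page - 1) * page_size
    return {
        "title": title,
        "page": page,
        "total_pages": total_pages,
        "content": text[start:start + page_size],
    }
```

A client would call index_of_site once, pick the relevant URL, and then request get_document page by page until page equals total_pages, so only the needed text ever enters the model's context.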

Troubleshooting Common Issues

Long Tool Names

Some users have encountered issues with tool names exceeding the 64-character limit in certain MCP clients. Versions 0.3.0 and later address this issue, so if you run into it, update to the latest version.
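
If you want to check candidate names before wiring a server into a client, a small Python sketch like the following flags any that exceed the limit. The helper names_over_limit is hypothetical, and the 64-character figure is taken from the paragraph above. Other clients may enforce a different limit.

```python
MAX_TOOL_NAME_LENGTH = 64  # limit mentioned above; other clients may differ


def names_over_limit(names, limit=MAX_TOOL_NAME_LENGTH):
    """Return the tool names that are longer than the client's limit."""
    return [name for name in names if len(name) > limit]


candidates = ["getDocumentOfVitePluginFavicons", "getDocumentOf" + "X" * 60]
print(names_over_limit(candidates))
```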

Server Communication Errors

If you encounter JSON-RPC errors such as {"jsonrpc":"2.0","id":XX,"error":{"code":-32601,"message":"Method not found"}}, make sure you're using the latest version of SiteMCP, which includes fixes for compatibility with various MCP clients.
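
When reading client logs, it helps to tell this error apart from others. The Python sketch below parses a JSON-RPC response and singles out code -32601, which the JSON-RPC 2.0 specification reserves for "Method not found". The describe_response helper and the request id of 7 are illustrative assumptions.

```python
import json

METHOD_NOT_FOUND = -32601  # reserved by the JSON-RPC 2.0 specification


def describe_response(raw):
    """Summarise a JSON-RPC response string for a log or error message."""
    message = json.loads(raw)
    error = message.get("error")
    if error is None:
        return "ok"
    request_id = message.get("id")
    if error.get("code") == METHOD_NOT_FOUND:
        return f"request {request_id}: method not found, check client and SiteMCP versions"
    return f"request {request_id}: error {error.get('code')}: {error.get('message')}"


sample = '{"jsonrpc":"2.0","id":7,"error":{"code":-32601,"message":"Method not found"}}'
print(describe_response(sample))
# request 7: method not found, check client and SiteMCP versions
```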

Performance Considerations

For very large websites, consider using the -m (--match) flag to limit which pages are fetched:

sitemcp https://large-documentation-site.com -m "/get-started/**" "/api/**"

This can dramatically improve performance and reduce resource usage.

Advanced SiteMCP Applications

Creating Custom Knowledge Bases

Beyond existing websites, you can use SiteMCP to create custom knowledge bases by pointing it at locally served content:

# First serve your local documentation
npx serve ./my-docs

# Then in another terminal, create an MCP server from it
sitemcp http://localhost:3000

Combining Multiple Knowledge Sources

You can configure multiple SiteMCP servers in your MCP client to provide the AI with access to diverse information sources:

{
  "mcpServers": {
    "technical-docs": {
      "command": "npx",
      "args": ["-y", "sitemcp@latest", "https://docs.example.com"]
    },
    "company-blog": {
      "command": "npx",
      "args": ["-y", "sitemcp@latest", "https://blog.example.com"]
    }
  }
}

Conclusion

SiteMCP offers an elegant solution to one of the most common challenges in AI interactions: providing specific external knowledge to AI models. By transforming any website into an MCP server, it bridges the gap between web content and AI capabilities, enabling more informed, accurate, and contextually relevant AI responses.

Whether you're a developer looking to provide your AI with access to specific documentation, a content creator wanting your AI to reference your work, or simply an AI enthusiast seeking to expand what your assistant can do, SiteMCP offers a straightforward way to enhance AI-human collaboration.

As the AI landscape continues to evolve, tools like SiteMCP that move information between web resources and AI models will become increasingly valuable. Learning SiteMCP gives you a practical way to draw on the web's knowledge and make AI assistants more helpful, accurate, and contextually aware.
