How to Generate LLMs.txt Files Using Firecrawl MCP

Discover how Cline and Firecrawl MCP integrate to automate web scraping and generate LLM-ready text files for AI analysis and training.

Ashley Goolam

19 June 2025

Are you looking to streamline your workflow by integrating AI tools with web scraping capabilities? Cline, an AI assistant in VS Code, combined with Firecrawl MCP, offers a powerful solution for generating LLMs.txt files. In this tutorial, we'll explore how to use Cline with Firecrawl MCP to transform websites into LLM-ready text files.

💡
Want to connect your Cursor AI Coding Workflow with API Documentation? Apidog MCP server is here to help you gain the full-scale vibe coding experience! Simply feed your API specification directly into Cursor and watch the magic!

While working with AI IDEs such as Cursor, supercharge your API workflow with Apidog! This free, all-in-one platform lets you design, test, mock, and document APIs in a single interface. So why not try it out now? 👇👇

Introduction to Cline and Firecrawl MCP

Cline:

Cline is an AI assistant that leverages the Model Context Protocol (MCP) to extend its capabilities. It can create and manage custom tools, including MCP servers, directly within VS Code. Cline supports various AI models and APIs, allowing you to automate complex tasks like web scraping and data extraction.

Firecrawl MCP Server:

Firecrawl MCP Server is designed to enhance web scraping capabilities for LLM clients. It supports powerful JavaScript rendering, automatic retries, and efficient batch processing. This server is ideal for extracting structured information from web pages using LLMs.

How to Use Firecrawl to Scrape Web Data (Beginner’s Tutorial)
Unlock web data with Firecrawl—transform websites into structured data for AI applications.

Prerequisites

Setting Up Cline in VS Code

Install Cline Extension:

Open the VS Code Extensions Marketplace and search for "Cline". Click "Install" to add it to your VS Code environment.

add cline to vs code

Configure Cline:

Once installed, you can interact with Cline through the VS Code terminal or chat interface. You can ask Cline to perform tasks like creating new files or executing terminal commands.

Enable MCP Capabilities:

Cline can create and manage MCP servers. Ask Cline to "add a tool" related to Firecrawl MCP, and it will handle the setup process for you.

Setting Up Firecrawl MCP Server with Cline

Using Cline to set up and configure an MCP server like Firecrawl MCP is significantly easier than the manual configuration other AI tools require. Cline offers an MCP marketplace where you can browse a catalog of pre-configured MCP servers, making the process streamlined and user-friendly.

Step 1: Access Cline's MCP Marketplace

Open Cline in VS Code: Start by opening Cline within VS Code. You can interact with Cline through the terminal or chat interface.

Navigate to MCP Servers Marketplace: Go to the MCP Servers Marketplace within Cline. This section is similar to browsing extensions in VS Code, where you can search for and install MCP servers.

cline mcp market place

Step 2: Install Firecrawl MCP Server

Search for Firecrawl MCP: In the marketplace, search for "Firecrawl MCP" and click on it to install.

View Installed MCP Servers: After installation, navigate to the "Installed" section to see the Firecrawl MCP server listed.

add firecrawl mcp to cline

Step 3: Configure Firecrawl MCP Server

Obtain a Firecrawl API Key: To use Firecrawl, you need an API key. Visit the official Firecrawl website, create an account, and generate a free API key. Save this key securely.

Configure MCP Server: In Cline, click on "Configure MCP Servers." You will see a JSON file where you can add your Firecrawl API key.

configure cline mcp servers

You should see something like this (replace the placeholder with your actual Firecrawl API key, which starts with fc-):

{
  "mcpServers": {
    "github.com/mendableai/firecrawl-mcp-server": {
      "command": "cmd",
      "args": [
        "/c",
        "set FIRECRAWL_API_KEY=fc-YOUR_API_KEY && npx -y firecrawl-mcp"
      ],
      "env": {
        "FIRECRAWL_API_KEY": "fc-YOUR_API_KEY"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
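The `cmd /c` wrapper in the config above is Windows-specific. On macOS or Linux, a minimal equivalent entry (assuming Node.js and `npx` are installed) can launch the server directly and pass the key through `env` alone:

```json
{
  "mcpServers": {
    "github.com/mendableai/firecrawl-mcp-server": {
      "command": "npx",
      "args": ["-y", "firecrawl-mcp"],
      "env": {
        "FIRECRAWL_API_KEY": "fc-YOUR_API_KEY"
      },
      "disabled": false,
      "autoApprove": []
    }
  }
}
```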

Refresh and Verify: After adding the API key, refresh the MCP server. A green dot indicates the server is successfully configured and ready for use.

Step 4: Explore Firecrawl MCP Tools

View Available Tools: Click the dropdown button next to the Firecrawl MCP server to view all available tools and their details.

view firecrawl's mcp tools

Other Installed MCP Servers: Below the Firecrawl MCP server, you'll see other MCP servers you've installed from Cline's marketplace.

Managing API Providers in Cline

If you exhaust Cline's free usage, you can switch to a different API provider:

Change API Provider: Go to Cline's settings and change the API provider to "VS Code LM API." This lets you use the Claude 3.5 model integrated with VS Code's Copilot for free. It does come with monthly usage limits and won't always run smoothly, but for this tutorial, Cline's free tier should be more than enough to get you started.

change clines api provider

Install Copilot: To use the Claude 3.5 model with Cline, make sure Copilot is installed in VS Code. If it isn't, update VS Code or install Copilot from the Extensions Marketplace.

By leveraging Cline's MCP marketplace and streamlined configuration process, you can quickly set up and start using Firecrawl MCP Server without the hassle of manual setup and configuration.

Generating LLMs.txt Files with Cline and Firecrawl MCP

Ask Cline to Generate LLMs.txt: Interact with Cline in VS Code and ask it to generate LLMs.txt files using Firecrawl MCP. You can provide a URL and specify parameters like maxUrls and whether to generate llms-full.txt.

# Sample input
>> generate an llms.txt from firecrawl.dev --short version
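Under the hood, Cline maps a prompt like this to the Firecrawl MCP server's llms.txt generation tool. As a sketch, the resulting tool call looks roughly like the following (the tool and parameter names are taken from the Firecrawl MCP server's tool list; verify them against your installed version):

```json
{
  "tool": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://firecrawl.dev",
    "maxUrls": 10,
    "showFullText": false
  }
}
```

Setting "showFullText" to true additionally produces the llms-full.txt variant.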

Monitor Generation Status: Cline will execute the command to generate LLMs.txt files using the Firecrawl MCP Server. You can monitor the status of the generation process through Cline's output or by checking the Firecrawl MCP Server logs.

view the llms.txt file

Access Generated Files: Once the generation is complete, Cline will provide you with the generated llms.txt and optionally llms-full.txt files. These files are ready for use in training or analyzing LLMs.

Features and Benefits

Efficient Web Scraping: Firecrawl MCP Server offers powerful web scraping capabilities with JavaScript rendering support, ensuring that you can extract data from dynamic web pages efficiently.

Customizable: You can configure the server to handle batch processing with rate limiting, ensuring that your web scraping tasks are both efficient and compliant with website policies.

AI Integration: By integrating with Cline, you can automate the process of generating LLMs.txt files, making it easier to prepare data for AI models.

LLMs.txt file Use Cases

Data Analysis: Use the generated LLMs.txt files to analyze website content, extract key information, and train LLMs for specific tasks.

Research Automation: Automate data collection for research purposes by scraping content from multiple websites and generating LLM-ready text files.

Content Summarization: Leverage the concise summaries in llms.txt to quickly understand the content of websites without manually reviewing each page.

Best Practices while Working with Firecrawl MCP

To keep Firecrawl MCP running efficiently and returning reliable results when used with Cline, follow these best practices:

Always Validate URLs Before Processing:

Before sending URLs to Firecrawl MCP for scraping, validate that they are accessible and in the correct format. This prevents errors and wasted API calls.
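As a minimal sketch, a well-formedness check using Python's standard library can filter out malformed URLs before they reach Firecrawl (the `is_valid_url` helper is a hypothetical name, not part of Firecrawl or Cline):

```python
from urllib.parse import urlparse

def is_valid_url(url: str) -> bool:
    """Return True only for well-formed http(s) URLs."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

# Filter a batch of candidate URLs before sending them to Firecrawl MCP
candidates = ["https://firecrawl.dev", "not-a-url", "ftp://example.com"]
valid = [u for u in candidates if is_valid_url(u)]
```

A format check like this catches typos cheaply; actually confirming a URL is reachable still requires an HTTP request.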

Use Rate Limiting to Avoid Server Overload:

Implement rate limiting in your Cline configuration or directly within the Firecrawl MCP settings. This ensures that you don't overload the target websites or exceed the API limits, leading to blocking or service disruptions.

Firecrawl MCP supports customizable rate limiting to handle batch processing efficiently.
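As a sketch of what client-side throttling can look like (the `RateLimiter` class below is illustrative, not Firecrawl's built-in mechanism):

```python
import time

class RateLimiter:
    """Allow at most `max_calls` calls per `period` seconds."""
    def __init__(self, max_calls: int, period: float):
        self.max_calls = max_calls
        self.period = period
        self.calls: list[float] = []  # timestamps of recent calls

    def acquire(self) -> None:
        now = time.monotonic()
        # Drop timestamps that have left the sliding window
        self.calls = [t for t in self.calls if now - t < self.period]
        if len(self.calls) >= self.max_calls:
            # Wait until the oldest call in the window expires
            time.sleep(self.period - (now - self.calls[0]))
        self.calls.append(time.monotonic())

# Example: at most 2 scrape requests per second
limiter = RateLimiter(max_calls=2, period=1.0)
for url in ["https://a.example", "https://b.example", "https://c.example"]:
    limiter.acquire()
    # ... send the scrape request for `url` here ...
```

The same idea applies whether you throttle in your own scripts or rely on Firecrawl MCP's built-in rate-limit handling.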

Regularly Backup Generated Files:

Create a backup strategy for your generated LLMs.txt files. This protects your data against accidental loss or corruption. Store backups in a secure and accessible location.
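A simple timestamped backup can be scripted in a few lines of Python (the paths and helper name here are illustrative):

```python
import shutil
import time
from pathlib import Path

def backup_file(path: str, backup_dir: str = "backups") -> Path:
    """Copy a generated file into backup_dir with a timestamp suffix."""
    src = Path(path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}-{stamp}{src.suffix}"
    shutil.copy2(src, dest)  # copy2 preserves file metadata
    return dest

# Example: back up a freshly generated llms.txt
# backup_file("llms.txt")
```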

Monitor API Usage and Limits:

Regularly monitor your Firecrawl API usage to stay within the free tier or paid limits. Set up alerts to notify you when you're approaching the limits to avoid unexpected charges or service interruptions.
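Firecrawl's dashboard shows your credit usage; if you also track usage in your own scripts, a simple threshold check (a hypothetical helper, not part of the Firecrawl API) can act as an early warning:

```python
def usage_alert(used: int, limit: int, threshold: float = 0.8) -> bool:
    """Return True once usage crosses the alert threshold (default 80% of quota)."""
    return used / limit >= threshold

# Example: 450 of 500 monthly credits used -> 90%, time to act
if usage_alert(450, 500):
    print("Warning: approaching your Firecrawl API limit")
```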

Conclusion

Combining Cline with Firecrawl MCP offers a streamlined workflow for generating LLMs.txt files. This integration allows you to automate web scraping tasks, prepare data for AI models, and enhance your productivity in data analysis and research. Whether you're working on content summarization, data extraction, or AI model training, this setup provides the tools you need to succeed.

And while you’re at it, don’t forget to check out Apidog—the ultimate platform for API development that’s making waves as a better alternative to Postman.

button

