How to Deploy MCP Servers on AWS Lambda

INEZA FELIN-MICHEL

Updated on April 25, 2025

The Model Context Protocol (MCP) is rapidly emerging as a standard way to empower Large Language Models (LLMs) like Claude, ChatGPT, and others to interact with the outside world. By defining tools in a structured way, MCP allows LLMs to request actions, fetch real-time data, or interact with external APIs, moving beyond their static training data.

However, deploying these MCP "servers" (which provide the tools) often presents a challenge, especially in modern cloud environments. Many initial MCP implementations were designed for local execution, communicating over standard input/output (stdio), or using protocols like Server-Sent Events (SSE) for streaming. While functional, these approaches often rely on persistent connections and stateful behavior, making them awkward fits for scalable, stateless, event-driven platforms like AWS Lambda.

AWS Lambda offers tremendous benefits: automatic scaling, pay-per-use cost efficiency, and zero server management overhead. How can we bridge the gap and run robust, production-ready MCP servers in this serverless environment?

Enter MCPEngine, an open-source Python implementation of MCP specifically designed to address these challenges. MCPEngine supports streamable HTTP alongside SSE, making it fully compatible with AWS Lambda's request/response model. It also bundles essential features for production deployments, including built-in authentication support and streamlined packaging.

This article explores how to leverage MCPEngine to build and deploy MCP servers on AWS Lambda, covering stateless tools, state management, and authentication.

💡
Want to end hallucinations in Cursor?

Apidog MCP Server allows Cursor to directly read API docs, whether published online documentation or local OpenAPI files.

By making your API design the source of truth for the AI, the Apidog MCP Server facilitates tasks like code generation based on schemas, intelligent searching through endpoints, and ensuring code modifications align perfectly with the API contract, ultimately streamlining the development workflow.
Apidog MCP Server - Apidog Docs

Core Concepts: MCPEngine and Lambda

Before diving into deployment, let's understand the key MCPEngine components for Lambda integration:

  1. MCPEngine: The central class orchestrating your tools and handling MCP communication.
  2. @engine.tool() Decorator: Registers a Python function as an MCP tool. The function name becomes the tool's name, and its docstring serves as the description provided to the LLM.
  3. engine.get_lambda_handler(): This method generates an AWS Lambda-compatible handler function. You expose this handler, and MCPEngine takes care of translating Lambda's event payload into MCP requests and formatting the responses.

Building a Simple Stateless Tool

Let's start with the basics: a stateless tool deployed on Lambda. This example provides a simple greeting tool.

Prerequisites:

  • Python 3.8+
  • An AWS account with permissions to manage Lambda, ECR, and IAM.
  • Docker installed locally.
  • AWS CLI configured.

1. Install MCPEngine:

pip install "mcpengine[cli,lambda]"

2. Create the Application (app.py):

# app.py
from mcpengine import MCPEngine, Context

# Initialize the engine
engine = MCPEngine()

@engine.tool()
def personalized_greeting(name: str) -> str:
    """
    Generates a friendly greeting for the specified name.
    Use this tool when asked to greet someone.
    """
    # Simple stateless logic
    return f"Hello, {name}! Welcome to the serverless MCP world."

# Get the Lambda handler function
handler = engine.get_lambda_handler()

This code defines a single tool, personalized_greeting, which takes a name and returns a string. The handler variable is what AWS Lambda will invoke.

Deployment Workflow: Code to Cloud

Deploying an MCPEngine application to Lambda involves containerizing it with Docker, pushing it to Amazon Elastic Container Registry (ECR), and configuring the Lambda function.

1. Dockerize the Application (Dockerfile):

# Use the official AWS Lambda Python base image
FROM public.ecr.aws/lambda/python:3.12

# Set the working directory in the container
WORKDIR ${LAMBDA_TASK_ROOT}

# Copy requirements first to leverage Docker cache
COPY requirements.txt .
# Install dependencies (assuming mcpengine is listed in requirements.txt)
# Or install directly: RUN pip install --no-cache-dir mcpengine[cli,lambda]
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code
COPY app.py .

# Set the command to run the handler function (app.handler means handler in app.py)
CMD ["app.handler"]

(Ensure you have a requirements.txt file listing mcpengine[cli,lambda] or modify the RUN command accordingly).
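
For reference, a minimal requirements.txt for this example needs only a single line:

mcpengine[cli,lambda]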

2. Build and Push the Docker Image to ECR:

First, create an ECR repository (replace <region> and <repo-name>):

aws ecr create-repository --repository-name <repo-name> --region <region>

Note your AWS Account ID and the repository URI from the output (<account-id>.dkr.ecr.<region>.amazonaws.com/<repo-name>).

Now, build, tag, and push the image:

# Authenticate Docker with ECR
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com

# Build the image (use --platform for cross-architecture builds if needed)
docker build --platform=linux/amd64 -t <repo-name>:latest .

# Tag the image for ECR
docker tag <repo-name>:latest <account-id>.dkr.ecr.<region>.amazonaws.com/<repo-name>:latest

# Push the image to ECR
docker push <account-id>.dkr.ecr.<region>.amazonaws.com/<repo-name>:latest
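
Optionally, you can smoke-test the image before pushing. The AWS Lambda Python base image includes the Runtime Interface Emulator, which lets you invoke the handler locally over HTTP. The Function URL-style event below is an assumption about the payload shape MCPEngine's handler expects, so treat it as a sketch:

# Run the container locally; the base image starts the Runtime Interface Emulator
docker run -p 9000:8080 <repo-name>:latest

# In another terminal, invoke the handler with a minimal Function URL-style event (hypothetical shape)
curl -X POST "http://localhost:9000/2015-03-31/functions/function/invocations" \
  -d '{"requestContext": {"http": {"method": "POST", "path": "/"}}, "headers": {"content-type": "application/json"}, "body": "{\"jsonrpc\": \"2.0\", \"id\": 1, \"method\": \"tools/list\"}"}'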

3. Create and Configure the Lambda Function:

You'll need an IAM execution role for Lambda first. If you don't have one, create a basic one:

# (Simplified - adjust trust policy and permissions as needed)
aws iam create-role --role-name lambda-mcp-role --assume-role-policy-document '{"Version": "2012-10-17","Statement": [{"Effect": "Allow","Principal": {"Service": "lambda.amazonaws.com"},"Action": "sts:AssumeRole"}]}'
aws iam attach-role-policy --role-name lambda-mcp-role --policy-arn arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole

Now, create the Lambda function using the ECR image (replace placeholders):

aws lambda create-function \
  --function-name mcp-greeter-function \
  --package-type Image \
  --code ImageUri=<account-id>.dkr.ecr.<region>.amazonaws.com/<repo-name>:latest \
  --role arn:aws:iam::<account-id>:role/lambda-mcp-role \
  --timeout 30 \
  --memory-size 512 \
  --region <region>

4. Expose via Function URL:

To make the Lambda callable over HTTP without API Gateway, create a Function URL:

aws lambda create-function-url-config \
  --function-name mcp-greeter-function \
  --auth-type NONE \
  --region <region>

# Add permission for public access (adjust if auth is needed)
aws lambda add-permission \
  --function-name mcp-greeter-function \
  --statement-id FunctionURLAllowPublicAccess \
  --action lambda:InvokeFunctionUrl \
  --principal '*' \
  --function-url-auth-type NONE \
  --region <region>

Note the Function URL returned by the create-function-url-config command. Your stateless MCP server is now live!
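
As a quick sanity check, you can POST an MCP request directly to the Function URL with curl. This sketch assumes MCPEngine serves standard MCP JSON-RPC over HTTP at the root path:

curl -X POST <your-function-url> \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'

If everything is wired up, the response should include the personalized_greeting tool.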

Managing State with lifespan Context

Lambda is stateless, but many tools need access to databases, connection pools, or other resources initialized at startup. MCPEngine addresses this with the lifespan argument, which accepts an async context manager.

The lifespan function runs its setup code (before yield) when the Lambda container starts and its teardown code (after yield) when the container shuts down. The value yielded becomes available in your tool functions via the ctx (Context) object.

Let's build a simple event logger that stores events in an RDS Postgres database.

1. Modify app.py:

# app.py (Stateful Example)
import os
import psycopg2
from contextlib import asynccontextmanager
from mcpengine import MCPEngine, Context

# Assume DB connection details are in environment variables
DB_HOST = os.environ.get("DB_HOST")
DB_USER = os.environ.get("DB_USER")
DB_PASS = os.environ.get("DB_PASS")
DB_NAME = os.environ.get("DB_NAME")

@asynccontextmanager
async def db_connection_manager():
    """Manages the database connection pool."""
    conn = None
    try:
        print("Establishing DB connection...")
        conn = psycopg2.connect(
            host=DB_HOST,
            user=DB_USER,
            password=DB_PASS,
            dbname=DB_NAME
        )
        # Create table if it doesn't exist (simple example)
        with conn.cursor() as cur:
             cur.execute("""
                CREATE TABLE IF NOT EXISTS events (
                    id SERIAL PRIMARY KEY,
                    event_name TEXT NOT NULL,
                    timestamp TIMESTAMP DEFAULT now()
                );
             """)
        conn.commit()
        print("DB connection ready.")
        yield {"db_conn": conn} # Make connection available via ctx.db_conn
    finally:
        if conn:
            print("Closing DB connection.")
            conn.close()

# Initialize engine with the lifespan manager
engine = MCPEngine(lifespan=db_connection_manager)

@engine.tool()
def log_event(event_name: str, ctx: Context) -> str:
    """Logs an event with the given name to the database."""
    try:
        with ctx.db_conn.cursor() as cur:
            cur.execute("INSERT INTO events (event_name) VALUES (%s)", (event_name,))
        ctx.db_conn.commit()
        return f"Event '{event_name}' logged successfully."
    except Exception as e:
        # Basic error handling
        ctx.db_conn.rollback()
        return f"Error logging event: {e}"

@engine.tool()
def get_latest_events(ctx: Context, limit: int = 5) -> list[str]:
    """Retrieves the latest logged events from the database."""
    try:
        with ctx.db_conn.cursor() as cur:
            cur.execute("SELECT event_name, timestamp FROM events ORDER BY timestamp DESC LIMIT %s", (limit,))
            events = [f"[{row[1].strftime('%Y-%m-%d %H:%M:%S')}] {row[0]}" for row in cur.fetchall()]
            return events
    except Exception as e:
        return [f"Error retrieving events: {e}"]


# Get the Lambda handler
handler = engine.get_lambda_handler()

2. Deployment Considerations:

  • Database: You need an accessible RDS instance (or other database).
  • Networking: Configure the Lambda function's VPC settings to allow access to the RDS instance (Security Groups, Subnets).
  • Environment Variables: Pass DB_HOST, DB_USER, DB_PASS, DB_NAME as environment variables to the Lambda function.
  • IAM: The Lambda execution role might need additional permissions if accessing other AWS services (e.g., Secrets Manager for DB credentials).

Update the Dockerfile if needed (e.g., to install psycopg2-binary), rebuild/push the image, and update the Lambda function's code and configuration (environment variables, VPC settings).
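
For example, the environment variables and VPC settings can be applied with the AWS CLI (placeholder values shown; in production, prefer pulling credentials from Secrets Manager):

aws lambda update-function-configuration \
  --function-name <function-name> \
  --environment "Variables={DB_HOST=<db-host>,DB_USER=<db-user>,DB_PASS=<db-pass>,DB_NAME=<db-name>}" \
  --vpc-config SubnetIds=<subnet-id-1>,<subnet-id-2>,SecurityGroupIds=<sg-id> \
  --region <region>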

Securing Tools with Authentication

Production tools need authentication. MCPEngine integrates with OpenID Connect (OIDC) providers like Google, AWS Cognito, Auth0, etc.

1. Configure OIDC Provider:
Set up an OAuth client ID with your chosen provider (e.g., Google Cloud Console). You'll need the Client ID and potentially the Client Secret (depending on the flow).

2. Update app.py for Authentication:

# app.py (Authenticated Example - Snippets)
import os
# ... other imports ...
from mcpengine import MCPEngine, Context, GoogleIdpConfig # Or other IdpConfig

# ... db_connection_manager ...

# Configure IDP - using Google as an example
# Assumes GOOGLE_CLIENT_ID is set as an environment variable
google_config = GoogleIdpConfig(
    client_id=os.environ.get("GOOGLE_CLIENT_ID"),
    # issuer can often be inferred, or set explicitly
)

# Initialize engine with lifespan and IDP config
engine = MCPEngine(
    lifespan=db_connection_manager,
    idp_config=google_config
)

# Secure the log_event tool
@engine.auth() # Add this decorator
@engine.tool()
def log_event(event_name: str, ctx: Context) -> str:
    """Logs an event with the given name to the database. Requires authentication."""
    # Access authenticated user info if needed: user_email = ctx.user.email
    user_email = ctx.user.email if ctx.user else "unknown"
    print(f"Authenticated user: {user_email}")
    try:
        # ... (database logic remains the same) ...
         return f"Event '{event_name}' logged successfully by {user_email}."
    except Exception as e:
        # ... error handling ...
        return f"Error logging event for {user_email}: {e}"

# get_latest_events can remain unauthenticated or be secured too
@engine.tool()
def get_latest_events(ctx: Context, limit: int = 5) -> list[str]:
    ...  # (logic remains the same as the previous example)


# Get the Lambda handler
handler = engine.get_lambda_handler()

Key Changes:

  • Imported GoogleIdpConfig (or the appropriate one for your provider).
  • Instantiated MCPEngine with the idp_config argument.
  • Added the @engine.auth() decorator above @engine.tool() for the function(s) requiring authentication. MCPEngine automatically rejects requests that lack a valid JWT verified against the IdP's public keys (see the sketch after this list).
  • Authenticated user information (from the JWT claims) is available via ctx.user.
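
In practice, this means every call to a protected tool must carry a bearer token. Below is a hedged sketch of an authenticated tools/call request with curl, assuming MCPEngine reads the standard Authorization header; <jwt> stands for a token issued by your OIDC provider:

curl -X POST <your-function-url> \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <jwt>" \
  -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "log_event", "arguments": {"event_name": "deployment-test"}}}'

A request without a valid token should be rejected before the tool body runs.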

3. Deployment:

  • Pass the necessary environment variables for authentication (e.g., GOOGLE_CLIENT_ID) to your Lambda function.
  • Rebuild/push the image and update the Lambda function.

Connecting an LLM Client

Once your MCP server is deployed on Lambda with a Function URL, you can connect compatible clients. Using mcpengine proxy is a convenient way to bridge clients like Claude:

mcpengine proxy <your-chosen-service-name> <your-lambda-function-url> --mode http --claude

If using authentication:

mcpengine proxy <your-chosen-service-name> <your-lambda-function-url> \
  --mode http \
  --claude \
  --client-id <your-google-client-id> \
  --client-secret <your-google-client-secret> # Needed for token acquisition flow

This command runs a local proxy that Claude connects to. The proxy then forwards requests over HTTP to your Lambda Function URL, handling the authentication flow if configured. The LLM can now discover and invoke your serverless tools.

Conclusion

Deploying MCP servers on AWS Lambda unlocks incredible scalability and operational efficiency for extending LLM capabilities. Traditional MCP implementations often struggle in stateless environments, but MCPEngine provides a robust, open-source solution. By supporting streamable HTTP, offering context management via lifespan, and integrating seamlessly with OIDC for authentication, MCPEngine makes serverless MCP not just possible, but practical for production use cases. Whether building simple stateless tools or complex, stateful, authenticated applications, MCPEngine combined with AWS Lambda offers a powerful platform for the next generation of AI-powered interactions.

💡
Want a great API Testing tool that generates beautiful API Documentation?

Want an integrated, All-in-One platform for your Developer Team to work together with maximum productivity?

Apidog meets all these demands and replaces Postman at a much more affordable price!
