As a developer, you're looking for powerful memory solutions that also guarantee data privacy, control, and smooth interoperability between AI tools. The trend clearly favors local-first, developer-focused memory systems. The OpenMemory MCP Server concept, brought to life by technologies like the open-source Mem0 framework, leads this change. It offers a private, local memory layer, letting AI agents remember context across applications while your data stays secure on your machine. This guide is a full walkthrough for developers: installing the essential components using Mem0, running your own local OpenMemory MCP Server, and using its REST API to create smarter, context-aware AI applications.
Let's start unlocking persistent, local AI memory.
Step 1: Preparing Your Development Environment
Before diving into the installation, it's essential to have your basic development tools ready and understand the core components we'll be working with. We'll primarily use the versatile Mem0 open-source framework to establish and then interact with our local OpenMemory MCP Server. This interaction will largely happen via its REST API, providing a standardized way to manage your AI's memory.

You'll need a recent Python environment, preferably version 3.8 or newer, to ensure full compatibility with Mem0 and its dependencies. You can download the latest Python from python.org. Along with Python, you'll need pip, the Python package installer, which is crucial for installing Mem0 and typically comes bundled with Python installations. The mem0ai Python library will be our engine; its open-source nature allows you to run a local server that effectively becomes your OpenMemory MCP Server. This entire setup prioritizes running the memory server on your local machine, giving you full control over your data and operations—a key principle of the OpenMemory philosophy. The server then exposes its functionalities through a REST API, with endpoints for adding, searching, and managing memories that align with the standard operations defined by the Model Context Protocol (MCP).
Pro Tip: Using virtual environments (like venv or conda) is a smart move. It helps manage project dependencies and prevents clashes with other Python projects.
Step 2: How to Install the Mem0 Framework
Installing the mem0ai package is your first practical step towards having a local AI memory server. This is done with a simple command that equips your system with the core Mem0 library and all the necessary components.
Open your terminal (and activate your virtual environment if you use one). Then, run the installation command:
pip install mem0ai
This command installs the mem0ai library, which includes the core Memory class for direct memory management in Python. You also get support for various Large Language Models (LLMs) and vector databases, plus the FastAPI and Uvicorn components needed to run the Mem0 REST API server. After the installation completes, you can verify it by checking the installed package version with pip show mem0ai, or by trying a quick import in a Python session:
import mem0
print(mem0.__version__)
pip also automatically handles installing required dependencies such as FastAPI (for building the API), Uvicorn (as the ASGI server), and various data handling libraries. With mem0ai installed, you have the foundational software. You can use the library directly in Python scripts for memory operations, but for the "server" aspect that MCP clients can interact with, we'll use its built-in server functionality.
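If you want to try memory operations in-process before running the server, a minimal sketch like the following works, based on the usage patterns in Mem0's documentation. Exact method signatures may vary between versions, and the default configuration typically expects an OPENAI_API_KEY in your environment:
from mem0 import Memory

# Uses Mem0's defaults; typically requires OPENAI_API_KEY to be set
m = Memory()

# Store a memory for a specific user
m.add("Alice prefers Python for web development.", user_id="user_alice")

# Retrieve memories semantically related to a query
results = m.search(query="What language does Alice like?", user_id="user_alice")
print(results)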
Step 3: How to Configure Your Local Memory Server
Before starting the server, you have the option to configure its underlying Mem0 instance. This configuration is key because it determines which Large Language Model (LLM), vector database, and embedding model will power your server's memory processing and retrieval capabilities—essentially, you're customizing its "brain."
The Mem0 server (and the Memory class it's built upon) is primarily configured using environment variables. This is a flexible approach that lets you set up your backend without modifying any code. For example, to configure the LLM, you might set OPENAI_API_KEY for OpenAI, ANTHROPIC_API_KEY for Anthropic, or variables like MEM0_LLM_PROVIDER="ollama" for local models via Ollama. Similarly, for the vector database, while Mem0 defaults to LanceDB (which is local and requires no extra setup), you can specify alternatives like Qdrant using variables such as MEM0_VECTOR_STORE_PROVIDER="qdrant". Embedding model configurations are often tied to your LLM provider but can also be specified independently.
If you don't set any specific environment variables, Mem0 attempts to use sensible defaults. However, it might require an OpenAI key for its default LLM if not otherwise specified. These choices significantly impact your server: the LLM affects how memories are understood, while the vector database influences storage efficiency and search speed. For a comprehensive list of all configurable environment variables and options for different LLMs, vector stores, and embedders, always consult the official Mem0 documentation.
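As a concrete illustration, you can set these variables from a launching process so the server inherits them. The sketch below does this from Python; the variable names follow the examples above and may differ in your Mem0 version, so treat them as placeholders and verify against the official documentation:
import os
import subprocess

# Illustrative variable names taken from the examples above; verify against Mem0's docs
os.environ["MEM0_LLM_PROVIDER"] = "ollama"           # local LLM via Ollama
os.environ["MEM0_VECTOR_STORE_PROVIDER"] = "qdrant"  # or omit to keep the LanceDB default

# Launch the server; the child process inherits the environment variables set above
subprocess.run(["mem0-server", "--host", "127.0.0.1", "--port", "8000"])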
Pro Tip: Begin with a simple local setup (like Ollama for LLM and the default LanceDB). This helps you get comfortable with the system. Then, explore more advanced setups as your needs evolve.
Step 4: How to Run Your OpenMemory MCP Server
With mem0ai installed and (optionally) configured to your liking, you can now start the local server process. This server is what exposes the REST API endpoints, allowing other applications or scripts to perform memory operations.
The mem0ai package offers a handy command-line script to run the server. In your terminal, simply type:
mem0-server
You can also specify a custom host and port if needed, for example: mem0-server --host 127.0.0.1 --port 8000. The default is typically 0.0.0.0 (all available network interfaces) on port 8000. Once started, Uvicorn (the server running your FastAPI application) will output logs to the console, indicating that the server is running and listening for requests. You'll see messages like: INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit).
Your OpenMemory MCP Server is now running locally! MCP-compatible clients, if configured to point to http://localhost:8000 (or your custom address), can interact with it. You can also directly use its REST API, which we'll explore next. To stop the server, simply press CTRL+C in the terminal where it's running. The mem0-server command essentially launches a Uvicorn ASGI server, which runs the FastAPI application defined within Mem0. This application hosts the REST API endpoints, and this running instance is your local OpenMemory MCP Server.
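A quick way to confirm the server is reachable from code is to request the interactive docs page mentioned in the next step. This is a simple sanity check, assuming the default address and that the /docs page is enabled:
import requests

# Ping the FastAPI docs page as a basic liveness check
resp = requests.get("http://localhost:8000/docs", timeout=5)
print("Server is up" if resp.status_code == 200 else f"Unexpected status: {resp.status_code}")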
Step 5: Understanding the Server's API Capabilities
The Mem0 server provides a comprehensive REST API that allows you to programmatically manage AI memories. This API adheres to the core principles of the Model Context Protocol by offering standardized ways to add, search, list, and delete information.
The API uses standard HTTP/HTTPS protocols, and JSON is the data format for both request and response bodies. If you're running the server locally with default settings, your base URL will be http://localhost:8000. The core operations available through the API (based on typical Mem0 REST API implementations) include:
POST /add: store new information
POST /search: retrieve relevant memories
GET /get_all: list multiple memories (with filter options)
GET /get: retrieve a single memory by its ID
PUT /update: modify an existing memory
DELETE /delete: remove a specific memory
DELETE /delete_all: clear memories, potentially filtered by user
By default, the locally run mem0-server does not implement authentication, as it's primarily intended for local development and use. If you were to consider production or exposed deployments (which is not recommended without significant additional security layers), authentication would be a critical aspect to implement.
Pro Tip: You can often explore the API interactively. Try visiting http://localhost:8000/docs in your browser once the server is up. Mem0's FastAPI server usually offers this helpful documentation page through Swagger UI.
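To avoid repeating boilerplate in the steps that follow, you could wrap these endpoints in a small helper class. The sketch below is built only from the endpoints listed above and assumes the default base URL; it is a convenience wrapper, not an official Mem0 client:
import requests

class MemoryClient:
    """Thin convenience wrapper around the local server's REST endpoints."""

    def __init__(self, base_url="http://localhost:8000"):
        self.base_url = base_url

    def add(self, data, user_id, metadata=None):
        payload = {"data": data, "user_id": user_id, "metadata": metadata or {}}
        return requests.post(f"{self.base_url}/add", json=payload).json()

    def search(self, query, user_id, limit=5):
        payload = {"query": query, "user_id": user_id, "limit": limit}
        return requests.post(f"{self.base_url}/search", json=payload).json()

    def get_all(self, user_id, limit=10):
        return requests.get(f"{self.base_url}/get_all",
                            params={"user_id": user_id, "limit": limit}).json()

    def delete(self, memory_id):
        return requests.delete(f"{self.base_url}/delete",
                               params={"memory_id": memory_id}).json()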
Step 6: How to Add Memories Using the API
The /add endpoint is your primary tool for ingesting data into the memory server. It allows you to store textual information, associate it with a specific user, and include optional metadata for better organization and retrieval.
To use it, you send a POST request to the /add endpoint. The request body should be JSON and include data (the string content of the memory), user_id (an identifier for the user associated with this memory, crucial for organizing and retrieving user-specific memories), and optionally metadata (a JSON object containing any additional key-value pairs you want to store, like {"source": "chat_log"}).
Here's an example using curl:
curl -X POST http://localhost:8000/add \
  -H "Content-Type: application/json" \
  -d '{
        "data": "Alice prefers Python for web development.",
        "user_id": "user_alice",
        "metadata": {"topic": "preferences", "language": "Python"}
      }'
And here's how you might do it with the Python requests library:
import requests

url = "http://localhost:8000/add"
payload = {
    "data": "Alice prefers Python for web development.",
    "user_id": "user_alice",
    "metadata": {"topic": "preferences", "language": "Python"}
}

# requests serializes the payload and sets the Content-Type header for us
response = requests.post(url, json=payload)
print(response.json())
The server will respond with JSON confirming the addition, often including the unique ID of the newly created memory, for instance: {"message": "Memory added successfully", "memory_id": "some-unique-id"}. Each call to /add creates a new memory record. The user_id is vital for multi-user applications or segregating context, and metadata significantly enhances searchability and organization.
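Since later steps reference memories by ID, it's worth capturing the ID from the response right away. The field name memory_id below matches the example response shown above, though the exact response shape may vary by version:
import requests

resp = requests.post("http://localhost:8000/add", json={
    "data": "Alice is also learning Rust.",
    "user_id": "user_alice",
})

# Keep the returned ID so this memory can be fetched, updated, or deleted later
memory_id = resp.json().get("memory_id")
print(memory_id)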
Step 7: How to Search for Stored Memories via API
Once memories are stored, the /search endpoint enables you to perform semantic searches. This means you can retrieve information relevant to a given query and user context, even if your search terms don't exactly match the stored text.
You send a POST request to /search with a JSON body containing the query string, the user_id whose memories you want to search, and optional parameters like filters (for more specific criteria based on metadata fields; the exact structure depends on Mem0's implementation) and limit (to cap the number of search results).
A curl example looks like this:
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{
        "query": "Which programming languages does Alice prefer?",
        "user_id": "user_alice",
        "limit": 3
      }'
Using Python requests:
import requests

url = "http://localhost:8000/search"
payload = {
    "query": "Which programming languages does Alice prefer?",
    "user_id": "user_alice",
    "limit": 3
}

response = requests.post(url, json=payload)
print(response.json())
The expected response is a JSON array of memory objects that match the search query. These objects typically include the memory content, its metadata, and a relevance score indicating how closely it matched the query, for example: [{"id": "some-unique-id", "data": "Alice prefers Python for web development.", ..., "score": 0.85}].
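A common pattern is to fold the top search results into an LLM prompt as context. The sketch below assumes the response fields shown in the example above (data and score); adjust the keys to whatever your server version actually returns:
import requests

# Fetch matches and fold them into an LLM prompt as context
results = requests.post("http://localhost:8000/search", json={
    "query": "Which programming languages does Alice prefer?",
    "user_id": "user_alice",
    "limit": 3,
}).json()

context = "\n".join(f"- {m['data']} (score: {m['score']:.2f})" for m in results)
prompt = f"Answer using these known facts about the user:\n{context}\n\nQuestion: Which web framework suits Alice?"
print(prompt)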
Pro Tip: Using user_id correctly in searches is essential. It ensures that users only retrieve their own relevant memories, which is vital for maintaining privacy and delivering personalized experiences.
Step 8: How to Retrieve Specific Memories with the API
Beyond searching, the API provides endpoints to list multiple memories or fetch a single, specific memory by its unique identifier. These are useful for browsing the memory bank or accessing known pieces of information directly.
To list multiple memories, you use the GET /get_all endpoint. You can pass query parameters like user_id to filter memories by user, limit to specify the number of memories to return, and page for pagination if there are many memories. For instance, curl -X GET "http://localhost:8000/get_all?user_id=user_alice&limit=5" would retrieve the first 5 memories for "user_alice". The response will be a JSON array of memory objects.
To get a specific memory, use GET /get with a memory_id query parameter: curl -X GET "http://localhost:8000/get?memory_id=some-unique-id". This will return a JSON object representing that single memory, or an error if it's not found.
Here's how to use Python requests for /get_all:
import requests

url = "http://localhost:8000/get_all"
params = {"user_id": "user_alice", "limit": 5}

response = requests.get(url, params=params)
print(response.json())
The /get_all endpoint is particularly good for administrative tasks or when you need to browse through a user's memory. /get is your go-to when you already have a memory's ID, perhaps from an /add response or a previous search result.
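For completeness, here is the same pattern for /get in Python; the memory_id value is a placeholder for an ID returned by a previous /add or /search call:
import requests

url = "http://localhost:8000/get"
params = {"memory_id": "some-unique-id"}  # substitute a real ID from an earlier response

response = requests.get(url, params=params)
print(response.json())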
Step 9: How to Update and Delete Memories Using the API
To ensure your AI's knowledge base remains accurate and relevant, the API provides the necessary controls to update existing memory records or remove them entirely.
To update a memory, send a PUT request to /update with the memory_id as a query parameter. The request body should be JSON and can include new data (text content) or new metadata (which will replace the existing metadata for that memory). For example:
curl -X PUT "http://localhost:8000/update?memory_id=some-unique-id" \
  -H "Content-Type: application/json" \
  -d '{"data": "Alice now also enjoys Go for systems programming."}'
To delete a specific memory, use a DELETE request to /delete with the memory_id as a query parameter: curl -X DELETE "http://localhost:8000/delete?memory_id=some-unique-id".
For broader cleanup, the DELETE /delete_all endpoint can be used. If you provide a user_id query parameter (e.g., curl -X DELETE "http://localhost:8000/delete_all?user_id=user_alice"), it will delete all memories for that specific user. If the user_id is omitted, the behavior might be to delete all memories in the system, so use this with extreme caution and always verify its exact behavior in the Mem0 documentation. The server will typically respond with JSON messages confirming the success or failure of these operations.
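The same update and delete operations from Python, mirroring the curl calls above (the memory_id is again a placeholder):
import requests

base = "http://localhost:8000"
memory_id = "some-unique-id"  # substitute a real ID

# Update the memory's text content
resp = requests.put(f"{base}/update", params={"memory_id": memory_id},
                    json={"data": "Alice now also enjoys Go for systems programming."})
print(resp.json())

# Delete that memory entirely
resp = requests.delete(f"{base}/delete", params={"memory_id": memory_id})
print(resp.json())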
Pro Tip: Always implement careful checks and user confirmations in your application logic before calling deletion endpoints, especially delete_all, to prevent accidental data loss.
Conclusion
You've now walked through the essential steps of setting up a local OpenMemory MCP Server with the Mem0 open-source framework and using its REST API. By installing Mem0, running the server, and interacting with its API, you gain direct control over your AI applications' memory, enabling new levels of persistence, personalization, and privacy. Programmatically managing memories locally opens vast possibilities for smarter, context-aware systems, from chatbots that recall user preferences to developer assistants that remember project details. Experiment with the API, and start building AI applications that truly learn and evolve.