As a developer, you're looking for powerful memory solutions that also guarantee data privacy, control, and smooth interoperability between AI tools. The trend clearly favors local-first, developer-focused memory systems. The OpenMemory MCP Server concept, brought to life by technologies like the open-source Mem0 framework, leads this change. It offers a private, local memory layer, letting AI agents remember context across applications while your data stays secure on your machine. This guide is a full walkthrough for developers: installing the essential components using Mem0, running your own local OpenMemory MCP Server, and using its REST API to create smarter, context-aware AI applications.
Let's start unlocking persistent, local AI memory.
Step 1: Preparing Your Development Environment
Before diving into the installation, it's essential to have your basic development tools ready and understand the core components we'll be working with. We'll primarily use the versatile Mem0 open-source framework to establish and then interact with our local OpenMemory MCP Server. This interaction will largely happen via its REST API, providing a standardized way to manage your AI's memory.

You'll need a recent Python environment, preferably version 3.8 or newer, to ensure full compatibility with Mem0 and its dependencies. You can download the latest Python from python.org. Along with Python, you'll need pip, the Python package installer, which is crucial for installing Mem0 and typically comes bundled with Python installations. The mem0ai Python library will be our engine; its open-source nature allows you to run a local server that effectively becomes your OpenMemory MCP Server. This entire setup prioritizes running the memory server on your local machine, giving you full control over your data and operations—a key principle of the OpenMemory philosophy. The server then exposes its functionalities through a REST API, with endpoints for adding, searching, and managing memories that align with the standard operations defined by the Model Context Protocol (MCP).
Pro Tip: Using virtual environments (like venv or conda) is a smart move. It helps manage project dependencies and prevents clashes with other Python projects.
Step 2: How to Install the Mem0 Framework
Installing the mem0ai package is your first practical step towards having a local AI memory server. This is done with a simple command that equips your system with the core Mem0 library and all the necessary components.
Open your terminal (and activate your virtual environment if you use one). Then, run the installation command:
pip install mem0ai
This command installs the mem0ai library, which includes the core Memory class for direct memory management in Python. You also get support for various Large Language Models (LLMs) and vector databases, plus the FastAPI and Uvicorn components needed to run the Mem0 REST API server. After the installation completes, you can verify it by checking the installed package version with pip show mem0ai, or by trying a quick import in a Python session:
import mem0
print(mem0.__version__)
pip also automatically handles installing required dependencies such as FastAPI (for building the API), Uvicorn (as the ASGI server), and various data handling libraries. With mem0ai installed, you have the foundational software. You can use the library directly in Python scripts for memory operations, but for the "server" aspect that MCP clients can interact with, we'll use its built-in server functionality.
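If you want to try memory operations in-process before running the server, a minimal sketch like the following works, based on the usage patterns in Mem0's documentation. Exact method signatures may vary between versions, and the default configuration typically expects an OPENAI_API_KEY in your environment:
from mem0 import Memory

# Uses Mem0's defaults; typically requires OPENAI_API_KEY to be set
m = Memory()

# Store a memory for a specific user
m.add("Alice prefers Python for web development.", user_id="user_alice")

# Retrieve memories semantically related to a query
results = m.search(query="What language does Alice like?", user_id="user_alice")
print(results)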
Step 3: How to Configure Your Local Memory Server
Before starting the server, you have the option to configure its underlying Mem0 instance. This configuration is key because it determines which Large Language Model (LLM), vector database, and embedding model will power your server's memory processing and retrieval capabilities—essentially, you're customizing its "brain."
The Mem0 server (and the Memory class it's built upon) is primarily configured using environment variables. This is a flexible approach that lets you set up your backend without modifying any code. For example, to configure the LLM, you might set OPENAI_API_KEY for OpenAI, ANTHROPIC_API_KEY for Anthropic, or variables like MEM0_LLM_PROVIDER="ollama" for local models via Ollama. Similarly, for the vector database, while Mem0 defaults to LanceDB (which is local and requires no extra setup), you can specify alternatives like Qdrant using variables such as MEM0_VECTOR_STORE_PROVIDER="qdrant". Embedding model configurations are often tied to your LLM provider but can also be specified independently.
If you don't set any specific environment variables, Mem0 attempts to use sensible defaults. However, it might require an OpenAI key for its default LLM if not otherwise specified. These choices significantly impact your server: the LLM affects how memories are understood, while the vector database influences storage efficiency and search speed. For a comprehensive list of all configurable environment variables and options for different LLMs, vector stores, and embedders, always consult the official Mem0 documentation.
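As a concrete illustration, you can set these variables from a launching process so the server inherits them. The sketch below does this from Python; the variable names follow the examples above and may differ in your Mem0 version, so treat them as placeholders and verify against the official documentation:
import os
import subprocess

# Illustrative variable names taken from the examples above; verify against Mem0's docs
os.environ["MEM0_LLM_PROVIDER"] = "ollama"           # local LLM via Ollama
os.environ["MEM0_VECTOR_STORE_PROVIDER"] = "qdrant"  # or omit to keep the LanceDB default

# Launch the server; the child process inherits the environment variables set above
subprocess.run(["mem0-server", "--host", "127.0.0.1", "--port", "8000"])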
Pro Tip: Begin with a simple local setup (like Ollama for LLM and the default LanceDB). This helps you get comfortable with the system. Then, explore more advanced setups as your needs evolve.
Step 4: How to Run Your OpenMemory MCP Server
With mem0ai installed and (optionally) configured to your liking, you can now start the local server process. This server is what exposes the REST API endpoints, allowing other applications or scripts to perform memory operations.
The mem0ai package offers a handy command-line script to run the server. In your terminal, simply type:
mem0-server
You can also specify a custom host and port if needed, for example: mem0-server --host 127.0.0.1 --port 8000. The default is typically 0.0.0.0 (all available network interfaces) on port 8000. Once started, Uvicorn (the server running your FastAPI application) will output logs to the console, indicating that the server is running and listening for requests. You'll see messages like: INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit).
Your OpenMemory MCP Server is now running locally! MCP-compatible clients, if configured to point to http://localhost:8000 (or your custom address), can interact with it. You can also directly use its REST API, which we'll explore next. To stop the server, simply press CTRL+C in the terminal where it's running. The mem0-server command essentially launches a Uvicorn ASGI server, which runs the FastAPI application defined within Mem0. This application hosts the REST API endpoints, and this running instance is your local OpenMemory MCP Server.
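A quick way to confirm the server is reachable from code is to request the interactive docs page mentioned in the next step. This is a simple sanity check, assuming the default address and that the /docs page is enabled:
import requests

# Ping the FastAPI docs page as a basic liveness check
resp = requests.get("http://localhost:8000/docs", timeout=5)
print("Server is up" if resp.status_code == 200 else f"Unexpected status: {resp.status_code}")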
Step 5: Understanding the Server's API Capabilities
The Mem0 server provides a comprehensive REST API that allows you to programmatically manage AI memories. This API adheres to the core principles of the Model Context Protocol by offering standardized ways to add, search, list, and delete information.
The API uses standard HTTP/HTTPS protocols, and JSON is the data format for both request and response bodies. If you're running the server locally with default settings, your base URL will be http://localhost:8000. The core operations available through the API (based on typical Mem0 REST API implementations) include:
POST /add: store new information
POST /search: retrieve relevant memories
GET /get_all: list multiple memories (with filter options)
GET /get: retrieve a single memory by its ID
PUT /update: modify an existing memory
DELETE /delete: remove a specific memory
DELETE /delete_all: clear memories, potentially filtered by user
By default, the locally run mem0-server does not implement authentication, as it's primarily intended for local development and use. If you were to consider production or exposed deployments (which is not recommended without significant additional security layers), authentication would be a critical aspect to implement.
Pro Tip: You can often explore the API interactively. Try visiting http://localhost:8000/docs in your browser once the server is up. Mem0's FastAPI server usually offers this helpful documentation page through Swagger UI.
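To avoid repeating boilerplate in the steps that follow, you could wrap these endpoints in a small helper class. The sketch below is built only from the endpoints listed above and assumes the default base URL; it is a convenience wrapper, not an official Mem0 client:
import requests

class MemoryClient:
    """Thin convenience wrapper around the local server's REST endpoints."""

    def __init__(self, base_url="http://localhost:8000"):
        self.base_url = base_url

    def add(self, data, user_id, metadata=None):
        payload = {"data": data, "user_id": user_id, "metadata": metadata or {}}
        return requests.post(f"{self.base_url}/add", json=payload).json()

    def search(self, query, user_id, limit=5):
        payload = {"query": query, "user_id": user_id, "limit": limit}
        return requests.post(f"{self.base_url}/search", json=payload).json()

    def get_all(self, user_id, limit=10):
        return requests.get(f"{self.base_url}/get_all",
                            params={"user_id": user_id, "limit": limit}).json()

    def delete(self, memory_id):
        return requests.delete(f"{self.base_url}/delete",
                               params={"memory_id": memory_id}).json()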
Step 6: How to Add Memories Using the API
The /add endpoint is your primary tool for ingesting data into the memory server. It allows you to store textual information, associate it with a specific user, and include optional metadata for better organization and retrieval.
To use it, you send a POST request to the /add endpoint. The request body should be JSON and include data (the string content of the memory), user_id (an identifier for the user associated with this memory, crucial for organizing and retrieving user-specific memories), and optionally metadata (a JSON object containing any additional key-value pairs you want to store, like {"source": "chat_log"}).
Here's an example using curl:
curl -X POST http://localhost:8000/add \
  -H "Content-Type: application/json" \
  -d '{
        "data": "Alice prefers Python for web development.",
        "user_id": "user_alice",
        "metadata": {"topic": "preferences", "language": "Python"}
      }'
And here's how you might do it with the Python requests library:
import requests

url = "http://localhost:8000/add"
payload = {
    "data": "Alice prefers Python for web development.",
    "user_id": "user_alice",
    "metadata": {"topic": "preferences", "language": "Python"}
}

# requests serializes the payload and sets the Content-Type header for us
response = requests.post(url, json=payload)
print(response.json())
The server will respond with JSON confirming the addition, often including the unique ID of the newly created memory, for instance: {"message": "Memory added successfully", "memory_id": "some-unique-id"}. Each call to /add creates a new memory record. The user_id is vital for multi-user applications or segregating context, and metadata significantly enhances searchability and organization.
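Since later steps reference memories by ID, it's worth capturing the ID from the response right away. The field name memory_id below matches the example response shown above, though the exact response shape may vary by version:
import requests

resp = requests.post("http://localhost:8000/add", json={
    "data": "Alice is also learning Rust.",
    "user_id": "user_alice",
})

# Keep the returned ID so this memory can be fetched, updated, or deleted later
memory_id = resp.json().get("memory_id")
print(memory_id)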
Step 7: How to Search for Stored Memories via API
Once memories are stored, the /search endpoint enables you to perform semantic searches. This means you can retrieve information relevant to a given query and user context, even if your search terms don't exactly match the stored text.
You send a POST request to /search with a JSON body containing the query string, the user_id whose memories you want to search, and optional parameters like filters (for more specific criteria based on metadata fields; the exact structure depends on Mem0's implementation) and limit (to cap the number of search results).
A curl example looks like this:
curl -X POST http://localhost:8000/search \
  -H "Content-Type: application/json" \
  -d '{
        "query": "Which programming languages does Alice prefer?",
        "user_id": "user_alice",
        "limit": 3
      }'
Using Python requests:
import requests

url = "http://localhost:8000/search"
payload = {
    "query": "Which programming languages does Alice prefer?",
    "user_id": "user_alice",
    "limit": 3
}

response = requests.post(url, json=payload)
print(response.json())
The expected response is a JSON array of memory objects that match the search query. These objects typically include the memory content, its metadata, and a relevance score indicating how closely it matched the query, for example: [{"id": "some-unique-id", "data": "Alice prefers Python for web development.", ..., "score": 0.85}].
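A common pattern is to fold the top search results into an LLM prompt as context. The sketch below assumes the response fields shown in the example above (data and score); adjust the keys to whatever your server version actually returns:
import requests

# Fetch matches and fold them into an LLM prompt as context
results = requests.post("http://localhost:8000/search", json={
    "query": "Which programming languages does Alice prefer?",
    "user_id": "user_alice",
    "limit": 3,
}).json()

context = "\n".join(f"- {m['data']} (score: {m['score']:.2f})" for m in results)
prompt = f"Answer using these known facts about the user:\n{context}\n\nQuestion: Which web framework suits Alice?"
print(prompt)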
Pro Tip: Using user_id correctly in searches is essential. It ensures that users only retrieve their own relevant memories, which is vital for maintaining privacy and delivering personalized experiences.
Step 8: How to Retrieve Specific Memories with the API
Beyond searching, the API provides endpoints to list multiple memories or fetch a single, specific memory by its unique identifier. These are useful for browsing the memory bank or accessing known pieces of information directly.
To list multiple memories, you use the GET /get_all endpoint. You can pass query parameters like user_id to filter memories by user, limit to specify the number of memories to return, and page for pagination if there are many memories. For instance, curl -X GET "http://localhost:8000/get_all?user_id=user_alice&limit=5" would retrieve the first 5 memories for "user_alice". The response will be a JSON array of memory objects.
To get a specific memory, use GET /get with a memory_id query parameter: curl -X GET "http://localhost:8000/get?memory_id=some-unique-id". This will return a JSON object representing that single memory, or an error if it's not found.
Here's how to use Python requests for /get_all:
import requests

url = "http://localhost:8000/get_all"
params = {"user_id": "user_alice", "limit": 5}

response = requests.get(url, params=params)
print(response.json())
The /get_all endpoint is particularly good for administrative tasks or when you need to browse through a user's memory. /get is your go-to when you already have a memory's ID, perhaps from an /add response or a previous search result.
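For completeness, here is the same pattern for /get in Python; the memory_id value is a placeholder for an ID returned by a previous /add or /search call:
import requests

url = "http://localhost:8000/get"
params = {"memory_id": "some-unique-id"}  # substitute a real ID from an earlier response

response = requests.get(url, params=params)
print(response.json())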
Step 9: How to Update and Delete Memories Using the API
To ensure your AI's knowledge base remains accurate and relevant, the API provides the necessary controls to update existing memory records or remove them entirely.
To update a memory, send a PUT request to /update with the memory_id as a query parameter. The request body should be JSON and can include new data (text content) or new metadata (which will replace the existing metadata for that memory). For example:
curl -X PUT "http://localhost:8000/update?memory_id=some-unique-id" \
  -H "Content-Type: application/json" \
  -d '{"data": "Alice now also enjoys Go for systems programming."}'
To delete a specific memory, use a DELETE request to /delete with the memory_id as a query parameter: curl -X DELETE "http://localhost:8000/delete?memory_id=some-unique-id".
For broader cleanup, the DELETE /delete_all endpoint can be used. If you provide a user_id query parameter (e.g., curl -X DELETE "http://localhost:8000/delete_all?user_id=user_alice"), it will delete all memories for that specific user. If the user_id is omitted, the behavior might be to delete all memories in the system, so use this with extreme caution and always verify its exact behavior in the Mem0 documentation. The server will typically respond with JSON messages confirming the success or failure of these operations.
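The same update and delete operations from Python, mirroring the curl calls above (the memory_id is again a placeholder):
import requests

base = "http://localhost:8000"
memory_id = "some-unique-id"  # substitute a real ID

# Update the memory's text content
resp = requests.put(f"{base}/update", params={"memory_id": memory_id},
                    json={"data": "Alice now also enjoys Go for systems programming."})
print(resp.json())

# Delete that memory entirely
resp = requests.delete(f"{base}/delete", params={"memory_id": memory_id})
print(resp.json())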
Pro Tip: Always implement careful checks and user confirmations in your application logic before calling deletion endpoints, especially delete_all, to prevent accidental data loss.
Conclusion
You've now walked through the essential steps of setting up a local OpenMemory MCP Server with the Mem0 open-source framework and using its REST API. By installing Mem0, running the server, and interacting with its API, you gain direct control over your AI applications' memory, enabling new levels of persistence, personalization, and privacy. Programmatically managing memories locally opens vast possibilities for smarter, context-aware systems, from chatbots that recall user preferences to developer assistants that remember project details. Experiment with the API, and start building AI applications that truly learn and evolve.