How to Use Claude-mem for Memory Persistence in Claude Code

Comprehensive tutorial on using Claude-mem with Claude Code. Covers automatic memory capture, MCP search tools for querying project history, folder context files, and privacy controls. Enables persistent context across AI-assisted coding sessions with ~10x token efficiency.

Ashley Goolam

4 February 2026

What if your AI assistant remembered every architectural decision, bug fix, and refactoring session across weeks of development? Claude-mem eliminates the friction of lost context by automatically capturing tool usage observations, compressing them into semantic summaries, and injecting relevant history into every new Claude Code session.

The Problem: Context Amnesia in AI-Assisted Development

Every Claude Code session starts as a blank slate. When you close your terminal or disconnect from a session, Claude forgets everything: your project structure, recent refactoring decisions, debugging discoveries, and architectural patterns. This forces you to explain your codebase again and again, burning tokens on redundant context and breaking workflow continuity.

Developers currently work around this by manually maintaining CLAUDE.md files, jotting notes in separate documents, or re-explaining project context at the start of every session. These approaches are brittle, time-consuming, and never capture the full richness of your development history. Claude-mem solves this by automatically observing every tool invocation, compressing the output into searchable semantic memories, and intelligently retrieving relevant context when you need it.

Understanding Claude-mem's Architecture

Claude-mem operates as a persistent memory compression system that hooks into Claude Code's lifecycle. It captures tool outputs—typically 1,000 to 10,000 tokens—and compresses them into roughly 500-token semantic observations using Claude's Agent SDK. These observations are categorized by type (decision, bugfix, feature, refactor, discovery, change) and tagged with relevant concepts and file references, then stored in a local SQLite database with full-text search capabilities.
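
To make the storage model concrete, here is a minimal sketch of how typed, tagged observations could be stored and searched with SQLite full-text search. The table name, column names, and helper functions are assumptions made for illustration, not claude-mem's actual schema, and the sketch assumes your Python's SQLite build includes the FTS5 extension.

```python
import sqlite3

# The six observation types named in the article.
OBSERVATION_TYPES = {"decision", "bugfix", "feature", "refactor", "discovery", "change"}


def open_memory_db(path=":memory:"):
    """Create a hypothetical observations index backed by SQLite FTS5."""
    conn = sqlite3.connect(path)
    # Illustrative schema only; claude-mem's real tables may differ.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS observations "
        "USING fts5(type, title, summary, concepts, files)"
    )
    return conn


def add_observation(conn, obs_type, title, summary, concepts, files):
    """Insert one compressed observation and return its row ID."""
    if obs_type not in OBSERVATION_TYPES:
        raise ValueError(f"unknown observation type: {obs_type}")
    cur = conn.execute(
        "INSERT INTO observations (type, title, summary, concepts, files) "
        "VALUES (?, ?, ?, ?, ?)",
        (obs_type, title, summary, " ".join(concepts), " ".join(files)),
    )
    return cur.lastrowid


def search_observations(conn, query, obs_type=None):
    """Full-text search across all columns, optionally filtered by type."""
    sql = "SELECT rowid, type, title FROM observations WHERE observations MATCH ?"
    params = [query]
    if obs_type is not None:
        sql += " AND type = ?"
        params.append(obs_type)
    return conn.execute(sql + " ORDER BY rowid", params).fetchall()
```

Queries here are plain words; FTS5 treats characters such as hyphens as query syntax, so a production version would need to quote or escape user input.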

The system uses five lifecycle hooks to capture context:

This architecture enables progressive disclosure—a layered memory retrieval system that balances coverage with token efficiency. Instead of dumping your entire history into context, Claude-mem retrieves observations in layers, saving approximately 2,250 tokens per session compared to manual context management.

Installation and System Requirements

Claude-mem requires Node.js 18.0.0 or higher, the latest Claude Code with plugin support, and Bun as the JavaScript runtime and process manager (auto-installed if missing). SQLite 3 is bundled for persistent storage. The plugin works cross-platform on Windows, macOS, and Linux.

Quick Start Installation

Install Claude-mem directly from the plugin marketplace with two commands:

/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem

Restart Claude Code after installation. The plugin automatically downloads prebuilt binaries, installs dependencies including Bun and SQLite, configures hooks for session lifecycle management, and auto-starts the worker service on your first session.

Advanced Installation from Source

For development or testing, clone the repository from GitHub and build from source:

git clone https://github.com/thedotmack/claude-mem.git
cd claude-mem
npm install
npm run build
npm run worker:start

This approach is useful if you need to modify the plugin or run beta features like Endless Mode.

Post-Installation Verification

After installation, verify everything is working:

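cat plugin/hooks/hooks.json             # Confirm the lifecycle hooks are configured
curl http://localhost:37777/api/health  # Check that the worker service responds
npm run worker:logs                     # Inspect the worker logs for errors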

Test context retrieval by starting a new Claude Code session. You should see context from previous sessions automatically loaded in the initial prompt.
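
If you prefer to script the health check rather than run curl by hand, a minimal sketch could look like the following. It only inspects the HTTP status code, because the article does not describe the response body; the function name and the timeout value are illustrative choices.

```python
import urllib.error
import urllib.request


def worker_is_healthy(base_url="http://localhost:37777", timeout=2.0):
    """Return True if GET {base_url}/api/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("worker healthy:", worker_is_healthy())
```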

Data Storage and Configuration

Claude-mem stores all data locally in ~/.claude-mem/:

Override the default data directory with an environment variable:

export CLAUDE_MEM_DATA_DIR=/custom/path
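
A helper script that needs to find the data directory could mirror this override as sketched below. This shows how your own tooling might resolve the path; it is not taken from claude-mem's source, and the function name is hypothetical.

```python
import os
from pathlib import Path


def resolve_data_dir(env=None):
    """Return the claude-mem data directory, honoring CLAUDE_MEM_DATA_DIR."""
    env = os.environ if env is None else env
    override = env.get("CLAUDE_MEM_DATA_DIR")
    if override:
        return Path(override).expanduser()
    # Default location stated in the article.
    return Path.home() / ".claude-mem"


if __name__ == "__main__":
    print(resolve_data_dir())
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.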

Configuration Options

Settings are managed in ~/.claude-mem/settings.json (auto-created on first run). Key configurations include:

How Claude-mem Captures and Processes Context

When you use Claude Code with claude-mem enabled, the system captures every tool invocation automatically. Whether Claude reads a file, executes a bash command, searches with glob patterns, or edits code, claude-mem observes the input and output.

The worker service processes these observations and extracts:

This compression happens automatically without manual intervention. The raw tool output might be 5,000 tokens, but the semantic observation stored in the database is roughly 500 tokens—preserving meaning while eliminating noise.

Session Summaries

When Claude finishes responding (triggering the Stop hook), claude-mem automatically generates a session summary containing:

These summaries are injected into future sessions alongside individual observations, providing both granular detail and high-level narrative context.

Using MCP Search Tools to Query Your Memory

Claude-mem exposes four MCP tools that follow a token-efficient 3-layer workflow pattern. This design retrieves context progressively, minimizing token usage while maximizing relevance.

The 3-Layer Workflow

  1. search: Get a compact index with IDs (~50-100 tokens per result)
  2. timeline: Get chronological context around interesting results
  3. get_observations: Fetch full details ONLY for filtered IDs (~500-1,000 tokens per result)

This approach achieves approximately 10x token savings by filtering before fetching full details.
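
The savings can be estimated with simple arithmetic. The sketch below uses the midpoints of the per-result ranges quoted above (about 75 tokens for an index entry, about 750 for a full observation) and leaves out the timeline layer, for which no token figure is given. The result counts are hypothetical.

```python
def layered_cost(num_results, num_fetched, index_tokens=75, detail_tokens=750):
    """Tokens spent when searching an index first, then fetching a few details."""
    return num_results * index_tokens + num_fetched * detail_tokens


def naive_cost(num_results, detail_tokens=750):
    """Tokens spent when every matching observation is fetched in full."""
    return num_results * detail_tokens


results, fetched = 20, 2                  # hypothetical: 20 hits, 2 worth reading
layered = layered_cost(results, fetched)  # 20*75 + 2*750 = 3000
naive = naive_cost(results)               # 20*750 = 15000
print(f"layered={layered} naive={naive} savings={naive / layered:.1f}x")
# prints: layered=3000 naive=15000 savings=5.0x
```

The ratio depends on how selective the filtering step is. At the favorable end of the quoted ranges (50-token index entries, 1,000-token details, one observation fetched out of 20), the same functions give 2,000 versus 20,000 tokens, which is the 10x figure.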

Available MCP Tools

  1. search: Search the memory index with full-text queries. Filter by type, date, or project.
  2. timeline: Get chronological context around a specific observation or query. Useful for understanding what led to a particular decision or bug fix.
  3. get_observations: Fetch full observation details by IDs. Always batch multiple IDs in a single call to minimize overhead.
  4. __IMPORTANT: Workflow documentation that's always visible to Claude, explaining how to use the memory system effectively.

Example Usage Patterns

Find a specific bug fix:

// Step 1: Search for the bug
search(query="authentication bug", type="bugfix", limit=10)

// Step 2: Review index, identify relevant IDs (e.g., #123, #456)

// Step 3: Fetch full details for relevant observations
get_observations(ids=[123, 456])

Explore recent architectural decisions:

search(query="database schema", type="decision", limit=5)

Find everything related to a specific file:

search(query="worker-service.ts", limit=20)
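
The filter-then-fetch pattern in these examples can be simulated locally. The sketch below is a stand-in that mimics the shape of the `search` and `get_observations` calls over an in-memory dictionary; it is not the real MCP client, and the sample records are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    id: int
    type: str
    title: str
    detail: str


# Hypothetical sample data standing in for a real memory database.
STORE = {
    123: Observation(123, "bugfix", "Fixed authentication bug in token refresh",
                     "Refresh tokens were rejected because expiry used local time."),
    456: Observation(456, "bugfix", "Fixed authentication bug on login redirect",
                     "The redirect target was lost when the session cookie expired."),
    789: Observation(789, "decision", "Adopted new database schema for sessions",
                     "Sessions moved to their own table to simplify cleanup."),
}


def search(query, type=None, limit=10):
    """Layer 1: return a compact index of (id, type, title) tuples.

    The parameter is named `type` to mirror the article's examples.
    """
    words = query.lower().split()
    hits = [
        (o.id, o.type, o.title)
        for o in STORE.values()
        if (type is None or o.type == type)
        and all(w in (o.title + " " + o.detail).lower() for w in words)
    ]
    return sorted(hits)[:limit]


def get_observations(ids):
    """Layer 3: fetch full records for a batch of IDs in a single call."""
    return [STORE[i] for i in ids if i in STORE]


index = search("authentication bug", type="bugfix", limit=10)
relevant_ids = [obs_id for obs_id, _type, _title in index]
for obs in get_observations(relevant_ids):
    print(f"#{obs.id}: {obs.detail}")
```

Only the small index tuples are produced for every hit; full details are loaded once, in one batch, for the IDs that survive review.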

Natural Language Queries

You can ask Claude naturally about your project history:

Claude automatically invokes the appropriate MCP tools to retrieve relevant context, presenting findings with claude-mem:// URI citations that reference specific observations.

Folder Context Files and CLAUDE.md Auto-Generation

Claude-mem automatically generates CLAUDE.md files in project folders, creating activity timelines that complement the global memory database.

How Folder Context Works

When you work with files in a folder, claude-mem:

  1. Identifies unique folder paths from touched files
  2. Queries recent observations relevant to each folder
  3. Generates a formatted timeline of activity
  4. Writes it to CLAUDE.md in that folder (inside <claude-mem-context> tags)
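
Steps 1 and 3 above can be sketched in a few lines. The function names and the tuple layout are assumptions for illustration; the table layout follows the Recent Activity columns described in this section (ID, time, type, title, tokens).

```python
from pathlib import PurePosixPath


def unique_folders(touched_files):
    """Step 1: collect the distinct parent folders of the files touched."""
    return sorted({str(PurePosixPath(f).parent) for f in touched_files})


def format_timeline(rows):
    """Step 3: render (id, time, type_icon, title, tokens) rows as a table."""
    lines = [
        "| ID | Time | Type | Title | Tokens |",
        "|----|------|------|-------|--------|",
    ]
    for obs_id, time, icon, title, tokens in rows:
        lines.append(f"| #{obs_id} | {time} | {icon} | {title} | ~{tokens} |")
    return "\n".join(lines)


print(unique_folders(["src/auth/login.ts", "src/auth/session.ts", "src/db/schema.ts"]))
# prints: ['src/auth', 'src/db']
```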

Each folder's CLAUDE.md contains a Recent Activity section showing observation IDs, timestamps, type indicators (bug fixes, features, discoveries), brief titles, and estimated token counts.

User Content Preservation

The auto-generated content is wrapped in <claude-mem-context> tags. Any content you write outside these tags is preserved when the file regenerates. This lets you:

Example CLAUDE.md structure:

# Authentication Module

This folder contains all authentication-related code.
Follow the established patterns for new auth providers.

<claude-mem-context>
# Recent Activity

| ID | Time | Type | Title | Tokens |
|----|------|------|-------|--------|
| #1234 | 4:30 PM | 🔵 | Implemented user authentication | ~250 |
| #1235 | 4:45 PM | 🔴 | Fixed login redirect bug | ~180 |
</claude-mem-context>

## Manual Notes

- OAuth providers go in /providers/
- Session handling uses Redis

Privacy Controls and Security

Claude-mem provides granular privacy controls to prevent sensitive data from entering the memory system.

Private Content Tags

Wrap sensitive content in <private> tags to exclude it from storage:

<private>
API_KEY=sk-live-abc123xyz789
DATABASE_PASSWORD=supersecret456
</private>

Private content is filtered out at the edge, so it never reaches the database. This matters most for API keys, credentials, and proprietary logic.
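
The effect of the tag can be pictured as a filter that runs before anything is written to storage. The following is a minimal sketch of such a filter, assuming well-formed, non-nested tags; it is not claude-mem's actual code.

```python
import re

PRIVATE_BLOCK = re.compile(r"<private>.*?</private>", re.DOTALL)


def strip_private(text):
    """Remove every <private>...</private> span before the text is stored.

    An opening tag without a matching close is left as-is by this sketch.
    """
    return PRIVATE_BLOCK.sub("", text)


raw = "Deploy notes\n<private>\nAPI_KEY=sk-live-abc123xyz789\n</private>\nUse the staging cluster."
print(strip_private(raw))
```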

Dual-Tag Privacy System

Claude-mem uses a dual-tag approach:

Web Viewer UI and Real-Time Monitoring

Claude-mem runs a web viewer at http://localhost:37777 for real-time memory stream visualization. The interface shows:

This UI is optional for basic usage but invaluable for understanding what claude-mem captures and how it organizes your development history.

Beta Features: Endless Mode

The beta channel offers Endless Mode, a biomimetic memory architecture for extended sessions. Instead of hitting context limits after 50 tool uses, Endless Mode promises roughly 1,000, a 20x increase. It achieves this by compressing tool outputs in real time, which reduces tokens by about 95% and changes scaling from quadratic, O(N²), to linear, O(N).
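
The 95% and 20x figures are consistent with each other under a simple model in which every tool output stays in the context window. The sketch below works through that arithmetic; the context budget and the raw output size are hypothetical values chosen so the baseline comes out at 50 tool uses.

```python
def max_tool_uses(context_limit, tokens_per_use):
    """How many tool outputs fit in the context window before it is full."""
    return context_limit // tokens_per_use


LIMIT = 200_000               # assumed context budget in tokens (hypothetical)
RAW = 4_000                   # assumed raw tokens per tool output (hypothetical)
COMPRESSED = RAW * 5 // 100   # a 95% reduction leaves 5%, i.e. 200 tokens

print(max_tool_uses(LIMIT, RAW))         # 50
print(max_tool_uses(LIMIT, COMPRESSED))  # 1000
```

Keeping 5% of each output means 20 times as many outputs fit in the same budget, whatever the actual budget is.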

Trade-off: Observation generation adds 60-90 seconds per tool invocation. For deep, thoughtful coding sessions spanning days or weeks, this latency might be acceptable. For rapid-fire tool usage, it could be prohibitive.

Enable beta features from the web viewer UI at http://localhost:37777 → Settings → Version Channel.

Troubleshooting Common Issues

Worker Service Not Starting

If the worker fails to start on port 37777:

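lsof -i :37777                         # See which process is using port 37777
export CLAUDE_MEM_WORKER_PORT=8080     # Move the worker to a free port
bun plugin/scripts/worker-service.cjs  # Start the worker manually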
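
On systems without lsof (Windows, for example), a port conflict can be detected with a short script. This is a generic sketch using only the standard library; it reports whether something is listening on the port, not which process owns it.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 37777 in use:", port_in_use(37777))
```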

Memory Not Being Saved

If Claude doesn't remember previous sessions:

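npm run worker:status               # Check that the worker is running
ls -la ~/.claude-mem/claude-mem.db  # Confirm the database file exists
npm run worker:logs                 # Look for errors in the worker logs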

Context Injection Issues

If too much or too little context appears at session start:

Adjust the observation limit:

export CLAUDE_MEM_CONTEXT_OBSERVATIONS=10  # Reduce
export CLAUDE_MEM_CONTEXT_OBSERVATIONS=100 # Increase

Empty CLAUDE.md Files

If claude-mem creates empty CLAUDE.md files throughout your project, this is a known issue in v9.0.5. Current workarounds include manually deleting created directories, adding patterns to .gitignore, or waiting for the fix in a subsequent release.

Claude Desktop Integration

Claude-mem works with Claude Desktop through MCP server configuration. Add the mcp-search server to your Claude Desktop config, point to the MCP server script in the claude-mem installation, and restart Claude Desktop.

Once configured, ask naturally about past work:

Use the web viewer at localhost:37777 to verify that memories are being captured, and check the Claude Desktop logs if the connection fails.
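
As a rough illustration of the configuration step, the sketch below builds an `mcpServers` entry of the kind Claude Desktop reads from its config file. The script path is a placeholder and `node` as the launch command is an assumption; take both from your own claude-mem installation rather than from this example.

```python
import json


def add_mcp_server(config, name, command, args):
    """Return a copy of a Claude Desktop config with one MCP server entry added."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


# Placeholder path: substitute the MCP server script from your installation.
SCRIPT = "/path/to/claude-mem/MCP_SERVER_SCRIPT"

# "node" is an assumed launcher, not confirmed by the article.
config = add_mcp_server({}, "mcp-search", "node", [SCRIPT])
print(json.dumps(config, indent=2))
```

Working on a copy keeps any servers already present in the config intact.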

Manual Worker Management Commands

From the claude-mem directory, you can manage the worker service:

npm run worker:start    # Start worker service
npm run worker:stop     # Stop worker service
npm run worker:restart  # Restart worker service
npm run worker:logs     # View worker logs
npm run worker:status   # Check worker status

Conclusion

Claude-mem transforms Claude Code from a stateless assistant into a persistent development partner that accumulates knowledge about your codebase over time. By automatically capturing tool usage, compressing observations into searchable memories, and intelligently retrieving relevant context, it eliminates the repetitive context-building that slows down AI-assisted development.

The system's progressive disclosure architecture—layered retrieval with MCP tools, folder-based CLAUDE.md files, and privacy controls—provides approximately 10x token efficiency compared to manual context management while maintaining complete data locality and security.
