How to Use Claude-mem for Memory Persistence in Claude Code

What if your AI assistant remembered every architectural decision, bug fix, and refactoring session across weeks of development? Claude-mem eliminates the friction of lost context by automatically capturing tool usage observations, compressing them into semantic summaries, and injecting relevant history into every new Claude Code session.

The Problem: Context Amnesia in AI-Assisted Development

Every Claude Code session starts as a blank slate. When you close your terminal or disconnect from a session, Claude forgets everything: your project structure, recent refactoring decisions, debugging discoveries, and architectural patterns. This forces you to repeatedly explain your codebase, burning tokens on redundant context and breaking workflow continuity.

Developers currently work around this by manually maintaining CLAUDE.md files, jotting notes in separate documents, or re-explaining project context at the start of every session. These approaches are brittle, time-consuming, and never capture the full richness of your development history. Claude-mem solves this by automatically observing every tool invocation, compressing the output into searchable semantic memories, and intelligently retrieving relevant context when you need it.

Understanding Claude-mem's Architecture

Claude-mem operates as a persistent memory compression system that hooks into Claude Code's lifecycle. It captures tool outputs, typically 1,000 to 10,000 tokens each, and compresses them into roughly 500-token semantic observations using the Claude Agent SDK. Each observation is categorized by type (decision, bugfix, feature, refactor, discovery, change), tagged with relevant concepts and file references, and stored in a local SQLite database with full-text search.

The system uses five lifecycle hooks to capture context:

SessionStart: Injects context from previous sessions when you begin
UserPromptSubmit: Captures your queries for pattern recognition
PostToolUse: Observes every tool execution and its output
Stop: Generates session summaries when Claude finishes responding
SessionEnd: Finalizes session storage and cleanup
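
To make the ordering concrete, here is a minimal Python sketch of how the five hooks might fire during one prompt that triggers two tool calls. Only the hook names come from the list above. The `HookRegistry` class, its `on` and `fire` methods, and the payload shape are hypothetical illustrations, not claude-mem's actual implementation.

```python
from collections import defaultdict
from typing import Callable

# Hook names as listed in the article; everything else here is illustrative.
HOOK_ORDER = ["SessionStart", "UserPromptSubmit", "PostToolUse", "Stop", "SessionEnd"]


class HookRegistry:
    """Hypothetical registry that maps hook names to handler functions."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def on(self, hook: str, handler: Callable[[dict], None]) -> None:
        if hook not in HOOK_ORDER:
            raise ValueError(f"unknown hook: {hook!r}")
        self._handlers[hook].append(handler)

    def fire(self, hook: str, payload: dict) -> None:
        for handler in self._handlers[hook]:
            handler(payload)


events: list[str] = []
registry = HookRegistry()
for name in HOOK_ORDER:
    # Bind `name` as a default argument so each handler keeps its own hook name.
    registry.on(name, lambda payload, name=name: events.append(f"{name}:{payload['detail']}"))

# Simulate one prompt that leads to two tool calls.
registry.fire("SessionStart", {"detail": "inject context"})
registry.fire("UserPromptSubmit", {"detail": "fix login bug"})
for tool in ("Read", "Edit"):
    registry.fire("PostToolUse", {"detail": tool})
registry.fire("Stop", {"detail": "summarize"})
registry.fire("SessionEnd", {"detail": "finalize"})

for event in events:
    print(event)
```

Note that PostToolUse fires once per tool call, so it is the hook that runs most often in a typical session.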

This architecture enables progressive disclosure—a layered memory retrieval system that balances coverage with token efficiency. Instead of dumping your entire history into context, Claude-mem retrieves observations in layers, saving approximately 2,250 tokens per session compared to manual context management.

Installation and System Requirements

Claude-mem requires Node.js 18.0.0 or higher, a recent version of Claude Code with plugin support, and Bun as the JavaScript runtime and process manager (auto-installed if missing). SQLite 3 is bundled for persistent storage. The plugin works cross-platform on Windows, macOS, and Linux.

Quick Start Installation

Install Claude-mem directly from the plugin marketplace with two commands:

/plugin marketplace add thedotmack/claude-mem
/plugin install claude-mem

Restart Claude Code after installation. The plugin automatically downloads prebuilt binaries, installs dependencies including Bun and SQLite, configures hooks for session lifecycle management, and auto-starts the worker service on your first session.

Advanced Installation from Source

For development or testing, clone and build from source on GitHub:

git clone https://github.com/thedotmack/claude-mem.git
cd claude-mem
npm install
npm run build
npm run worker:start

This approach is useful if you need to modify the plugin or run beta features like Endless Mode.

Post-Installation Verification

After installation, verify everything is working:

Check plugin installation:

cat plugin/hooks/hooks.json

Verify the worker service is running:

curl http://localhost:37777/api/health

View recent worker logs:

npm run worker:logs

Test context retrieval by starting a new Claude Code session. You should see context from previous sessions automatically loaded in the initial prompt.
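
If you want to script the health check instead of running curl by hand, a small Python helper is enough. This sketch assumes only what the curl command above shows: that the worker answers on http://localhost:37777/api/health. It treats an HTTP 200 as healthy and does not assume anything about the response body. The function name `worker_is_healthy` is made up for this example.

```python
import urllib.request


def worker_is_healthy(
    url: str = "http://localhost:37777/api/health", timeout: float = 2.0
) -> bool:
    """Return True if the endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, connection refused, and timeouts are all OSError subclasses.
        return False
```

Call `worker_is_healthy()` from a setup script or pre-session check and fall back to the worker logs when it returns False.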

Data Storage and Configuration

Claude-mem stores all data locally in ~/.claude-mem/:

Database: ~/.claude-mem/claude-mem.db (SQLite with FTS5 search)
PID file: ~/.claude-mem/.worker.pid
Port file: ~/.claude-mem/.worker.port
Logs: ~/.claude-mem/logs/worker-YYYY-MM-DD.log
Settings: ~/.claude-mem/settings.json

Override the default data directory with an environment variable:

export CLAUDE_MEM_DATA_DIR=/custom/path
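
For scripts that need to locate these files, the override logic can be sketched in a few lines of Python. The file names and the `CLAUDE_MEM_DATA_DIR` variable come from the list above. The helper names `resolve_data_dir` and `data_paths` are hypothetical, and the sketch assumes the override simply replaces `~/.claude-mem` as the root.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Use CLAUDE_MEM_DATA_DIR when set, otherwise fall back to ~/.claude-mem."""
    env = os.environ if env is None else env
    override = env.get("CLAUDE_MEM_DATA_DIR")
    return Path(override) if override else Path.home() / ".claude-mem"


def data_paths(data_dir: Path) -> dict[str, Path]:
    """File layout as described in the article, relative to the data directory."""
    return {
        "database": data_dir / "claude-mem.db",
        "pid": data_dir / ".worker.pid",
        "port": data_dir / ".worker.port",
        "logs": data_dir / "logs",
        "settings": data_dir / "settings.json",
    }


for name, path in data_paths(resolve_data_dir()).items():
    print(f"{name}: {path}")
```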

Configuration Options

Settings are managed in ~/.claude-mem/settings.json (auto-created on first run). Key configurations include:

CLAUDE_MEM_CONTEXT_OBSERVATIONS: Number of observations injected at session start (default: 50)
CLAUDE_MEM_FOLDER_INDEX_ENABLED: Enable/disable auto-generated CLAUDE.md files in folders
Model selection for AI-powered compression
Worker port and host settings
Log level configuration
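
To check which observation limit is currently in effect, you could read the settings file directly. The sketch below assumes settings.json is a flat JSON object keyed by the setting names shown above, which the article does not confirm, and it accepts the value as either a number or a string. The function name `context_observations` is hypothetical.

```python
import json
from pathlib import Path

DEFAULT_CONTEXT_OBSERVATIONS = 50  # default stated in the article


def context_observations(settings_path: Path) -> int:
    """Read CLAUDE_MEM_CONTEXT_OBSERVATIONS, falling back to the default."""
    try:
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
        return int(settings["CLAUDE_MEM_CONTEXT_OBSERVATIONS"])
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, invalid JSON, missing key, or unexpected shape.
        return DEFAULT_CONTEXT_OBSERVATIONS


settings_file = Path.home() / ".claude-mem" / "settings.json"
print(context_observations(settings_file))
```

Falling back silently keeps the helper safe to call before the first run, when the file does not exist yet.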

How Claude-mem Captures and Processes Context

When you use Claude Code with claude-mem enabled, the system captures every tool invocation automatically. Whether Claude reads a file, executes a bash command, searches with glob patterns, or edits code, claude-mem observes the input and output.

The worker service processes these observations and extracts:

Title: Brief description of what happened
Subtitle: Additional context
Narrative: Detailed explanation of the activity
Facts: Key learnings as bullet points
Concepts: Relevant tags and categories for search
Type: Classification (decision, bugfix, feature, refactor, discovery, change)
Files: Which files were read or modified

This compression happens automatically without manual intervention. The raw tool output might be 5,000 tokens, but the semantic observation stored in the database is roughly 500 tokens—preserving meaning while eliminating noise.
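
As a rough mental model, one stored observation could look like the Python dataclass below. The field names mirror the list above, but the class itself, the validation, and the sample values are illustrative assumptions rather than claude-mem's real schema.

```python
from dataclasses import dataclass, field

# The six observation types named in the article.
OBSERVATION_TYPES = {"decision", "bugfix", "feature", "refactor", "discovery", "change"}


@dataclass
class Observation:
    """Hypothetical in-memory shape of one compressed observation."""

    title: str
    subtitle: str
    narrative: str
    type: str
    facts: list[str] = field(default_factory=list)
    concepts: list[str] = field(default_factory=list)
    files: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject anything outside the six documented categories.
        if self.type not in OBSERVATION_TYPES:
            raise ValueError(f"unknown observation type: {self.type!r}")


def tokens_removed(raw_tokens: int, stored_tokens: int) -> float:
    """Fraction of the raw tool output that compression discards."""
    return 1 - stored_tokens / raw_tokens


example = Observation(
    title="Fixed login redirect bug",
    subtitle="Redirect loop after OAuth callback",
    narrative="The callback handler redirected before the session cookie was set.",
    type="bugfix",
    facts=["Session cookie must be written before redirecting"],
    concepts=["authentication", "redirect"],
    files=["src/auth/callback.ts"],
)

# Using the article's figures: 5,000 raw tokens stored as roughly 500.
print(f"{tokens_removed(5000, 500):.0%} of tokens removed")
```

With the article's numbers, a 5,000-token output stored as 500 tokens means about 90% of the raw tokens are discarded.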

Session Summaries

When Claude finishes responding (triggering the Stop hook), claude-mem automatically generates a session summary containing:

Request: What you asked for
Investigated: What Claude explored to answer
Learned: Key discoveries and insights
Completed: What was accomplished
Next Steps: Recommended follow-up actions

These summaries are injected into future sessions alongside individual observations, providing both granular detail and high-level narrative context.

Using MCP Search Tools to Query Your Memory

Claude-mem exposes four MCP tools that follow a token-efficient 3-layer workflow pattern. This design retrieves context progressively, minimizing token usage while maximizing relevance.

The 3-Layer Workflow

search: Get a compact index with IDs (~50-100 tokens per result)
timeline: Get chronological context around interesting results
get_observations: Fetch full details ONLY for filtered IDs (~500-1,000 tokens per result)

This approach achieves approximately 10x token savings by filtering before fetching full details.
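
The savings depend on how selective you are after the search step. The short calculation below uses per-result token figures from the ranges quoted above (50 tokens for an index entry, 1,000 for a full observation). The result counts are assumed values chosen for illustration.

```python
def layered_cost(results: int, index_tokens: int, fetched: int, full_tokens: int) -> int:
    """Tokens spent when you scan a compact index, then fetch only some results in full."""
    return results * index_tokens + fetched * full_tokens


def naive_cost(results: int, full_tokens: int) -> int:
    """Tokens spent when every result is fetched in full."""
    return results * full_tokens


RESULTS = 20          # assumed number of search hits
INDEX_TOKENS = 50     # low end of the ~50-100 range
FULL_TOKENS = 1_000   # high end of the ~500-1,000 range

naive = naive_cost(RESULTS, FULL_TOKENS)
for fetched in (1, 2, 5):
    layered = layered_cost(RESULTS, INDEX_TOKENS, fetched, FULL_TOKENS)
    print(f"fetch {fetched} of {RESULTS}: {layered} vs {naive} tokens ({naive / layered:.1f}x)")
```

Under these assumptions, fetching one observation out of twenty gives the 10x figure, while fetching five drops the savings to roughly 3x. Filtering aggressively at the index stage is what makes the workflow pay off.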

Available MCP Tools

search: Search the memory index with full-text queries. Filter by type, date, or project.
timeline: Get chronological context around a specific observation or query. Useful for understanding what led to a particular decision or bug fix.
get_observations: Fetch full observation details by IDs. Always batch multiple IDs in a single call to minimize overhead.
__IMPORTANT: Workflow documentation that's always visible to Claude, explaining how to use the memory system effectively.

Example Usage Patterns

Find a specific bug fix:

// Step 1: Search for the bug
search(query="authentication bug", type="bugfix", limit=10)

// Step 2: Review index, identify relevant IDs (e.g., #123, #456)

// Step 3: Fetch full details for relevant observations
get_observations(ids=[123, 456])

Explore recent architectural decisions:

search(query="database schema", type="decision", limit=5)

Find everything related to a specific file:

search(query="worker-service.ts", limit=20)
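
To experiment with the search-then-fetch pattern without a running worker, you can simulate it against a small in-memory list. The `search` and `get_observations` functions below are local stand-ins that only mimic the call shapes shown above. They are not the real MCP tools, the matching is a naive substring check rather than full-text search, and the sample records are invented.

```python
from typing import Optional

# Invented sample data standing in for the memory database.
MEMORY = [
    {"id": 123, "type": "bugfix", "title": "Fixed authentication bug in token refresh",
     "narrative": "Refresh tokens were validated against the wrong secret."},
    {"id": 456, "type": "bugfix", "title": "Authentication bug: login redirect loop",
     "narrative": "The session cookie was written after the redirect."},
    {"id": 789, "type": "decision", "title": "Chose SQLite for the database schema",
     "narrative": "A local, file-based store keeps all data on the machine."},
]


def search(query: str, type: Optional[str] = None, limit: int = 10) -> list[dict]:
    """Layer 1: return a compact index (id, type, title) without narratives."""
    terms = query.lower().split()
    hits = [
        {"id": obs["id"], "type": obs["type"], "title": obs["title"]}
        for obs in MEMORY
        if all(term in obs["title"].lower() for term in terms)
        and (type is None or obs["type"] == type)
    ]
    return hits[:limit]


def get_observations(ids: list[int]) -> list[dict]:
    """Layer 3: return full records for the selected IDs in one batched call."""
    wanted = set(ids)
    return [obs for obs in MEMORY if obs["id"] in wanted]


index = search(query="authentication bug", type="bugfix", limit=10)
selected_ids = [hit["id"] for hit in index]
for obs in get_observations(ids=selected_ids):
    print(obs["id"], obs["narrative"])
```

The index step never touches the narratives, which is where most of the tokens live. That is the property the real tools rely on.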

Natural Language Queries

You can ask Claude naturally about your project history:

"What did we decide about error handling?"
"How did we implement authentication?"
"What bugs did we fix in the API layer?"
"Show me changes to the database schema"

Claude automatically invokes the appropriate MCP tools to retrieve relevant context, presenting findings with claude-mem:// URI citations that reference specific observations.

Folder Context Files and CLAUDE.md Auto-Generation

Claude-mem automatically generates CLAUDE.md files in project folders, creating activity timelines that complement the global memory database.

How Folder Context Works

When you work with files in a folder, claude-mem:

Identifies unique folder paths from touched files
Queries recent observations relevant to each folder
Generates a formatted timeline of activity
Writes it to CLAUDE.md in that folder (inside <claude-mem-context> tags)

Each folder's CLAUDE.md contains a Recent Activity section showing observation IDs, timestamps, type indicators (bug fixes, features, discoveries), brief titles, and estimated token counts.

User Content Preservation

The auto-generated content is wrapped in <claude-mem-context> tags. Any content you write outside these tags is preserved when the file regenerates. This lets you:

Add your own documentation above or below the generated section
Write folder-specific instructions for Claude
Include architectural notes or conventions

Example CLAUDE.md structure:

# Authentication Module

This folder contains all authentication-related code.
Follow the established patterns for new auth providers.

<claude-mem-context>
# Recent Activity

| ID | Time | Type | Title | Tokens |
|----|------|------|-------|--------|
| #1234 | 4:30 PM | 🔵 | Implemented user authentication | ~250 |
| #1235 | 4:45 PM | 🔴 | Fixed login redirect bug | ~180 |
</claude-mem-context>

## Manual Notes

- OAuth providers go in /providers/
- Session handling uses Redis
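
The preservation behavior can be approximated with a tag-scoped replacement: only the text between the `<claude-mem-context>` tags is swapped, and everything outside stays untouched. This is a minimal Python sketch of that idea, not claude-mem's actual regeneration code. The `replace_context` function is hypothetical, and appending the block when no tags exist is an assumption.

```python
import re

OPEN_TAG = "<claude-mem-context>"
CLOSE_TAG = "</claude-mem-context>"
CONTEXT_BLOCK = re.compile(re.escape(OPEN_TAG) + r".*?" + re.escape(CLOSE_TAG), re.DOTALL)


def replace_context(document: str, new_body: str) -> str:
    """Swap the generated block and leave user-written content as it was."""
    block = f"{OPEN_TAG}\n{new_body.strip()}\n{CLOSE_TAG}"
    if CONTEXT_BLOCK.search(document):
        # A function replacement avoids backslash handling in the new text.
        return CONTEXT_BLOCK.sub(lambda _match: block, document, count=1)
    # No generated block yet: append one after the existing content.
    prefix = document.rstrip("\n")
    return f"{prefix}\n\n{block}\n" if prefix else f"{block}\n"


original = (
    "# Authentication Module\n\n"
    "Follow the established patterns for new auth providers.\n\n"
    "<claude-mem-context>\n# Recent Activity\n(old timeline)\n</claude-mem-context>\n\n"
    "## Manual Notes\n\n- OAuth providers go in /providers/\n"
)
print(replace_context(original, "# Recent Activity\n(new timeline)"))
```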

Privacy Controls and Security

Claude-mem provides granular privacy controls to prevent sensitive data from entering the memory system.

Private Content Tags

Wrap sensitive content in <private> tags to exclude it from storage:

<private>
API_KEY=sk-live-abc123xyz789
DATABASE_PASSWORD=supersecret456
</private>

Private content is stripped at the edge, before anything is written to the database. This is critical for API keys, credentials, and proprietary logic.
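
Conceptually, this filtering amounts to removing tagged spans before anything is stored. The Python sketch below shows one way to do that with a regular expression. It is an illustration of the idea, not claude-mem's actual filter, and the choice to drop everything after an unclosed `<private>` tag is a conservative assumption made for this example.

```python
import re

# Match a <private> block, or an unclosed <private> tag through the end of the text.
PRIVATE_BLOCK = re.compile(r"<private>.*?(?:</private>|\Z)", re.DOTALL)


def strip_private(text: str) -> str:
    """Remove every <private>...</private> span before the text is stored."""
    return PRIVATE_BLOCK.sub("", text)


captured = (
    "export APP_ENV=staging\n"
    "<private>\nAPI_KEY=sk-live-abc123xyz789\n</private>\n"
    "export LOG_LEVEL=debug\n"
)
print(strip_private(captured))
```

Treating an unclosed tag as private to the end of the text errs on the side of losing context rather than leaking a secret.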

Dual-Tag Privacy System

Claude-mem uses a dual-tag approach:

<private>: User-controlled privacy for sensitive content
<claude-mem-context>: System-level tags prevent recursive observation storage

Web Viewer UI and Real-Time Monitoring

Claude-mem runs a web viewer at http://localhost:37777 for real-time memory stream visualization. The interface shows:

Live observation stream with emoji indicators for importance
Session timeline with chronological markers
Search interface for querying memories
Settings panel for configuration adjustments
Version switching between stable and beta channels

This UI is optional for basic usage but invaluable for understanding what claude-mem captures and how it organizes your development history.

Beta Features: Endless Mode

The beta channel offers Endless Mode, a biomimetic memory architecture for extended sessions. Instead of hitting context limits after roughly 50 tool uses, Endless Mode aims for roughly 1,000, a 20x increase. It does this by compressing tool outputs in real time, cutting their token count by about 95% and changing how context cost scales from quadratic, O(N²), to linear, O(N).

Trade-off: Observation generation adds 60-90 seconds per tool invocation. For deep, thoughtful coding sessions spanning days or weeks, this latency might be acceptable. For rapid-fire tool usage, it could be prohibitive.
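
The 20x figure follows directly from the 95% reduction: if each tool output takes one twentieth of the space, twenty times as many fit in the same window. The calculation below checks that arithmetic. The 200,000-token budget and the 4,000-token average tool output are assumed values picked so the baseline lands at 50 tool uses. They are not figures from claude-mem.

```python
def compressed_size(raw_tokens: int, reduction_percent: int) -> int:
    """Tokens left after removing reduction_percent of the raw output."""
    return raw_tokens * (100 - reduction_percent) // 100


def max_tool_uses(context_budget: int, tokens_per_use: int) -> int:
    """How many tool outputs fit in the budget if nothing else is counted."""
    return context_budget // tokens_per_use


CONTEXT_BUDGET = 200_000     # assumed context window, in tokens
RAW_TOKENS_PER_USE = 4_000   # assumed average raw tool output

baseline = max_tool_uses(CONTEXT_BUDGET, RAW_TOKENS_PER_USE)
endless = max_tool_uses(CONTEXT_BUDGET, compressed_size(RAW_TOKENS_PER_USE, 95))
print(baseline, endless, endless // baseline)
```

This simple model ignores prompts, responses, and injected context, so real sessions will hit limits sooner in both modes.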

Enable beta features from the web viewer UI at http://localhost:37777 → Settings → Version Channel.

Troubleshooting Common Issues

Worker Service Not Starting

If the worker fails to start on port 37777:

Check if the port is already occupied:

lsof -i :37777

Configure an alternative port:

export CLAUDE_MEM_WORKER_PORT=8080

Manually start the worker:

bun plugin/scripts/worker-service.cjs
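
The `lsof` command is not available on Windows, so a portable way to run the same port check is a short Python probe. This sketch only tests whether something accepts TCP connections on the port. It cannot tell you which process owns it. The function name `port_in_use` is made up for this example.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return probe.connect_ex((host, port)) == 0


print("port 37777 in use:", port_in_use(37777))
```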

Memory Not Being Saved

If Claude doesn't remember previous sessions:

Verify the worker is running:

npm run worker:status

Check the database file exists:

ls -la ~/.claude-mem/claude-mem.db

Review worker logs for errors:

npm run worker:logs

Context Injection Issues

If too much or too little context appears at session start:

Adjust the observation limit:

export CLAUDE_MEM_CONTEXT_OBSERVATIONS=10   # Inject fewer observations (less context)
export CLAUDE_MEM_CONTEXT_OBSERVATIONS=100  # Or inject more observations (more context)

Empty CLAUDE.md Files

If claude-mem creates empty CLAUDE.md files throughout your project, this is a known issue in v9.0.5. Current workarounds include manually deleting created directories, adding patterns to .gitignore, or waiting for the fix in a subsequent release.

Claude Desktop Integration

Claude-mem works with Claude Desktop through MCP server configuration. Add the mcp-search server to your Claude Desktop config, point to the MCP server script in the claude-mem installation, and restart Claude Desktop.

Once configured, ask naturally about past work:

"What did we do last session?"
"Did we fix this bug before?"
"How did we implement authentication?"

Use the web viewer at http://localhost:37777 to verify that memories are being captured, and check the Claude Desktop logs if the connection fails.

Manual Worker Management Commands

From the claude-mem directory, you can manage the worker service:

npm run worker:start    # Start worker service
npm run worker:stop     # Stop worker service
npm run worker:restart  # Restart worker service
npm run worker:logs     # View worker logs
npm run worker:status   # Check worker status

Conclusion

Claude-mem transforms Claude Code from a stateless assistant into a persistent development partner that accumulates knowledge about your codebase over time. By automatically capturing tool usage, compressing observations into searchable memories, and intelligently retrieving relevant context, it eliminates the repetitive context-building that slows down AI-assisted development.

The system's progressive disclosure architecture combines layered retrieval through MCP tools, folder-based CLAUDE.md files, and privacy controls. Filtering before fetching full details yields approximately 10x token savings, and all data stays on your local machine.
