Your AI assistant forgets everything when you start a new chat. After 50 tool uses, context explodes to 500k+ tokens: Claude slows down, you restart, and now it remembers nothing. You spend 10 minutes re-explaining your architecture. Again.
MCP Memory Service solves this.
It automatically captures your project context, architecture decisions, and code patterns. When you start fresh sessions, your AI already knows everything: no re-explaining, no context loss, no wasted time.
Claude Code · Gemini Code Assist · Aider · GitHub Copilot CLI · Amp · Continue · Zed · Cody
Claude Desktop · VS Code · Cursor · Windsurf · Raycast · JetBrains · Sourcegraph · Qodo
Works seamlessly with any MCP-compatible client - whether you code in the terminal, IDE, or browser.
Express Install (recommended for most users):
pip install mcp-memory-service
# Auto-configure for Claude Desktop (macOS/Linux)
python -m mcp_memory_service.scripts.installation.install --quick

What just happened?
- ✅ Installed memory service
- ✅ Configured optimal backend (SQLite)
- ✅ Set up Claude Desktop integration
- ✅ Enabled automatic context capture
Next: Restart Claude Desktop. Your AI now remembers everything across sessions.
Alternative: PyPI + Manual Configuration
pip install mcp-memory-service

Then add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS):
{
"mcpServers": {
"memory": {
"command": "memory",
"args": ["server"]
}
}
}

Advanced: Custom Backends & Team Setup
For production deployments, team collaboration, or cloud sync:
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
python scripts/installation/install.py

Choose from:
- SQLite (local, fast, single-user)
- Cloudflare (cloud, multi-device sync)
- Hybrid (best of both: 5ms local + background cloud sync)
Without MCP Memory Service:

| Session 1 | Session 2 (Fresh Start) |
|---|---|
| You: "We're building a Next.js app with Prisma and tRPC" | AI: "What's your tech stack?" ❌ |
| AI: "Got it, I see you're using App Router" | You: Explain the architecture again for 10 minutes |
| You: "Add authentication with NextAuth" | AI: "Should I use Pages Router or App Router?" ❌ |
With MCP Memory Service:

| Session 1 | Session 2 (Fresh Start) |
|---|---|
| You: "We're building a Next.js app with Prisma and tRPC" | AI: "I remember - Next.js App Router with Prisma and tRPC. What should we build?" ✅ |
| AI: "Got it, I see you're using App Router" | You: "Add OAuth login" |
| You: "Add authentication with NextAuth" | AI: "I'll integrate NextAuth with your existing Prisma setup." ✅ |
Result: Zero re-explaining. Zero context loss. Just continuous, intelligent collaboration.
- Persistent Memory - Context survives across sessions with semantic search
- Smart Retrieval - Finds relevant context automatically using AI embeddings
- 5ms Speed - Instant context injection with no perceptible latency
- Multi-Client - Works across 13+ AI applications
- Cloud Sync - Optional Cloudflare backend for team collaboration
- Privacy-First - Local-first; you control your data
- Web Dashboard - Visualize and manage memories at http://localhost:8000
Hybrid Backend Maintenance Enhancement
- Hybrid Association Cleanup Script - New maintenance script for multi-PC setups with hybrid backend
- Drift Prevention - Removes association memories from BOTH Cloudflare D1 AND local SQLite
- Robust Error Handling - Graceful Vectorize API error handling (JSON decode errors, network timeouts)
- Flexible Modes - --skip-vectorize, --cloudflare-only, and --local-only options with dry-run support
Previous Releases:
- v8.52.1 - Windows Embedding Fallback & Script Portability (DLL init failure fix, MCP_HTTP_PORT support)
- v8.52.0 - Time-of-Day Emoji Icons (8 time-segment indicators, dark mode support, automatic timezone)
- v8.51.0 - Graph Database Architecture (30x query performance, 97% storage reduction for associations)
- v8.50.1 - Critical Bug Fixes (MCP_EMBEDDING_MODEL fix, installation script backend support, i18n quality analytics complete)
- v8.50.0 - Fallback Quality Scoring (DeBERTa + MS-MARCO hybrid, technical content rescue, 20/20 tests passing)
- v8.49.0 - DeBERTa Quality Classifier (absolute quality assessment, eliminates self-matching bias)
- v8.48.4 - Cloudflare D1 Drift Detection Performance (10-100x faster queries, numeric comparison fix)
- v8.48.3 - Code Execution Hook Fix - 75% token reduction now working (fixed time_filter parameter, Python warnings, venv detection)
- v8.48.2 - HTTP Server Auto-Start & Time Parser Improvements (smart service management, "last N periods" support)
- v8.48.1 - Critical Hotfix - Startup Failure Fix (redundant calendar import removed, immediate upgrade required)
- v8.48.0 - CSV-Based Metadata Compression (78% size reduction, 100% sync success, metadata validation)
- v8.47.1 - ONNX Quality Evaluation Bug Fixes (self-match fix, association pollution, sync queue overflow, realistic distribution)
- v8.47.0 - Association-Based Quality Boost (connection-based enhancement, network effect leverage, metadata persistence)
- v8.46.3 - Quality Score Persistence Fix (ONNX scores in hybrid backend, metadata normalization)
- v8.46.2 - Session-Start Hook Crash Fix + Hook Installer Improvements (client-side tag filtering, isolated version metadata)
- v8.46.1 - Windows Hooks Installer Fix + Quality System Integration (UTF-8 console configuration, backend quality scoring)
- v8.45.3 - ONNX Ranker Model Export Fix (automatic model export, offline mode support, 7-16ms CPU performance)
- v8.45.2 - Dashboard Dark Mode Consistency Fixes (global CSS overrides, Chart.js dark mode support)
- v8.45.1 - Quality System Test Infrastructure Fixes (HTTP API router, storage retrieval, async test client)
- v8.45.0 - Memory Quality System - AI-Driven Automatic Quality Scoring (ONNX-powered local SLM, multi-tier fallback, quality-based retention)
- v8.44.0 - Multi-Language Expansion (Japanese, Korean, German, French, Spanish - 359 keys each, complete i18n coverage)
- v8.43.0 - Internationalization & Quality Automation (English/Chinese i18n, Claude branch automation, quality gates)
- v8.42.1 - MCP Resource Handler Fix (AttributeError with Pydantic AnyUrl objects)
- v8.42.0 - Memory Awareness Enhancements (visible memory injection, quality session summaries, LLM-powered summarization)
- v8.41.2 - Hook Installer Utility File Deployment (ALL 14 utilities copied, future-proof glob pattern)
- v8.41.1 - Context Formatter Memory Sorting (recency sorting within categories, newest first)
- v8.41.0 - Session Start Hook Reliability Improvements (error suppression, clean output, memory filtering, classification fixes)
- v8.40.0 - Session Start Version Display (automatic version comparison, PyPI status labels)
- v8.39.1 - Dashboard Analytics Bug Fixes: Three critical fixes (top tags filtering, recent activity display, storage report fields)
- v8.39.0 - Performance Optimization: Storage-layer date-range filtering (10x faster analytics, 97% data transfer reduction)
- v8.38.1 - Critical Hotfix: HTTP MCP JSON-RPC 2.0 compliance fix (Claude Code/Desktop connection failures resolved)
- v8.38.0 - Code Quality: Phase 2b COMPLETE (~176-186 lines duplicate code eliminated, 10 consolidations)
- v8.37.0 - Code Quality: Phase 2a COMPLETE (5 duplicate high-complexity functions eliminated)
- v8.36.1 - Critical Hotfix: HTTP server startup crash fix (forward reference error in analytics.py)
- v8.36.0 - Code Quality: Phase 2 COMPLETE (100% of target achieved, -39 complexity points)
- v8.35.0 - Code Quality: Phase 2 Batch 1 (install.py, cloudflare.py, -15 complexity points)
- v8.34.0 - Code Quality: Phase 2 Complexity Reduction (analytics.py refactored, 11 → 6-7 complexity)
- v8.33.0 - Critical Installation Bug Fix + Code Quality Improvements (dead code cleanup, automatic MCP setup)
- v8.32.0 - Code Quality Excellence: pyscn Static Analysis Integration (multi-layer QA workflow)
- v8.31.0 - Revolutionary Batch Update Performance (21,428x faster memory consolidation)
- v8.30.0 - Analytics Intelligence: Adaptive Charts & Critical Data Fixes (accurate trend visualization)
- v8.28.1 - Critical HTTP MCP Transport JSON-RPC 2.0 Compliance Fix (Claude Code compatibility)
- v8.28.0 - Cloudflare AND/OR Tag Filtering (unified search API, 3-5x faster hybrid sync)
- v8.27.1 - Critical Hotfix: Timestamp Regression (created_at preservation during metadata sync)
- v8.26.0 - Revolutionary MCP Performance (534,628x faster tools, 90%+ cache hit rate)
- v8.25.0 - Hybrid Backend Drift Detection (automatic metadata sync, bidirectional awareness)
- v8.24.4 - Code Quality Improvements from Gemini Code Assist (regex sanitization, DOM caching)
- v8.24.3 - Test Coverage & Release Agent Improvements (tag+time filtering tests, version history fix)
- v8.24.2 - CI/CD Workflow Fixes (bash errexit handling, exit code capture)
- v8.24.1 - Test Infrastructure Improvements (27 test failures resolved, 63% → 71% pass rate)
- v8.24.0 - PyPI Publishing Enabled (automated package publishing via GitHub Actions)
- v8.23.1 - Stale Virtual Environment Prevention System (6-layer developer protection)
- v8.23.0 - Consolidation Scheduler via Code Execution API (88% token reduction)
Full Details: CHANGELOG.md | All Releases
- Installation Guide - Detailed setup instructions
- Configuration Guide - Backend options and customization
- Architecture Overview - How it works under the hood
- Team Setup Guide - OAuth and cloud collaboration
- Troubleshooting - Common issues and solutions
- API Reference - Programmatic usage
- Wiki - Complete documentation
- AI-powered documentation assistant
We welcome contributions! See CONTRIBUTING.md for guidelines.
Quick Development Setup:
git clone https://github.com/doobidoo/mcp-memory-service.git
cd mcp-memory-service
pip install -e . # Editable install
pytest tests/ # Run test suite

If this saves you time, give us a star!
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Wiki: Documentation Wiki
Apache 2.0 - see LICENSE for details.
Never explain your project to AI twice.
Start using MCP Memory Service today.
Updating from an older version? Scripts have been reorganized for better maintainability:
- Recommended: Use python -m mcp_memory_service.server in your Claude Desktop config (no path dependencies!)
- Alternative 1: Use uv run memory server with UV tooling
- Alternative 2: Update the path from scripts/run_memory_server.py to scripts/server/run_memory_server.py
- Backward compatible: The old path still works with a migration notice
On your first run, you'll see some warnings that are completely normal:
- "WARNING: Failed to load from cache: No snapshots directory" - The service is checking for cached models (first-time setup)
- "WARNING: Using TRANSFORMERS_CACHE is deprecated" - Informational warning, doesn't affect functionality
- Model download in progress - The service automatically downloads a ~25MB embedding model (takes 1-2 minutes)
These warnings disappear after the first successful run. The service is working correctly! For details, see our First-Time Setup Guide.
sqlite-vec may not have pre-built wheels for Python 3.13 yet. If installation fails:
- The installer will automatically try multiple installation methods
- Consider using Python 3.12 for the smoothest experience: brew install [email protected]
- Alternative: Use the Cloudflare backend with --storage-backend cloudflare
- See the Troubleshooting Guide for details
macOS users may encounter enable_load_extension errors with sqlite-vec:
- System Python on macOS lacks SQLite extension support by default
- Solution: Use Homebrew Python: brew install python && rehash
- Alternative: Use pyenv: PYTHON_CONFIGURE_OPTS='--enable-loadable-sqlite-extensions' pyenv install 3.12.0
- Fallback: Use the Cloudflare or Hybrid backend: --storage-backend cloudflare or --storage-backend hybrid
- See the Troubleshooting Guide for details
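To check whether your interpreter is affected before switching Pythons, the stdlib-only snippet below tests for loadable-extension support (a quick diagnostic sketch; it does not modify anything):

```python
# Diagnostic sketch: does this Python's sqlite3 build support loadable
# extensions (required by sqlite-vec)?
import sqlite3

conn = sqlite3.connect(":memory:")
if hasattr(conn, "enable_load_extension"):
    print("sqlite3 supports loadable extensions")
else:
    print("sqlite3 built without extension support - see the workarounds above")
conn.close()
```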
Intelligent Context Injection - See how the memory service automatically surfaces relevant information at session start:
What you're seeing:
- Automatic memory injection - 8 relevant memories found from 2,526 total
- Smart categorization - Recent Work, Current Problems, Additional Context
- Git-aware analysis - Recent commits and keywords automatically extracted
- Relevance scoring - Top memories scored at 100% (today), 89% (8d ago), 84% (today)
- Fast retrieval - SQLite-vec backend with 5ms read performance
- Background sync - Hybrid backend syncing to Cloudflare
Result: Claude starts every session with full project context - no manual prompting needed.
Visit our comprehensive Wiki for detailed guides:
- Natural Memory Triggers v7.1.3 Guide - Intelligent automatic memory awareness
  - 85%+ trigger accuracy with semantic pattern detection
  - Multi-tier performance (50ms instant → 150ms fast → 500ms intensive)
  - CLI management system for real-time configuration
  - Git-aware context integration for enhanced relevance
  - Zero-restart installation with dynamic hook loading
- OAuth 2.1 Setup Guide - NEW! Complete OAuth 2.1 Dynamic Client Registration guide
- Integration Guide - Claude Desktop, Claude Code HTTP transport, VS Code, and more
- Advanced Configuration - Updated! OAuth security, enterprise features
- Memory Consolidation System Guide - NEW! Automated memory maintenance with real-world performance metrics
  - Dream-inspired consolidation (decay scoring, association discovery, compression, archival)
  - 24/7 automatic scheduling (daily/weekly/monthly via HTTP server)
  - Token-efficient Code Execution API (90% token reduction vs MCP tools)
  - Real-world performance data (4-6 min for 2,495 memories with hybrid backend)
  - Three manual trigger methods (HTTP API, MCP tools, Python API)
- Installation Guide - Complete installation for all platforms and use cases
- Platform Setup Guide - Windows, macOS, and Linux optimizations
- Performance Optimization - Speed up queries, optimize resources, scaling
- Development Reference - Claude Code hooks, API reference, debugging
- Troubleshooting Guide - Updated! OAuth troubleshooting + common issues
- FAQ - Frequently asked questions
- Examples - Practical code examples and workflows
- Repository Statistics - 10 months of development metrics, activity patterns, and insights
- Architecture Specs - Search enhancement specifications and design documents
- Development Docs - AI agent instructions, release checklist, refactoring notes
- Deployment Guides - Docker, dual-service, and production deployment
- Additional Guides - Storage backends, migration, mDNS discovery
- Hybrid Backend - Fast 5ms local SQLite + background Cloudflare sync (RECOMMENDED default)
- Zero user-facing latency for cloud operations
- Automatic multi-device synchronization
- Graceful offline operation
- Zero Database Locks - Concurrent HTTP + MCP server access works flawlessly
  - Auto-configured SQLite pragmas (busy_timeout=15000, cache_size=20000) - see the sketch after this list
  - WAL mode with proper multi-client coordination
  - Tested: 5/5 concurrent writes succeeded with no errors
- Auto-Configuration - Installer handles everything
- SQLite pragmas for concurrent access
- Cloudflare credentials with connection testing
- Claude Desktop integration with hybrid backend
- Graceful fallback to sqlite_vec if cloud setup fails
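For illustration only (this is not the service's internal code), the documented pragma values map onto a plain sqlite3 connection like this; the database path is an assumption:

```python
# Illustrative sketch: apply the pragmas this README documents
# (busy_timeout=15000, cache_size=20000, WAL) to a raw SQLite connection.
import sqlite3

conn = sqlite3.connect("memory.db")            # path illustrative
conn.execute("PRAGMA journal_mode=WAL")        # multi-client coordination
conn.execute("PRAGMA busy_timeout=15000")      # wait up to 15 s on locks
conn.execute("PRAGMA cache_size=20000")        # larger page cache
conn.close()
```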
- Interactive Web UI - Drag-and-drop document upload with real-time progress
- Multiple Formats - PDF, TXT, MD, JSON with intelligent chunking
- Document Viewer - Browse chunks, view metadata, search content
- Smart Tagging - Automatic tagging with length validation (max 100 chars)
- Optional semtools - Enhanced PDF/DOCX/PPTX parsing with LlamaParse
- Security Hardened - Path traversal protection, XSS prevention, input validation
- 7 New Endpoints - Complete REST API for document management
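The upload endpoint can also be driven from a script. A minimal sketch, assuming a locally running HTTP server on port 8888 (the dashboard port used in the examples below) and an illustrative file name; it mirrors the curl upload example later in this README:

```python
# Minimal sketch: upload a document with tags via the HTTP API.
import requests

with open("document.pdf", "rb") as f:          # file name illustrative
    resp = requests.post(
        "http://127.0.0.1:8888/api/documents/upload",
        files={"file": f},                         # multipart file field
        data={"tags": "documentation,reference"},  # comma-separated tags
        timeout=60,
    )
resp.raise_for_status()
print(resp.json())   # response shape depends on the server version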
- OAuth 2.1 Dynamic Client Registration - RFC 7591 & RFC 8414 compliant
- Claude Code HTTP Transport - Zero-configuration team collaboration
- JWT Authentication - Enterprise-grade security with scope validation
- Auto-Discovery Endpoints - Seamless client registration and authorization
- Multi-Auth Support - OAuth + API keys + optional anonymous access
- Semantic search with vector embeddings
- Natural language time queries ("yesterday", "last week")
- Tag-based organization with smart categorization
- Memory consolidation with dream-inspired algorithms
- Document-aware search - Query across uploaded documents and manual memories
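For programmatic retrieval, the /api/search endpoint shown in the curl examples below can be called from Python as well. A minimal sketch; the host/port and the shape of the response are assumptions:

```python
# Minimal sketch: semantic search across stored memories and ingested documents.
import requests

resp = requests.post(
    "http://127.0.0.1:8888/api/search",
    json={"query": "authentication flow", "limit": 10},
    timeout=30,
)
resp.raise_for_status()
for hit in resp.json().get("results", []):   # result key name is an assumption
    print(hit)
```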
- Claude Desktop - Native MCP integration
- Claude Code - HTTP transport + Memory-aware development with hooks
  - Windows Support: /session-start command for manual session initialization (workaround for issue #160)
  - macOS/Linux: Full automatic SessionStart hooks + slash command
- VS Code, Cursor, Continue - IDE extensions
- 13+ AI applications - REST API compatibility
- Hybrid (RECOMMENDED) - Fast local SQLite + background Cloudflare sync (v8.9.0 default)
- 5ms local reads with zero user-facing latency
- Multi-device synchronization
- Zero database locks with auto-configured pragmas
- Automatic backups and cloud persistence
- SQLite-vec - Local-only storage (lightweight ONNX embeddings, 5ms reads)
- Good for single-user offline use
- No cloud dependencies
- Cloudflare - Cloud-only storage (global edge distribution with D1 + Vectorize)
- Network-dependent performance
Note: All heavy ML dependencies (PyTorch, sentence-transformers) are now optional to dramatically reduce build times and image sizes. SQLite-vec uses lightweight ONNX embeddings by default. Install with --with-ml for full ML capabilities.
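As a minimal sketch of backend selection, the documented environment variable can be set before launching the documented module entry point; the chosen backend value and the use of subprocess here are illustrative assumptions, not the only way to run the server:

```python
# Minimal launcher sketch: pick a storage backend via MCP_MEMORY_STORAGE_BACKEND
# and start the server through its module entry point.
import os
import subprocess
import sys

env = os.environ.copy()
env["MCP_MEMORY_STORAGE_BACKEND"] = "sqlite_vec"   # or "hybrid" / "cloudflare"

subprocess.run(
    [sys.executable, "-m", "mcp_memory_service.server"],
    env=env,
    check=True,
)
```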
- Cross-platform - Windows, macOS, Linux
- Service installation - Auto-start background operation
- HTTPS/SSL - Secure connections with OAuth 2.1
- Docker support - Easy deployment with team collaboration
- Interactive Dashboard - Web UI at http://127.0.0.1:8888/ for complete management
# Start server with web interface
uv run memory server --http
# Access interactive dashboard
open http://127.0.0.1:8888/
# Upload documents via CLI
curl -X POST http://127.0.0.1:8888/api/documents/upload \
-F "[email protected]" \
-F "tags=documentation,reference"
# Search document content
curl -X POST http://127.0.0.1:8888/api/search \
-H "Content-Type: application/json" \
-d '{"query": "authentication flow", "limit": 10}'# Start OAuth-enabled server for team collaboration
export MCP_OAUTH_ENABLED=true
uv run memory server --http
# Claude Code team members connect via HTTP transport
claude mcp add --transport http memory-service http://your-server:8000/mcp
# Automatic OAuth discovery, registration, and authentication

# Store a memory
uv run memory store "Fixed race condition in authentication by adding mutex locks"
# Search for relevant memories
uv run memory recall "authentication race condition"
# Search by tags
uv run memory search --tags python debugging
# Check system health (shows OAuth status)
uv run memory health

Recommended approach - Add to your Claude Desktop config (~/.claude/config.json):
{
"mcpServers": {
"memory": {
"command": "python",
"args": ["-m", "mcp_memory_service.server"],
"env": {
"MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
}
}
}
}

Alternative approaches:
// Option 1: UV tooling (if using UV)
{
"mcpServers": {
"memory": {
"command": "uv",
"args": ["--directory", "/path/to/mcp-memory-service", "run", "memory", "server"],
"env": {
"MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
}
}
}
}
// Option 2: Direct script path (v6.17.0+)
{
"mcpServers": {
"memory": {
"command": "python",
"args": ["/path/to/mcp-memory-service/scripts/server/run_memory_server.py"],
"env": {
"MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
}
}
}
}

Hybrid Backend (v8.9.0+ RECOMMENDED):
# Hybrid backend with auto-configured pragmas
export MCP_MEMORY_STORAGE_BACKEND=hybrid
export MCP_MEMORY_SQLITE_PRAGMAS="busy_timeout=15000,cache_size=20000"
# Cloudflare credentials (required for hybrid)
export CLOUDFLARE_API_TOKEN="your-token"
export CLOUDFLARE_ACCOUNT_ID="your-account"
export CLOUDFLARE_D1_DATABASE_ID="your-db-id"
export CLOUDFLARE_VECTORIZE_INDEX="mcp-memory-index"
# Enable HTTP API
export MCP_HTTP_ENABLED=true
export MCP_HTTP_PORT=8000
# Security
export MCP_API_KEY="your-secure-key"

SQLite-vec Only (Local):
# Local-only storage
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
export MCP_MEMORY_SQLITE_PRAGMAS="busy_timeout=15000,cache_size=20000"

┌─────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│   AI Clients    │    │   MCP Memory     │    │ Storage Backend  │
│                 │    │   Service v8.9   │    │                  │
│ • Claude Desktop│───►│ • MCP Protocol   │───►│ • Hybrid         │
│ • Claude Code   │    │ • HTTP Transport │    │   (5ms local +   │
│   (HTTP/OAuth)  │    │ • OAuth 2.1 Auth │    │   cloud sync)    │
│ • VS Code       │    │ • Memory Store   │    │ • SQLite-vec     │
│ • Cursor        │    │ • Semantic Search│    │ • Cloudflare     │
│ • 13+ AI Apps   │    │ • Doc Ingestion  │    │                  │
│ • Web Dashboard │    │ • Zero DB Locks  │    │ Zero DB Locks    │
│   (Port 8888)   │    │                  │    │ Auto-Config      │
└─────────────────┘    └──────────────────┘    └──────────────────┘
mcp-memory-service/
├── src/mcp_memory_service/   # Core application
│   ├── models/               # Data models
│   ├── storage/              # Storage backends
│   ├── web/                  # HTTP API & dashboard
│   └── server.py             # MCP server
├── scripts/                  # Utilities & installation
├── tests/                    # Test suite
└── tools/docker/             # Docker configuration
- Fork the repository
- Create a feature branch
- Make your changes with tests
- Submit a pull request
See CONTRIBUTING.md for detailed guidelines.
- Documentation: Wiki - Comprehensive guides
- Bug Reports: GitHub Issues
- Discussions: GitHub Discussions
- Troubleshooting: Troubleshooting Guide
- Configuration Validator: Run python scripts/validation/validate_configuration_complete.py to check your setup
- Backend Sync Tools: See scripts/README.md for Cloudflare ↔ SQLite sync
Real-world metrics from active deployments:
- 1700+ memories stored and actively used across teams
- 5ms local reads with hybrid backend (v8.9.0)
- Zero database locks with concurrent HTTP + MCP access (v8.9.0)
- Tested: 5/5 concurrent writes succeeded
- Auto-configured pragmas prevent lock errors
- <500ms response time for semantic search (local & HTTP transport)
- 65% token reduction in Claude Code sessions with OAuth collaboration
- 96.7% faster context setup (15min → 30sec)
- 100% knowledge retention across sessions and team members
- Zero-configuration setup success rate: 98.5% (OAuth + hybrid backend)
Verified MCP Server
Featured AI Tool
- Production-tested across 13+ AI applications
- Community-driven with real-world feedback and improvements
Apache License 2.0 - see LICENSE for details.
Ready to supercharge your AI workflow?
Start with our Installation Guide or explore the Wiki for comprehensive documentation.
Transform your AI conversations into persistent, searchable knowledge that grows with you.