4 Essential Tools · 32 with --full-tools · Beta

MCP Server

Give any AI agent persistent memory across sessions, projects, and restarts. Lifecycle hooks on Claude Code, MCP tools everywhere else, zero configuration.

New: Claude Code integration

Hooks, not tools

On Claude Code, MenteDB runs through lifecycle hooks instead of MCP tools. Zero tool schemas in the model context, and memory runs on every turn, deterministically. The model cannot forget to remember.

npx mentedb-mcp@latest setup claude-code

UserPromptSubmit

Recalls context for your prompt and injects it before the model responds.

Stop

Stores the completed turn through the full cognitive pipeline.

SessionStart

Re-injects your profile and standing rules at startup, resume, and right after context compaction.

Hooks never block: if memory is unavailable the turn proceeds normally. Backend is MenteDB Cloud when logged in, otherwise a local daemon that starts automatically. MCP tools below remain available for Copilot, Cursor, and Claude Desktop.
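The non-blocking behavior described above can be pictured as a fail-open wrapper. This is a minimal sketch under stated assumptions, not MenteDB's hook code: `recall` is a hypothetical callable standing in for whatever backend lookup the hook performs, and `unavailable` simulates an unreachable backend.

```python
from typing import Callable


def context_for_prompt(recall: Callable[[str], str], prompt: str) -> str:
    """Return recalled context, or an empty string if memory is unavailable."""
    try:
        return recall(prompt) or ""
    except Exception:
        # Fail open: a memory outage must never stop the turn.
        return ""


def unavailable(prompt: str) -> str:
    # Hypothetical backend that is down.
    raise ConnectionError("memory backend unreachable")
```

With a working backend the recalled text is injected; with `unavailable` the function returns an empty string and the turn continues with no added context.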

Two commands. No config files to edit.

Run setup to configure your editor, then login to enable cloud sync, cross-device memory, and intelligent extraction.

Terminal

$ npx mentedb-mcp@latest setup copilot
$ npx mentedb-mcp@latest login

Supports: claude-code (hooks), copilot, claude, cursor

With login, MenteDB runs entirely in the cloud: contradiction detection, temporal invalidation, entity resolution, and semantic search are all handled server-side. There is no local database and there are no file locks, so multiple sessions can run simultaneously.

Optional: run offline with local mode

Local mode ships in the npx binary. It uses an embedded database with on-device Candle embeddings, fully offline, and runs automatically whenever you are not logged in to cloud.

# Force local mode even when logged in
npx mentedb-mcp@latest --local

# Optionally add your own LLM for extraction
export MENTEDB_LLM_PROVIDER=anthropic
export MENTEDB_LLM_API_KEY=sk-ant-...

Use MENTEDB_LLM_PROVIDER=ollama for free local inference with Ollama.

The setup command auto-detects your OS, finds the config path, and writes the MCP server entry. Using npx mentedb-mcp@latest always runs the latest version — no manual updates needed.

Why use the MCP server?

Not just a memory store — a full cognitive layer that makes your agent smarter over time.

Zero Config

Two commands: setup and login. Lightweight 5MB binary, no API keys to manage, no GPU needed — cloud handles everything.

Persistent Memory

Memories survive across sessions, projects, and restarts. Your agent picks up exactly where it left off — on any device.

Multi-Session

No local database locks. Run multiple editor sessions simultaneously — all connected to the same cloud memory store.

Intelligent Extraction

LLM-powered extraction turns every conversation into structured memories. Contradictions are detected at write time. Semantic search via server-side embeddings.

4 Essential Tools (32 with --full-tools) across 6 Categories

Everything an agent needs

From basic CRUD to cognitive signals, the MCP server exposes the full MenteDB surface area over the standard Model Context Protocol.

Core Memory

Store, retrieve, search, and manage individual memories.

8 tools

store_memory

Store a new memory with type, tags, and metadata

get_memory

Retrieve a specific memory by UUID

recall_memory

Recall a memory and boost its salience

search_memories

Semantic similarity search across all memories

relate_memories

Create typed edges between memories

forget_memory

Delete a single memory

forget_all

Wipe all memories (requires confirmation)

ingest_conversation

Extract structured memories from raw conversation text
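To make "semantic similarity search" concrete, here is a rough sketch of top-k retrieval by cosine similarity. It is an illustration only: the vectors, the in-memory dict, and the function names are assumptions, and the real search_memories tool may rank results differently.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query_vec, memories, k=3):
    """Return the ids of the k memories closest to query_vec.

    memories: dict mapping a memory id to its embedding vector (hypothetical store).
    """
    ranked = sorted(memories.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [memory_id for memory_id, _ in ranked[:k]]
```

For example, with embeddings `{"a": [1, 0], "b": [0, 1], "c": [0.9, 0.1]}`, a query of `[1, 0]` with `k=2` ranks `a` first and `c` second.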

Context Assembly

Build optimized context windows within a token budget.

1 tool

assemble_context

Assemble an optimized context window with delta annotations for a given query and token budget
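One simple way to think about fitting context into a token budget is greedy selection by relevance. The sketch below assumes each candidate already has a relevance score and a token count; it does not model delta annotations, and it is not a description of how assemble_context is implemented.

```python
def assemble(candidates, budget):
    """Pick the highest-scoring candidates that fit within the token budget.

    candidates: list of (text, score, tokens) tuples (hypothetical shape).
    Returns (chosen_texts, tokens_used).
    """
    chosen = []
    used = 0
    for text, _score, tokens in sorted(candidates, key=lambda c: c[1], reverse=True):
        if used + tokens <= budget:
            chosen.append(text)
            used += tokens
    return chosen, used
```

With candidates `("A", 0.9, 60)`, `("B", 0.8, 50)`, `("C", 0.5, 30)` and a budget of 100 tokens, the sketch keeps A, skips B because it would overflow the budget, and then fits C.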

Knowledge Graph

Traverse relationships, find paths, and propagate beliefs.

5 tools

get_related

Find all memories related to a given memory

find_path

Shortest path between two memories in the graph

get_subgraph

Extract local subgraph within N hops

find_contradictions

Find memories that contradict a given memory

propagate_belief

Propagate confidence changes through the graph
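The graph operations above map onto standard traversals. As a sketch, assuming the memory graph is a plain directed adjacency dict (a simplification; real edges are typed), a shortest path and an N-hop subgraph can both be found with breadth-first search:

```python
from collections import deque


def find_path(graph, start, goal):
    """Shortest path from start to goal by edge count, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = node
            if neighbor == goal:
                # Walk the predecessor links back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


def subgraph(graph, root, hops):
    """Set of nodes reachable from root within the given number of hops."""
    seen = {root}
    frontier = [root]
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in graph.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen
```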

Consolidation

Merge, compress, decay, and archive memories over time.

6 tools

consolidate_memories

Cluster similar memories and merge them

apply_decay

Apply salience decay based on time and access

compress_memory

Extract key sentences, remove filler

evaluate_archival

Evaluate memories for archival or deletion

extract_facts

Extract structured subject-predicate-object facts

gdpr_forget

GDPR-compliant data deletion with audit log
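Salience decay "based on time and access" can be illustrated with an exponential half-life that stretches as a memory is accessed more often. The formula, the one-week default half-life, and the parameter names below are assumptions chosen for illustration, not MenteDB's actual decay function.

```python
import math


def decayed_salience(salience, hours_since_access, access_count,
                     half_life_hours=168.0):
    """Exponentially decay salience; frequent access slows the decay.

    Assumption: each access lengthens the effective half-life logarithmically.
    """
    effective_half_life = half_life_hours * (1 + math.log1p(access_count))
    return salience * 0.5 ** (hours_since_access / effective_half_life)
```

Under these assumptions, a never-accessed memory loses half its salience after one week, while a memory accessed five times over the same period retains more than half.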

Cognitive

Pain signals, phantom detection, trajectory tracking, and interference.

9 tools

record_pain

Record a negative experience for future warnings

detect_phantoms

Scan for entities referenced but not in memory

resolve_phantom

Mark a knowledge gap as resolved

record_trajectory

Track conversation trajectory for predictions

predict_topics

Predict likely next topics based on trajectory

detect_interference

Find memories similar enough to cause confusion

check_stream

Check LLM output against known facts

write_inference

Run write-time inference on a memory

register_entity
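Phantom detection, as described above, amounts to finding entities that a conversation mentions but memory does not contain. A minimal sketch, assuming entities have already been extracted as strings and are compared case-insensitively (both assumptions):

```python
def detect_phantoms(mentioned, known):
    """Return mentioned entities that are absent from known, in first-seen order."""
    known_keys = {name.lower() for name in known}
    reported = set()
    phantoms = []
    for entity in mentioned:
        key = entity.lower()
        if key not in known_keys and key not in reported:
            reported.add(key)
            phantoms.append(entity)
    return phantoms
```

If memory only knows `postgres` and a conversation mentions Postgres, Redis, and Kafka, the sketch flags Redis and Kafka as knowledge gaps.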

Pipeline

Full-turn processing, cognitive state, and database stats.

3 tools

process_turn

Full pipeline: search → extract → store → infer in one call

get_cognitive_state

Pain signals, phantoms, and trajectory predictions

get_stats

Memory count, edge count, and type breakdown

Client Configuration

The setup command handles this automatically, but here are the configs if you prefer manual setup. Use alwaysAllow to skip per-tool approval prompts.

Auto Setup

$ mentedb-mcp setup

Embedding Provider

MenteDB uses local Candle embeddings by default — no API keys needed. For higher accuracy, set an OpenAI or Anthropic key:

$ export OPENAI_API_KEY=sk-...       # or
$ export ANTHROPIC_API_KEY=sk-ant-...

The server auto-detects the key and switches provider. Remove the variable to revert to local embeddings.
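The auto-detection described above can be pictured as a small environment check. In this sketch the provider labels and the precedence when both keys are set (OpenAI first) are assumptions; the page does not specify either.

```python
def pick_embedding_provider(env):
    """Choose an embedding provider from environment variables.

    env: a mapping such as os.environ. The return values are illustrative labels.
    """
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    # No key set: fall back to local Candle embeddings.
    return "candle"
```

Calling `pick_embedding_provider(os.environ)` after importing `os` would apply the same check to the real environment.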

Found a bug or have a feature request? Open an issue on GitHub

Manual Config

~/.copilot/mcp-config.json

{
  "mcpServers": {
    "mentedb": {
      "command": "npx",
      "args": ["-y", "mentedb-mcp@latest"],
      "alwaysAllow": [
        "process_turn", "store_memory",
        "search_memories", "forget_memory"
      ]
    }
  }
}
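If you edit the file by hand and already have other MCP servers configured, the entry should be merged rather than pasted over the whole file. The script below is a sketch of that merge, not what the setup command does internally; the helper name `add_mentedb` is hypothetical, and the entry is copied from the config above.

```python
import json
from pathlib import Path

# Entry copied from the manual config shown above.
MENTEDB_ENTRY = {
    "command": "npx",
    "args": ["-y", "mentedb-mcp@latest"],
    "alwaysAllow": [
        "process_turn", "store_memory",
        "search_memories", "forget_memory",
    ],
}


def add_mentedb(config_path: Path) -> dict:
    """Add or replace the mentedb server entry, keeping other servers intact."""
    if config_path.exists():
        config = json.loads(config_path.read_text())
    else:
        config = {}
    config.setdefault("mcpServers", {})["mentedb"] = MENTEDB_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

For the Copilot path shown above, this would be called as `add_mentedb(Path.home() / ".copilot" / "mcp-config.json")`.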

How it works

One tool call per turn. The server handles the rest.

Agent calls process_turn

Once per conversation turn, the agent sends the user message and its response. The server handles everything else automatically.

Server stores, searches & infers

The conversation is stored as searchable episodic memory. Relevant past context is retrieved. Write-time inference detects contradictions, creates edges, and updates confidence scores. Facts are extracted. Phantom entities are flagged.

LLM stores important facts

The LLM recognises preferences, decisions, and corrections during the conversation and calls store_memory with the right type and tags. No second model needed — the conversation LLM is the extraction engine.

Context returned with full signals

Relevant past context, pain warnings, contradiction alerts, phantom counts, topic predictions — all in one round trip. Auto-maintenance (decay, archival, consolidation) runs periodically in the background.
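On the wire, the per-turn call described above is an ordinary MCP `tools/call` request. The sketch below builds one as a JSON-RPC 2.0 message. The envelope follows the MCP specification, but the argument names `user_message` and `assistant_response` are assumptions; check the tool's published input schema for the real field names.

```python
import json


def process_turn_request(request_id, user_message, assistant_response):
    """Build a JSON-RPC 2.0 tools/call request for the process_turn tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "process_turn",
            # Hypothetical argument names, for illustration only.
            "arguments": {
                "user_message": user_message,
                "assistant_response": assistant_response,
            },
        },
    }


if __name__ == "__main__":
    request = process_turn_request(1, "Use tabs, not spaces.", "Noted.")
    print(json.dumps(request, indent=2))
```

In practice the MCP client in your editor sends this message for you; the sketch only shows the shape of a single round trip.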

Built-in agent instructions

The setup command injects instructions that teach your LLM how to use MenteDB. No prompt engineering required.

process_turn — every turn

Called on every conversation turn. Searches for relevant past context, stores the conversation as episodic memory, runs write-time inference (contradictions, edges, confidence), extracts facts and links them as edges, detects phantom entities, checks responses against known facts, and tracks trajectory. When an LLM provider is configured, also triggers sleeptime enrichment — automatically extracting semantic facts, linking entities, detecting communities, and building a user profile from accumulated conversations.

context retrieval · write inference · fact extraction · phantom detection · pain warnings · sleeptime enrichment

store_memory — when it matters

The LLM calls this when it notices important information — preferences, decisions, corrections, procedures. The conversation LLM is the extraction engine. No second model needed.

semantic facts · corrections · anti-patterns

What the setup command adds to your agent

① process_turn every turn (MANDATORY) — searches context, stores episodic memory, runs write-time inference, extracts and links facts, detects phantoms, checks for contradictions, tracks trajectory. Returns context the agent MUST use in its response.

② USE returned context (MANDATORY) — reference past memories, warn on pain signals, flag contradictions, anticipate next topics

③ store_memory when the LLM sees preferences, decisions, corrections, or procedures

④ Memory types — semantic, episodic, procedural, correction, anti_pattern, reasoning

⑤ Scope — contextual (similarity-based) or always (returned every turn for critical rules)

⑥ Tags — project names, topics, context labels for structured recall

⑦ Auto-maintenance — decay every 50 turns, archival every 100, consolidation every 200

⑧ Resilience — even if process_turn fails on a turn, always retry on the next turn. Never skip because of a prior error.

These instructions are written to a copilot-instructions.md or equivalent file for your client. The LLM reads them at session start and follows them automatically.
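The auto-maintenance cadence listed above (decay every 50 turns, archival every 100, consolidation every 200) can be expressed as a simple turn counter check. This sketch assumes turns are counted from 1 and that tasks fire on exact multiples; the actual scheduler may behave differently.

```python
def maintenance_due(turn):
    """Return the maintenance tasks due on the given turn number."""
    tasks = []
    if turn <= 0:
        return tasks
    if turn % 50 == 0:
        tasks.append("decay")
    if turn % 100 == 0:
        tasks.append("archival")
    if turn % 200 == 0:
        tasks.append("consolidation")
    return tasks
```

Under these assumptions, turn 50 triggers only decay, turn 100 triggers decay and archival, and turn 200 triggers all three.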

Ready to give your agent a mind?

Install the MCP server and start building agents that remember.
