How We Built It · 6 min read

I Gave Claude Code a Permanent Brain — Free, Local, 60 Seconds

Your AI agent forgets everything between sessions. Here's how to fix that with one command — no cloud, no API keys, no cost.

Varun Pratap Bhardwaj

Every Claude Code session starts the same way. You explain your project. Again. For the tenth time. You describe the folder structure, the tech stack, the conventions your team follows, the bugs you already fixed yesterday. The agent listens patiently, responds brilliantly — and then forgets everything the moment you close the terminal.

If you have spent any time on r/ClaudeAI, you know this is the number one complaint. "Why does Claude forget everything?" "I spend 10 minutes of every session re-explaining context." "Is there a way to make it remember?"

There is. I built it.

The Problem: AI Agents Have Amnesia

Large language models do not have persistent memory. Every session starts from a clean slate. The context window is powerful — Claude can hold up to a million tokens in a single conversation — but the moment that conversation ends, it is gone. No learning carries over. No decisions persist. No mistakes are remembered.

For a quick question-and-answer session, this is fine. For serious development work — where you are building a product across weeks and months — it is crippling. You end up maintaining handoff documents, writing elaborate CLAUDE.md files, and still losing context at the edges.

The fundamental issue is architectural. LLMs are stateless by design. To make them stateful, you need an external memory layer. And that layer needs to be smart enough to know what matters, fast enough to not slow you down, and private enough that your proprietary code never leaves your machine.

The 60-Second Fix

npm install superlocalmemory

That is it. One command. SuperLocalMemory installs as an MCP server that integrates directly with Claude Code, Cursor, VS Code, Windsurf, and 17+ other MCP-compatible clients. No cloud account. No API keys. No monthly subscription.

Once installed, your agent starts remembering. It observes what you work on, what decisions you make, what patterns matter in your codebase. The next time you open a session, that context is already there — injected automatically before your first message.

Here is what changes:

  • Session 1: You explain your project uses Next.js 16, Turso for the database, and Clerk for auth. You set up the folder structure and define conventions.
  • Session 2: The agent already knows all of this. It remembers the schema you designed yesterday, the bug you fixed in the auth middleware, and the naming convention you prefer for API routes.
  • Session 10: The agent has a deep understanding of your codebase — architectural decisions, past mistakes, team preferences, deployment quirks. It does not just respond to your current question. It responds with the full weight of everything you have built together.
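The "context is already there" step can be sketched in a few lines. Everything below is illustrative: the function name, the character budget, and the sample memories are assumptions for the demo, not SuperLocalMemory's actual API.

```python
def build_context_preamble(memories, budget_chars=800):
    """Hypothetical sketch: fold the top stored memories into a
    preamble that is prepended before the user's first message."""
    lines = ["Known project context:"]
    used = len(lines[0])
    for m in memories:  # assumed already ranked by relevance
        line = f"- {m}"
        if used + len(line) > budget_chars:
            break  # stay within the context budget
        lines.append(line)
        used += len(line)
    return "\n".join(lines)

memories = [
    "Stack: Next.js 16, Turso, Clerk",
    "API routes use kebab-case naming",
    "Auth middleware bug fixed yesterday",
]
print(build_context_preamble(memories))
```

The budget cap matters: injected memory competes with your actual conversation for context-window space, so a real system has to rank hard and cut early.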

This is not retrieval-augmented generation bolted onto a chatbot. This is a cognitive memory system designed from the ground up for AI agents.

How It Works

SuperLocalMemory uses information geometry — specifically Fisher-Rao distance on statistical manifolds — to score the importance of every memory. Not all information is equal. The framework your project uses matters more than the color of a button you changed three weeks ago. Fisher-Rao scoring captures this by measuring how much a piece of information changes the statistical structure of what the agent knows.

Retrieval happens through five independent channels:

1. Temporal — What happened recently? Recency matters for active development.

2. Semantic — What is conceptually related to the current query? Embedding-based similarity search.

3. Episodic — What happened in similar past sessions? Pattern matching across your development history.

4. Graph — What is structurally connected? Memory relationships form a knowledge graph that captures how concepts relate.

5. Embedding — Dense vector retrieval for precise matching when the other channels surface too much noise.

These five channels fire in parallel, and results are fused using a weighted scoring system. The agent gets the most relevant context, not just the most recent context.
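A minimal sketch of that fusion step, assuming per-channel scores have already been computed. The channel names come from the list above; the weights and scores are made-up placeholders, not the library's actual values.

```python
from collections import defaultdict

# Hypothetical channel weights -- illustration only.
WEIGHTS = {"temporal": 0.15, "semantic": 0.30, "episodic": 0.15,
           "graph": 0.20, "embedding": 0.20}

def fuse(channel_results):
    """channel_results: {channel: [(memory_id, score), ...]}
    Returns memory ids ranked by weighted sum of channel scores."""
    fused = defaultdict(float)
    for channel, hits in channel_results.items():
        w = WEIGHTS[channel]
        for memory_id, score in hits:
            fused[memory_id] += w * score
    return sorted(fused, key=fused.get, reverse=True)

results = {
    "temporal":  [("m3", 0.9), ("m1", 0.2)],
    "semantic":  [("m1", 0.8), ("m2", 0.6)],
    "episodic":  [("m2", 0.7)],
    "graph":     [("m1", 0.5), ("m3", 0.4)],
    "embedding": [("m1", 0.9), ("m2", 0.3)],
}
print(fuse(results))  # most relevant memory first
```

Note that m3 wins the temporal channel outright but still ranks last overall: weighted fusion is what lets "most relevant" beat "most recent".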

Everything runs locally. Your memories are stored on your filesystem — ~/.superlocalmemory/ by default. No data leaves your machine. No cloud sync. No telemetry. You own your data completely.

The Math Behind It (Brief Version)

For the technically curious: SuperLocalMemory models the memory space as a Riemannian manifold where each memory is a point and the Fisher information metric defines distances between them. When new information arrives, the system computes its Fisher-Rao distance from existing memories to determine if it is genuinely novel or redundant.

Contradiction detection uses sheaf cohomology — a technique from algebraic topology. When two memories disagree (say, you changed your database from PostgreSQL to Turso mid-project), the system detects the topological obstruction and resolves it by keeping the more recent, higher-importance memory while preserving the historical record.
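The resolution policy can be illustrated without any topology. This toy stand-in simply prefers the newer, higher-importance memory while keeping the loser as history; all names and fields are hypothetical, not the library's schema.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    fact: str         # what the memory is about, e.g. "database"
    value: str        # the claimed value, e.g. "PostgreSQL"
    timestamp: float  # seconds since epoch
    importance: float

def resolve(a, b):
    """If two memories about the same fact disagree, keep the
    newer (then higher-importance) one as active and return the
    other so it can be archived rather than deleted."""
    if a.value == b.value:
        return a, None  # no contradiction
    winner = max(a, b, key=lambda m: (m.timestamp, m.importance))
    loser = b if winner is a else a
    return winner, loser

old = Memory("database", "PostgreSQL", timestamp=1_000, importance=0.8)
new = Memory("database", "Turso",      timestamp=2_000, importance=0.9)
active, archived = resolve(old, new)
print(active.value, "| archived:", archived.value)
```

Returning the loser instead of dropping it mirrors the "preserving the historical record" behavior described above: the agent can still answer "why did we leave PostgreSQL?"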

This is not marketing language. These are real algorithms with real papers behind them. If you want the full depth, the three published papers are on arXiv.

Real Results

Numbers matter more than claims. On the LoCoMo benchmark — the standard evaluation for long-context conversational memory — SuperLocalMemory scores 74.8%, compared to Mem0's 64.2%. That is a 10.6 percentage point advantage, and it comes entirely from local computation. No cloud APIs. No expensive inference calls.

The benchmark tests what matters in practice: can the system recall specific facts from long conversations, track entity changes over time, handle multi-hop reasoning across sessions, and detect when information has been updated or contradicted.

Why Local-First Matters

If you work at any company with a security team, you already know the answer. Sending your codebase context to a third-party memory service is a non-starter. It does not matter how good the encryption is — if your proprietary code touches someone else's server, you have a compliance problem.

The EU AI Act adds another dimension. AI systems that process personal data or make consequential decisions now have transparency and data residency requirements. A local-first memory system sidesteps these concerns entirely. Your data stays on your hardware. Full stop.

Beyond compliance, there is a practical argument: latency. Cloud-based memory adds network round trips to every agent interaction. Local memory is instant. The retrieval pipeline completes in milliseconds, not seconds. Your agent feels faster because it is faster.

And there is no vendor lock-in. SuperLocalMemory stores data in SQLite and plain files. If you want to stop using it tomorrow, your data is right there in a standard format. No export process. No migration headache.
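Because the store is plain SQLite, "exporting" is just an ordinary query. The schema below is invented for the demo (the real table layout may differ), but the point stands: any stock `sqlite3` client can read your data with no vendor tooling.

```python
import sqlite3

# Demo store; swap ":memory:" for your real database path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE memories (id INTEGER PRIMARY KEY, "
             "content TEXT, importance REAL)")
conn.executemany(
    "INSERT INTO memories (content, importance) VALUES (?, ?)",
    [("Project uses Next.js 16", 0.9),
     ("Auth middleware bug fixed", 0.7)])
conn.commit()

# Export is a plain SELECT -- no migration process required.
rows = conn.execute("SELECT content, importance FROM memories "
                    "ORDER BY importance DESC").fetchall()
for content, importance in rows:
    print(f"{importance:.1f}  {content}")
```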

Get Started

Install it:

npm install superlocalmemory

Or if you prefer Python:

pip install superlocalmemory

It works with Claude Code, Cursor, VS Code, Windsurf, and any MCP-compatible client. Setup takes 60 seconds. The agent starts building memory from your very first session.

SuperLocalMemory is open source, with 5,117 monthly downloads on npm. Three published papers back the architecture. Seventeen IDE integrations and counting.

The code is on GitHub. The docs are at superlocalmemory.com. The papers are on arXiv.

Your AI agent does not have to forget. Give it a brain.

Tags: superlocalmemory, claude-code, ai-memory, local-first, mcp
