AI Strategy · March 19, 2026 · 7 min read

Markdown Memory: Why Your Agent Should Be Readable.

James Park

Head of Integrations

@jamesparkdev
#local-first #markdown #elyra #personal-ai #agent-design

Open the data folder of any popular AI assistant and you will find a vector database. Twenty thousand chunks of your past conversations, your documents, your calendar entries, embedded into 1024-dimensional space and stored in a binary blob. The agent reads this blob to remember you. You cannot read the blob.

This is the design that has won the agent category, and I think it is wrong for a personal assistant.

When we designed Elyra, the constraint was simple: every byte of the agent's memory should be human-readable. No vector databases. No embedding files. No opaque caches that exist for the model's convenience. If the agent remembers something about you, you can open a file and read it.

Here is why that constraint mattered, what it cost us, and what it changed.

The Problem With Opaque Memory

A vector database is a great engineering tool. It is also a black box. When the agent retrieves "context relevant to your question," it is returning chunks selected by a similarity score, typically cosine similarity, in a high-dimensional space. The agent does not know why those chunks were chosen. Neither do you. Neither does the engineer who shipped the agent.

This has three consequences for a personal assistant — none of them small.

You cannot audit what the agent thinks of you. Six months in, the agent has built up a model. It thinks you are the kind of person who likes early mornings, dislikes meetings on Mondays, and is slowly drifting toward burnout. You cannot read that model. You can only watch its outputs and guess.

You cannot correct it. The agent has decided that you are vegetarian, based on a one-line journal entry you wrote during a brief plant-based experiment three years ago. That belief is now baked into how it plans your meals. There is no obvious way to correct it. You cannot grep the embedding store for "vegetarian" and change the entry.

You cannot leave. The agent's model of you is locked in a format only the agent's vendor can read. If you want to switch products, you take your raw notes and start over. Six months of relationship with the agent — gone.

These are not exotic concerns. They are the same concerns that made open data formats matter for documents and code. We do not accept opaque storage for a Word doc anymore. We should not accept it for the agent that knows our recovery score and our medical follow-ups.

The Markdown-First Design

In Elyra, the agent's memory is a folder of mostly markdown files in ~/elyra/data/. There are roughly five files and folders that matter:

profile.md — who you are, what you do, what you are working on. Written by you, read by the agent on every turn.

goals.md — current goals, with last-edited dates. Updated when you tell the agent your goals have changed.

brief-history/ — every morning brief, ever, as a dated markdown file. You can read what the agent told you on March 4th and how it reasoned about your day.

tasks.db — SQLite. Yes, this is the one binary file. Tasks need a schema. But you can open it with the SQLite CLI and write SELECT * FROM tasks and read every entry. Open format, structured data.

drift-reports/ — weekly drift reviews. Markdown. You can read every weekly review the agent has ever written about you.
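Because tasks.db is ordinary SQLite, reading it needs nothing beyond a standard library. The sketch below runs the same SELECT * FROM tasks from Python. The function name dump_tasks is my own, and no column names are assumed, since the actual schema is not described here; the file is opened read-only so that inspecting it cannot change it.

```python
import sqlite3
from pathlib import Path


def dump_tasks(db_path: Path) -> list[dict]:
    """Return every row of the `tasks` table as a plain dict.

    Hypothetical helper: only the table name `tasks` comes from the article.
    """
    # Build a file: URI with mode=ro so the database is opened read-only.
    uri = db_path.expanduser().resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    # sqlite3.Row lets us turn each row into {column_name: value}.
    conn.row_factory = sqlite3.Row
    try:
        rows = conn.execute("SELECT * FROM tasks").fetchall()
        return [dict(row) for row in rows]
    finally:
        conn.close()
```

Calling dump_tasks(Path("~/elyra/data/tasks.db")) would return one dictionary per task, with whatever columns the file actually contains.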

That is essentially the whole memory surface. There is no embedding cache. There is no vector store. The "retrieval" the agent does is a simple keyword search across your markdown files, with the last few morning briefs added as context.
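To make "keyword search plus recent briefs" concrete, here is a minimal sketch of what such retrieval could look like. It is an illustration, not Elyra's actual code: the function names, the word-overlap scoring, and the assumption that brief filenames sort by date (for example 2026-03-04.md) are all mine.

```python
import re
from pathlib import Path


def tokenize(text: str) -> set[str]:
    """Lowercase word tokens; a deliberately crude notion of 'keyword'."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def keyword_search(data_dir: Path, query: str, limit: int = 5) -> list[tuple[Path, int, str]]:
    """Return (file, line number, line) hits ranked by words shared with the query."""
    terms = tokenize(query)
    hits = []
    for path in sorted(data_dir.rglob("*.md")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            score = len(terms & tokenize(line))
            if score:
                hits.append((score, path, lineno, line.strip()))
    # Highest overlap first; path and line number break ties deterministically.
    hits.sort(key=lambda h: (-h[0], str(h[1]), h[2]))
    return [(path, lineno, line) for _, path, lineno, line in hits[:limit]]


def recent_briefs(data_dir: Path, count: int = 3) -> list[Path]:
    """Newest `count` briefs, assuming filenames sort chronologically (hypothetical)."""
    briefs = sorted((data_dir / "brief-history").glob("*.md"))
    return briefs[-count:]
```

Every hit comes back as a file path and a line number, so the answer to "why did the agent bring this up" is a line you can open and read. The matching is purely lexical: a query for "exercise" will not find a line that only says "lifting".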

What This Costs

I want to be honest about the trade-offs.

Retrieval is less semantic. A vector database can find "the conversation we had about exercise" even if the file says "lifting" instead of "exercise." A keyword search cannot. Elyra's retrieval is good enough because the data volume is small — your personal archive is megabytes, not terabytes — but it would not work for a general-purpose document search engine.

The agent forgets things you might want it to remember. A vector database can surface a one-line entry from two years ago that is suddenly relevant. Elyra does not surface that entry unless something in your current context makes the agent search the right file. We have made peace with this. Most of what an agent should remember about you is in the last six weeks, not the last six years.

You have to write things down. The agent only knows what is in the markdown. If you never write down that you are training for a marathon, the agent does not know. Cloud agents claim to learn this passively from your behavior, and on close inspection that inference does not work very well.

These costs are real. They are also the price of being able to read what the agent knows about you, and that price is worth paying.

What This Changes

The downstream effects of markdown memory are easy to underestimate until you live with them.

Editing the agent feels like editing yourself. When the agent says something wrong about your week, you open profile.md in your text editor, fix the line, and save. The next morning's brief is corrected. There is no "training," no "fine-tuning," no opaque model update. You wrote a different sentence, the agent reads the different sentence, the behavior changes.
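The reason an edit takes effect without any retraining is that the files are re-read whenever context is assembled. A minimal sketch of that idea follows, assuming a hypothetical build_context function and a made-up preamble format; how Elyra actually assembles its prompt is not shown in this post.

```python
from pathlib import Path


def build_context(data_dir: Path, names: tuple[str, ...] = ("profile.md", "goals.md")) -> str:
    """Concatenate memory files into one prompt preamble, read fresh from disk.

    Nothing is cached between calls, so a saved edit shows up on the next call.
    """
    sections = []
    for name in names:
        path = data_dir / name
        if path.exists():
            body = path.read_text(encoding="utf-8").strip()
            # Label each section with its filename (an illustrative format).
            sections.append(f"## {name}\n{body}")
    return "\n\n".join(sections)
```

Change "Diet: vegetarian" to "Diet: no restrictions" in profile.md, save, and the next call to build_context returns the corrected line. There is no separate copy of the old belief left anywhere to go stale.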

Version control is free. I keep my Elyra data folder in a private Git repo. Every morning brief, every weekly drift review, every change to my goals — committed. I can git diff last month's profile against this month's and see exactly how my own self-description has changed. This is the most useful journaling tool I have used.
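In practice git diff does this work for you. To show what that comparison looks like on plain-text memory, here is a small sketch that diffs two snapshots of a profile with Python's standard difflib module. The function name and the file labels are illustrative assumptions.

```python
import difflib


def profile_diff(old: str, new: str) -> str:
    """Unified diff between two snapshots of profile.md, as one string."""
    lines = difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile="profile.md (last month)",
        tofile="profile.md (this month)",
        lineterm="",  # inputs have no trailing newlines, so add none to headers
    )
    return "\n".join(lines)
```

Given an old snapshot containing "Diet: vegetarian" and a new one containing "Diet: no restrictions", the output marks the first line with a leading minus and the second with a leading plus. A diff like this only makes sense because the memory is text; there is no equivalent readable view of two embedding stores.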

Migration is trivial. If we ship a new version of Elyra that breaks the schema, you do not lose your archive. The markdown stays markdown. Your morning briefs stay readable. The product can be replaced; the data outlives it.

Trust is built differently. When a friend asks "how do you trust the agent with your data," I open the data folder. They see files. They read a few. The trust conversation is short.

The Right Answer Depends On The Use Case

I do not think markdown memory is right for every agent.

If you are building a sales agent that needs to recall every conversation across thousands of customers, a vector database is the right tool. The data is structured, the volume is large, and no one is going to grep it manually.

If you are building a coding agent that needs to retrieve from a million-line codebase, you want embeddings. The data is structured, the volume is large, and the agent's job is to find the relevant function.

If you are building a personal assistant that holds your goals, your morning routine, your recovery, and your relationships — the data volume is small, the queries are linguistic, and the user has every right to read what the agent thinks of them. Markdown wins.

The defining question is not "is this technically optimal." It is "should the user be able to read it." For a personal assistant, the answer should always be yes.

Elyra's memory is markdown all the way down.
