Workflow Automation · March 26, 2026 · 6 min read

Bootstrap Twenty Repos in One Run. Here's How.



Marcus Rivera

Principal Engineer

@marcusbuilds
#claude-code #documentation #kavrynos #developer-onboarding #engineering

Every engineering org has the same problem in a different shape. The senior engineers know where everything lives. The junior engineers do not. The docs are six months out of date. The README in the most important repo is still the auto-generated one from the day it was created. Onboarding a new engineer takes three weeks of pair-coding and Slack archaeology — half of which is rediscovering what the senior engineer already knew but had not written down.

Documentation is the work everyone agrees should happen and nobody has time to do. So it does not happen. Then a new engineer joins, hits the doc gap, and the team takes a week-long velocity hit while a senior engineer plays tour guide.

The interesting development in 2026 is that this is now a five-minute problem if you point Claude Code at the right inputs. Here is the workflow we use, and what it produces.

The Repo Scanner Loop

KavrynOS ships a feature called the Repo Scanner. The mechanic is simple: point it at the parent directory that contains your projects, and it runs Claude Code on every repo it finds, generating a standard set of documentation files.

For each repo, the scanner produces:

  • CLAUDE.md — a high-density context file optimized for AI agents. Stack, key patterns, where things live, what to read first.
  • ARCHITECTURE.md — human-readable architecture overview. The five-paragraph version a new engineer would want.
  • TESTING.md — how tests are organized, how to run them locally, what coverage looks like.
  • API reference — extracted from the code, not from a separate spec doc that drifts.
  • Database schema docs — generated from migrations and ORM models, with notes on the meaningful tables.
  • Business logic notes — what the code does, in domain terms, not "this calls that."
  • .cursor/rules — repo-specific rules for Cursor users.

Twenty repos. Ten to fifteen minutes of wall time. The output is a working documentation foundation that is, at minimum, more current than what was there before.
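Mechanically, there is no magic in the loop. Here is a minimal sketch, assuming Claude Code's headless print mode (`claude -p`) and a flat projects directory; the prompt and paths are placeholders, not KavrynOS's actual configuration:

```python
import subprocess
from pathlib import Path

# A placeholder prompt; the real scanner generates the full file set
# (CLAUDE.md, ARCHITECTURE.md, TESTING.md, and the rest).
DOC_PROMPT = ("Generate CLAUDE.md and ARCHITECTURE.md for this repo. "
              "Be accurate. Do not write marketing copy.")

def scan(parent: Path) -> None:
    # Treat every child directory that contains .git as a repo.
    for repo in sorted(p for p in parent.iterdir() if (p / ".git").is_dir()):
        print(f"bootstrapping {repo.name}...")
        subprocess.run(["claude", "-p", DOC_PROMPT], cwd=repo, check=True)

scan(Path.home() / "projects")  # the parent directory of your repos
```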

What "Bootstrap" Actually Means

The scanner is not generating poetry. The output is workmanlike documentation — accurate, structured, readable. We deliberately tuned the prompts to avoid two common failure modes:

Marketing-flavored docs. "This codebase implements a robust, scalable solution for processing payments at world-class velocity." The prompt explicitly says: do not write like this. Describe what is, not what we think we are.

Cargo-cult sections. Every auto-generated docs tool has a "Contributing" section that says exactly the same thing across every repo. The prompt suppresses these unless the repo actually has contribution instructions worth reading.
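For illustration, the guardrails read something like this. This is a hypothetical excerpt written in the same spirit, not the shipped prompt:

```
Describe what the code does today, in plain declarative sentences.
Do not use words like "robust", "scalable", "world-class", or "blazing".
Do not emit a Contributing, License, or Code of Conduct section unless the
repo already contains real instructions for one.
If you are not sure a claim is still current, leave it out.
```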

What you get is a doc that reads like a senior engineer wrote it in twenty minutes — because that is more or less what happened. Claude has read every file in the repo. It knows the patterns. It is summarizing what it sees.

What This Replaces

I have watched three teams ship the same Confluence project: "documentation week." Everyone blocks Friday for a quarter to update docs. The first week, half the team misses because of incidents. The second week, attendance is worse. By the third week, the project is dead and the docs are worse than they were before — because the partial updates introduced inconsistencies.

The Repo Scanner replaces this ritual. Not because the AI is smarter than your senior engineers. Because the AI is available in the moment when documentation costs the least to write — right after you have just read the code.

The right time to write ARCHITECTURE.md for a service is the day after you have understood it. By next Friday, you do not remember the architecture cleanly enough to document it. The AI does not have this problem. It re-reads the code each time and writes the doc fresh.

Selective Re-Bootstrap

The feature that matters more than the initial run is selective re-bootstrap.

When you ship a major refactor, you do not want to re-run docs across twenty repos. You want to re-run docs against the one repo you changed. The scanner UI lets you select which repos to re-run, and which sections of each repo's docs to regenerate. The unchanged repos keep their existing docs. The changed repo gets fresh ARCHITECTURE.md and CLAUDE.md. Everything else stays the same.
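Under the hood, the selective pass is just the bootstrap loop with a narrower target. A sketch, with the section names and repo path as assumptions:

```python
import subprocess
from pathlib import Path

def rebootstrap(repos: list[Path], sections: list[str]) -> None:
    # Regenerate only the chosen doc files in the chosen repos;
    # every other repo's docs are left untouched.
    prompt = (
        f"Regenerate {', '.join(sections)} for this repo. "
        "Re-read the code first rather than copying the existing docs. "
        "Be accurate. Do not write marketing copy."
    )
    for repo in repos:
        subprocess.run(["claude", "-p", prompt], cwd=repo, check=True)

# After a refactor that touched only the (hypothetical) payments service:
rebootstrap([Path.home() / "projects" / "payments"],
            ["ARCHITECTURE.md", "CLAUDE.md"])
```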

This is the loop that finally makes "living documentation" a real thing. The docs are auto-generated, so they are easy to keep current. They live in the repo, so they version with the code. When the code changes meaningfully, you push a button and the docs catch up.

I am aware this is a sentence I have read in vendor pitches for a decade. The difference now is that the underlying generator is actually good enough.

The Knowledge Base That Comes After

Once the per-repo docs exist, KavrynOS uses them to build a product knowledge base — a single living KB file per product, generated by Claude Code across all the repos and databases that make up that product.

This is the file the Ask AI chat reads from. It is the file the MCP triage agent uses to understand which repo a ticket should land in. The per-repo docs are the foundation. The product KB is the layer above that ties them together.

The order matters. You cannot generate a useful product KB across repos that have no per-repo docs. The bootstrap is the prerequisite for everything else.
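One plausible shape for that second layer, assuming the per-repo CLAUDE.md files from the bootstrap already exist (the prompt and file layout are illustrative):

```python
import subprocess
from pathlib import Path

def build_product_kb(repos: list[Path], out: Path) -> None:
    # Stitch each repo's agent-facing context file into one prompt...
    context = "\n\n".join(
        f"## {repo.name}\n{(repo / 'CLAUDE.md').read_text()}"
        for repo in repos if (repo / "CLAUDE.md").exists()
    )
    prompt = (
        "These are the per-repo context files for one product. Write a "
        "single product knowledge base: how the repos fit together, which "
        "repo owns which domain, and where a ticket for each area should "
        "land.\n\n" + context
    )
    # ...and let Claude Code write the KB to stdout in print mode.
    result = subprocess.run(["claude", "-p", prompt],
                            capture_output=True, text=True, check=True)
    out.write_text(result.stdout)
```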

What It Doesn't Do

I want to flag the failure modes honestly.

It does not capture intent. The docs describe what the code does and how it is structured. They do not capture why the code looks the way it does — the trade-off you made in week three because the database team would not give you a foreign key. That context lives in your head, in Slack threads, or in the original PR description. It is not in the repo. The AI cannot extract what was never written down.

It does not improve a confused codebase. If your service is a tangled mess, the doc that describes it is also going to read like a tangled mess. The doc reflects the code. The fix is to fix the code, not to write better docs about it.

The first pass needs editing. About one in five generated sections has a small inaccuracy — usually a stale claim about how the code "currently" handles something, when it actually handles it a different way after a recent refactor. You should read the output before you trust it. After the first pass, the team usually edits each repo's docs by hand for fifteen minutes. The total time is still a fraction of doing it from scratch.

Try The Five-Repo Test

If you want to evaluate this for your own team without buying anything, the five-repo test is a useful starter.

Pick the five repos a new engineer would need to understand in their first month. Run Claude Code with a single instruction: "Generate ARCHITECTURE.md and CLAUDE.md for this repo. Be accurate. Do not write marketing copy." Read the output. Edit it.
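In script form, the whole test is a dozen lines. The repo names here are placeholders for your own five:

```python
import subprocess
from pathlib import Path

FIVE_REPOS = ["api-gateway", "payments", "auth", "web-app", "billing-jobs"]
INSTRUCTION = ("Generate ARCHITECTURE.md and CLAUDE.md for this repo. "
               "Be accurate. Do not write marketing copy.")

for name in FIVE_REPOS:
    # Run Claude Code headlessly inside each repo, one at a time.
    subprocess.run(["claude", "-p", INSTRUCTION],
                   cwd=Path.home() / "projects" / name, check=True)
```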

If the result is dramatically better than what you have today, you do not need to be sold on this category. You will know.

KavrynOS automates this loop across every repo at once.
