Single Go binary · runs on cron · self-hosted

Your knowledge base, written by your tools.

scribe is the compiled, LLM-written knowledge base — the “LLM Wiki” pattern as a single-binary CLI: plain markdown in git, no vector DB, no RAG. It’s memory your AI agents read before they decide, not a second brain you maintain and never reopen — turning your git repos, Claude Code & Codex sessions, and self-sent URLs into a curated, semantically searchable wiki. Cross-project, cron-driven, and capable of running 100% locally on Ollama for zero API spend.

7,472 documents
maintainer's KB · zero typed by hand
$0/sync
verified on 100% Ollama path
~70s
weekly Dream cycle, gemma3:12b
How it works

Three stages, one pipeline.

scribe mines four input streams, filters out the noise before any LLM touches it, then fans dense sources into entity-first wiki pages. Every step runs on cron — set it up once and forget it.

1 Capture

Four streams, all on cron.

Git repos, Claude Code & Codex sessions, URLs you text yourself, and drop files from other projects. scribe auto-discovers every codebase you've ever opened in either CLI and keeps the manifest fresh.

git · claude code · codex · imessage · drop files
2 Triage & absorb

BM25 first, LLM second.

Keyword-density scoring rejects boilerplate sessions before any LLM call — so cheap sessions cost nothing. Survivors go through a two-pass absorb: pass 1 grounds atomic facts, pass 2 fans dense sources into multiple entity-first wiki pages.

BM25 triage · two-pass · entity fan-out
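The triage idea can be sketched in Go. This is a minimal illustration, not scribe's scoring code: the keyword list, the cutoff, and the function names are all assumptions made for the example. It scores each session against a few "signal" terms with the standard Okapi BM25 formula and only forwards sessions above the cutoff.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// tokenize lowercases and splits on whitespace; a real tokenizer would also
// strip punctuation.
func tokenize(s string) []string {
	return strings.Fields(strings.ToLower(s))
}

// bm25 returns one Okapi BM25 score per document for the given query terms,
// using the common defaults k1 = 1.2 and b = 0.75.
func bm25(docs [][]string, query []string) []float64 {
	const k1, b = 1.2, 0.75
	if len(docs) == 0 {
		return nil
	}
	n := float64(len(docs))
	df := map[string]int{} // number of documents containing each term
	total := 0
	for _, d := range docs {
		total += len(d)
		seen := map[string]bool{}
		for _, t := range d {
			if !seen[t] {
				seen[t] = true
				df[t]++
			}
		}
	}
	avgLen := float64(total) / n
	scores := make([]float64, len(docs))
	for i, d := range docs {
		tf := map[string]int{}
		for _, t := range d {
			tf[t]++
		}
		for _, q := range query {
			f := float64(tf[q])
			if f == 0 {
				continue
			}
			idf := math.Log(1 + (n-float64(df[q])+0.5)/(float64(df[q])+0.5))
			norm := f + k1*(1-b+b*float64(len(d))/avgLen)
			scores[i] += idf * f * (k1 + 1) / norm
		}
	}
	return scores
}

func main() {
	sessions := []string{
		"ls cd pwd ls git status",
		"decided to add an idempotency key because the retry caused a duplicate charge",
	}
	signal := []string{"decided", "because", "idempotency"} // hypothetical keyword list
	const threshold = 1.0                                    // hypothetical cutoff

	docs := make([][]string, len(sessions))
	for i, s := range sessions {
		docs[i] = tokenize(s)
	}
	for i, score := range bm25(docs, signal) {
		fmt.Printf("session %d: score=%.2f send_to_llm=%v\n", i, score, score >= threshold)
	}
}
```

Because the score is pure arithmetic over token counts, a session with no signal terms scores exactly zero and never reaches a model.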
3 Compile & index

A typed graph of plain markdown.

Auto-generated wikilinks, backlinks JSON, and retrieval-context paragraphs spliced into every article so embeddings catch implicit entities. qmd reindexes for semantic search — from any terminal, in any directory, or from inside Claude Code via MCP.

qmd · wikilinks · typed edges
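A rough Go sketch of how backlinks could be derived from wikilinks. The `[[target]]` and `[[target|alias]]` syntax is the usual wiki convention; the output shape (target name mapped to a sorted list of linking articles) is an assumption for illustration, not scribe's actual backlinks JSON schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// wikilink matches [[target]], [[target|alias]] and [[target#heading]],
// capturing only the target.
var wikilink = regexp.MustCompile(`\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]`)

// backlinks inverts the link graph: for every link target it lists the
// articles that point at it, sorted and de-duplicated.
func backlinks(articles map[string]string) map[string][]string {
	out := map[string][]string{}
	for name, body := range articles {
		seen := map[string]bool{}
		for _, m := range wikilink.FindAllStringSubmatch(body, -1) {
			target := strings.TrimSpace(m[1])
			if target == name || seen[target] {
				continue // skip self-links and repeats
			}
			seen[target] = true
			out[target] = append(out[target], name)
		}
	}
	for target := range out {
		sort.Strings(out[target])
	}
	return out
}

func main() {
	// Hypothetical article bodies keyed by article name.
	articles := map[string]string{
		"oban-idempotency": "Uses [[oban]] unique jobs; see [[fly-io-double-fire|the Fly.io case]].",
		"fly-io-double-fire": "Cron fired twice on two nodes. Fix lives in [[oban-idempotency]].",
	}
	out, err := json.MarshalIndent(backlinks(articles), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```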
Features

What makes scribe different.

scribe isn't RAG, it isn't Obsidian, and it isn't another LLM-on-every-session burner. It sits between them: watches your work, writes the notes for you, and compounds knowledge across every project you touch.

Context-aware agents

scribe init writes a handshake block into both ~/.claude/CLAUDE.md and ~/.codex/AGENTS.md — every session in every project queries your KB before recommending a library or proposing an architecture.

Runs itself on cron

Hourly auto-commits. Every 2 hours: project extraction. 3×/day: session mining. Every 30 minutes: queued URLs. Every 4 hours: self-iMessaged links. Sundays at 02:00: the weekly Dream consolidation cycle.

Compounds across projects

One cross-project KB, not siloed per repo. Solve the oban idempotency bug in project A on Monday; the agent finds your fix on Friday when the same shape comes up in project B.

100% Ollama-capable

Per-project extraction, two-pass absorb, Dream cycle, assess, deep, session-mine, relations migrate — every LLM op runs end-to-end against a local Ollama server. One line in scribe.yaml flips the whole pipeline. Zero API spend.

Plain markdown you own

File over app — the corpus has to outlive the pipeline. A git repo of plain markdown with YAML frontmatter. Push to your own GitHub, Gitea, or Forgejo; open in Obsidian, VS Code, vim, or mdbook. No SaaS, no vendor lock-in.

Typed graph, not just tags

Articles connect via typed edges: supersedes, contradicts, specializes, derived_from, extends. scribe relations migrate classifies existing related: links into the typed schema with an LLM.
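To see why a typed edge is more useful than an untyped related: link, here is a small Go sketch. The five edge names come from the list above; the `Edge` struct, `ParseEdgeType`, and `latest` are hypothetical names for illustration and may not match scribe's internal types.

```go
package main

import "fmt"

// EdgeType is one of the five typed relations named in the text.
type EdgeType string

const (
	Supersedes  EdgeType = "supersedes"
	Contradicts EdgeType = "contradicts"
	Specializes EdgeType = "specializes"
	DerivedFrom EdgeType = "derived_from"
	Extends     EdgeType = "extends"
)

// Edge reads as "From <Type> To", e.g. "auth-v2 supersedes auth-v1".
type Edge struct {
	From, To string
	Type     EdgeType
}

// ParseEdgeType rejects anything outside the typed schema.
func ParseEdgeType(s string) (EdgeType, error) {
	switch t := EdgeType(s); t {
	case Supersedes, Contradicts, Specializes, DerivedFrom, Extends:
		return t, nil
	}
	return "", fmt.Errorf("unknown edge type %q", s)
}

// latest follows supersedes edges from an article to the newest version.
// A visited set guards against accidental cycles.
func latest(edges []Edge, article string) string {
	supersededBy := map[string]string{}
	for _, e := range edges {
		if e.Type == Supersedes {
			supersededBy[e.To] = e.From
		}
	}
	seen := map[string]bool{}
	for {
		next, ok := supersededBy[article]
		if !ok || seen[next] {
			return article
		}
		seen[article] = true
		article = next
	}
}

func main() {
	edges := []Edge{
		{From: "auth-v2", To: "auth-v1", Type: Supersedes},
		{From: "auth-v3", To: "auth-v2", Type: Supersedes},
		{From: "auth-sso", To: "auth-v1", Type: Extends},
	}
	fmt.Println(latest(edges, "auth-v1"))
}
```

With plain tags, an agent would see three related articles; with a supersedes chain it can jump straight to the current one.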

Autonomous loop

Five things happen on cron. You set it up once.

After scribe init and scribe cron install, the loop closes by itself. New work flows in, the KB grows, and the next Claude Code or Codex session — in any project — queries what scribe just wrote.

01

scribe finds every project you've already touched

Claude Code + Codex

A single walk over ~/.claude/projects/* and ~/.codex/sessions/* finds every repo you've opened in either agent. Each one becomes an entry in the manifest with a stable name, last-seen timestamp, and source provenance. No config, no manual list — if you've coded there, scribe sees it.

cmd/scribe/sync.go: discover() + cmd/scribe/codex.go: discoverCodex() → ~/.claude/projects/* + ~/.codex/sessions/* → manifest.Projects
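The discovery walk might look roughly like the Go sketch below. It is a simplification under stated assumptions: each matched directory is treated as one project, the directory name is used as the project name, and the `Project`, `scan`, and `merge` names are invented for the example rather than taken from discover() or discoverCodex().

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"time"
)

// Project is a hypothetical manifest entry: a stable name, when it was last
// seen, and which agent directories it was found in.
type Project struct {
	Name     string
	LastSeen time.Time
	Sources  []string
}

// scan lists the directories matching pattern and tags them with a source.
func scan(pattern, source string) ([]Project, error) {
	matches, err := filepath.Glob(pattern)
	if err != nil {
		return nil, err
	}
	var out []Project
	for _, m := range matches {
		info, err := os.Stat(m)
		if err != nil || !info.IsDir() {
			continue
		}
		out = append(out, Project{
			Name:     filepath.Base(m),
			LastSeen: info.ModTime(),
			Sources:  []string{source},
		})
	}
	return out, nil
}

// merge combines several scans into one list keyed by name, keeping the
// newest timestamp and accumulating provenance.
func merge(lists ...[]Project) []Project {
	byName := map[string]*Project{}
	for _, list := range lists {
		for _, p := range list {
			cur, ok := byName[p.Name]
			if !ok {
				cp := p
				cp.Sources = append([]string(nil), p.Sources...)
				byName[p.Name] = &cp
				continue
			}
			if p.LastSeen.After(cur.LastSeen) {
				cur.LastSeen = p.LastSeen
			}
			cur.Sources = append(cur.Sources, p.Sources...)
		}
	}
	out := make([]Project, 0, len(byName))
	for _, p := range byName {
		out = append(out, *p)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		panic(err)
	}
	claude, _ := scan(filepath.Join(home, ".claude", "projects", "*"), "claude")
	codex, _ := scan(filepath.Join(home, ".codex", "sessions", "*"), "codex")
	for _, p := range merge(claude, codex) {
		fmt.Println(p.Name, p.LastSeen.Format(time.RFC3339), p.Sources)
	}
}
```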
02

The agent handshake

~/.claude/CLAUDE.md + ~/.codex/AGENTS.md

scribe writes a maintained block into ~/.claude/CLAUDE.md and ~/.codex/AGENTS.md. Every Claude Code and Codex session, in every repo, picks up the same instructions: query the KB first, and drop reusable findings as files. The handshake is idempotent — re-run init and only that block updates.

cmd/scribe/init.go: installClaudeMD() + installCodexMD() → templates/claude-md-kb.md + templates/codex-agents-md.md → marker-fenced block in both files
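An idempotent, marker-fenced update can be sketched in a few lines of Go. The marker strings and the `upsertBlock` name are assumptions for illustration; the markers scribe actually writes are not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical markers; the real ones may differ.
const (
	beginMarker = "<!-- scribe:begin -->"
	endMarker   = "<!-- scribe:end -->"
)

// upsertBlock replaces the fenced block if it exists, otherwise appends it.
// Text outside the markers is never touched, so running it twice with the
// same block yields the same document.
func upsertBlock(doc, block string) string {
	fenced := beginMarker + "\n" + strings.TrimSpace(block) + "\n" + endMarker
	if i := strings.Index(doc, beginMarker); i >= 0 {
		if j := strings.Index(doc[i:], endMarker); j >= 0 {
			return doc[:i] + fenced + doc[i+j+len(endMarker):]
		}
	}
	if doc != "" {
		if !strings.HasSuffix(doc, "\n") {
			doc += "\n"
		}
		doc += "\n" // blank line between user content and the block
	}
	return doc + fenced + "\n"
}

func main() {
	doc := "# My own instructions\n"
	doc = upsertBlock(doc, "Query the KB before recommending a library.")
	doc = upsertBlock(doc, "Query the KB before recommending a library.") // no-op
	fmt.Print(doc)
}
```

Fencing the managed text is what lets a re-run of init coexist with instructions you wrote by hand in the same file.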
03

Cron sweeps move drop files into the absorb pipeline

2h / 4h / 30min

Three cron entries do the boring work: sync for session mining and per-project extraction, capture for queued URLs and self-sent iMessage links, and sync --sessions on its own cadence. Drop files written by an agent in any repo are picked up on the next tick and flow through density triage → contextualize → atomic facts → pass-2 absorb.

cmd/scribe/cron.go: CronInstallCmd.Run() → launchd / systemd / crontab entries → scribe sync + scribe capture
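The last two stages of that flow (atomic facts, then pass-2 fan-out) can be pictured as a data-shape sketch in Go. There is no LLM here: the facts are hard-coded where pass 1 would produce them, and the `Fact` struct, `slug`, and `fanOut` are hypothetical names. The point is only that one dense source ends up as several entity-first pages.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Fact is a hypothetical pass-1 output: one grounded statement about one entity.
type Fact struct {
	Entity, Text, Source string
}

// slug turns an entity name into a file-name-safe key, e.g. "Fly.io" -> "fly-io".
func slug(s string) string {
	var b strings.Builder
	dash := false
	for _, r := range strings.ToLower(s) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
			dash = false
		} else if !dash && b.Len() > 0 {
			b.WriteByte('-')
			dash = true
		}
	}
	return strings.TrimSuffix(b.String(), "-")
}

// fanOut groups facts by entity and renders one markdown page per entity.
func fanOut(facts []Fact) map[string]string {
	grouped := map[string][]Fact{}
	titles := map[string]string{}
	for _, f := range facts {
		key := slug(f.Entity)
		grouped[key] = append(grouped[key], f)
		if _, ok := titles[key]; !ok {
			titles[key] = f.Entity
		}
	}
	pages := map[string]string{}
	for key, fs := range grouped {
		var b strings.Builder
		fmt.Fprintf(&b, "# %s\n\n", titles[key])
		for _, f := range fs {
			fmt.Fprintf(&b, "- %s (source: %s)\n", f.Text, f.Source)
		}
		pages[key+".md"] = b.String()
	}
	return pages
}

func main() {
	facts := []Fact{
		{"Oban", "Unique jobs prevent duplicate enqueues.", "session-a"},
		{"Fly.io", "Cron fires once per node unless leader-elected.", "session-a"},
		{"Oban", "External-call workers need an idempotency key.", "session-b"},
	}
	pages := fanOut(facts)
	names := make([]string, 0, len(pages))
	for name := range pages {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("--- %s ---\n%s\n", name, pages[name])
	}
}
```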
04

Auto-publish to your private git remote

hourly

A separate cron entry runs scribe commit every hour — stages everything the absorb pipeline produced, writes a structured commit message, and pushes to your private remote (GitHub, Gitea, Forgejo, anywhere). On non-fast-forward it runs git pull --rebase and retries once; force-push is never attempted. Your KB is version-controlled, diffable, and recoverable. No web UI; the source of truth is markdown in git.

cmd/scribe/commit.go: CommitCmd.Run() + cmd/scribe/gitops.go → git add · git commit · git push origin main
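The push-then-rebase-once behaviour can be sketched in Go as below. It is a simplified illustration, not the contents of commit.go or gitops.go: the `runner` indirection, `publish`, and `gitIn` are hypothetical, and this sketch retries after any push failure rather than detecting non-fast-forward specifically. The git subcommands themselves (add, commit, push, pull --rebase) are standard.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// runner executes one git subcommand; injecting it keeps publish testable.
type runner func(args ...string) error

// errRejected stands in for a rejected push in the demo below.
var errRejected = errors.New("push rejected")

// gitIn returns a runner that shells out to git inside dir.
func gitIn(dir string) runner {
	return func(args ...string) error {
		cmd := exec.Command("git", args...)
		cmd.Dir = dir
		return cmd.Run()
	}
}

// publish stages, commits, and pushes. If the push fails it rebases onto the
// remote and retries exactly once. It never force-pushes.
// Simplification: a real implementation would skip the commit when nothing
// is staged and would only rebase on a non-fast-forward rejection.
func publish(git runner, message string) error {
	if err := git("add", "-A"); err != nil {
		return fmt.Errorf("add: %w", err)
	}
	if err := git("commit", "-m", message); err != nil {
		return fmt.Errorf("commit: %w", err)
	}
	if err := git("push", "origin", "main"); err == nil {
		return nil
	}
	if err := git("pull", "--rebase", "origin", "main"); err != nil {
		return fmt.Errorf("rebase: %w", err)
	}
	return git("push", "origin", "main")
}

func main() {
	// A fake runner that rejects the first push, so the demo has no side effects.
	pushes := 0
	fake := func(args ...string) error {
		fmt.Println("git", args)
		if args[0] == "push" {
			pushes++
			if pushes == 1 {
				return errRejected
			}
		}
		return nil
	}
	fmt.Println("result:", publish(fake, "scribe: hourly sync"))
	// Against a real repository: publish(gitIn("/path/to/kb"), "scribe: hourly sync")
}
```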
05

Sunday 02:00 — the Dream cycle

100% Ollama

Once a week the Dream cycle wakes up. It looks at what's grown, prunes stubs that never got fleshed out, merges near-duplicates, breaks down articles that got too dense, and surfaces contradictions for review. Runs entirely on local Ollama — no token spend, no third party touches your notes. The KB stays small enough to fit in an agent's context window for years.

cmd/scribe/cron.go: weekly entry @ Sun 02:00 → cmd/scribe/dream.go: DreamCmd.Run() → prompts/dream-ollama.md on gemma3:12b

Every absorb tick reindexes qmd, so the next Claude Code or Codex session — in any repo on this machine — finds what scribe just wrote. The loop closes itself, in the background, on a schedule you forget about.

How it compares

Not RAG. Not Obsidian. Not another LLM-on-every-session burner.

scribe is a compiled knowledge base, not a vector database: it auto-writes a curated markdown wiki your agents query with BM25, so there's nothing to embed and nothing to host. The "second brain" debate is about notes you read. scribe isn't that — it's memory your agent reads before it decides: the reasons behind a choice, not summaries you'll never reopen. It sits between manual-notes tools (Obsidian, Notion) and unbounded LLM-on-every-query approaches (vanilla RAG, claude-memory-compiler) — a curated wiki on top of raw sources kept verbatim, small enough for an agent to read whole, and cheap to query because most lookups are plain-text matches, not vector guesses.

| Capability | scribe | RAG (LangChain · LlamaIndex) | Code Insights (@code-insights/cli) | AnythingLLM | Obsidian |
|---|---|---|---|---|---|
| Auto-written from your dev work | Yes | You index docs | Yes | You upload docs | You type notes |
| Sources captured | Sessions + git + URLs | Docs you feed | Coding sessions only | Docs you upload | Notes you write |
| Output is portable markdown in git | Yes | Vector chunks | SQLite dashboard | Vector store | Yes |
| Vector DB required? | Not needed | Required | Not needed | Required | Not needed |
| Full-text (BM25) search | qmd · FTS5 | Vector recall only | Dashboard analytics | Vector chat | Yes |
| Agents read it back before deciding | CLAUDE.md / AGENTS.md | If you wire it | Human dashboard | You chat with it | No |
| Local-first, no API key (Ollama) | 100% Ollama | Local embeddings | Ollama option | Local LLM + DB | AI add-ons need keys |

scribe is the only one of these that auto-writes a portable, git-versioned markdown wiki your agents read before they decide — with no vector database. Unlike AnythingLLM, scribe stores plain markdown in git and needs no vector database or running server. Snapshot 2026-06-08; tool capabilities change, so check each project's docs before deciding.

| Tool | Session mining | Cron-driven | Density pre-filter | Two-pass absorb | Multi-project | Local-mode |
|---|---|---|---|---|---|---|
| scribe (curated wiki + raw sources, in git) | Claude + Codex | launchd / systemd | BM25 | atomic facts → pass-2 | manifest-tracked | 100% Ollama |
| claude-memory-compiler (Anthropic-only, single project) | Claude only · $115 / 20min · issue #3 | manual | none | single-pass | single repo | API only |
| nvk/llm-wiki (LLM-built wiki, no mining) | user-fed | manual | none | single-pass | single repo | Ollama possible |
| basic-memory (MCP memory server) | issue #669 since Mar | request-driven | none | single-pass | per-MCP-client | local embeddings |
| RAG (LangChain, LlamaIndex) (retrieve-then-prompt) | retrieves chunks | on-query | vector recall | no absorb | per-index | local embeddings |
| Obsidian / Notion (manual notes tool) | you type it | manual | tag-based | no absorb | vault / workspace | Obsidian = local, Notion = cloud |

Snapshot 2026-05-18. Everything moves; check the source repos before deciding. Verdicts are pulled from public READMEs, issue trackers, and the scribe maintainer's tool evaluations.

scribe is the wiki the LLM writes for you, sitting on top of raw sources kept verbatim. RAG retrieves chunks; scribe gives you a curated, named-entity wiki you can also grep. — project README
CLI

Set it up once. Forget it.

Two commands to install. One line in a YAML to go fully local. One query from any terminal.

# brew
brew tap oliver-kriska/scribe
brew install oliver-kriska/scribe/scribe

# one-time setup
scribe init --path ~/my-kb
cd ~/my-kb
scribe cron install
scribe doctor

# or via shell installer
curl -fsSL https://raw.githubusercontent.com/oliver-kriska/scribe/main/install.sh | bash

# scribe.yaml: flip the whole pipeline onto local Ollama
llm:
  provider: ollama
  model: gemma3:12b            # cross-op default
  ollama_url: http://localhost:11434
  num_ctx: 16384               # keeps dense-article tails intact

# per-op overrides still work, e.g.
ops:
  contextualize:
    model: qwen3:30b-a3b       # quality-critical (MoE, fast)
  pass2:
    model: qwen3:30b-a3b       # highest-quality writes

# from any terminal, any directory
qmd query "how did I solve the oban idempotency bug last quarter"

# exact-term search
qmd search "unique_constraint Multi"

# inside Claude Code — the MCP tool does the same query
mcp__plugin_qmd_qmd__query

# validate setup, cron, git remote, Ollama models
scribe doctor

# validate just the local-mode pipeline
scribe doctor --section localmode

# inspect what the last sync did
tail -n 5 "output/runs/$(date +%Y-%m-%d).jsonl"
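If you would rather inspect the run log from Go than from the shell, a minimal sketch follows. The only things taken from the page are the output/runs/YYYY-MM-DD.jsonl path and the JSONL format; the record fields are unknown here, so each line is decoded into a generic map, and `lastRecords` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"strings"
	"time"
)

// lastRecords decodes the final n JSON lines of a JSONL document.
// The schema is not assumed: every record becomes a generic map.
func lastRecords(data string, n int) ([]map[string]any, error) {
	if n <= 0 {
		return nil, nil
	}
	lines := strings.Split(strings.TrimSpace(data), "\n")
	if len(lines) > n {
		lines = lines[len(lines)-n:]
	}
	out := make([]map[string]any, 0, len(lines))
	for _, line := range lines {
		if strings.TrimSpace(line) == "" {
			continue
		}
		var rec map[string]any
		if err := json.Unmarshal([]byte(line), &rec); err != nil {
			return nil, fmt.Errorf("bad JSONL line: %w", err)
		}
		out = append(out, rec)
	}
	return out, nil
}

func main() {
	// Run from the KB root, like the shell one-liner.
	path := "output/runs/" + time.Now().Format("2006-01-02") + ".jsonl"
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("no run log for today:", err)
		return
	}
	recs, err := lastRecords(string(data), 5)
	if err != nil {
		fmt.Println(err)
		return
	}
	for _, rec := range recs {
		fmt.Println(rec)
	}
}
```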
CLI surface

38 subcommands. One binary.

Here are the ones you'll actually type. Everything else is discoverable through scribe --help.

~/kb — scribe
$ scribe init                        # bootstrap a KB, wire the agent handshake
$ scribe sync                        # discover → extract → absorb → reindex
$ scribe sync --sessions             # mine Claude Code + Codex transcripts
$ scribe sync --estimate             # token estimate, zero LLM calls
$ scribe doctor                      # validate setup, cron, git remote, Ollama
$ scribe commit                      # stage + push the KB to your private remote
$ scribe dream                       # weekly consolidation (Ollama-driven)
$ scribe capture                     # drain queued URLs / iMessage links
$ scribe relations migrate           # classify `related:` into typed edges
$ scribe cron install / uninstall / status

Run scribe --help to see all 38. scribe cron install puts the boring ones on a schedule so you never type them again.

100% Ollama

Run the whole pipeline for $0.

A normal scribe sync has no remaining claude -p callsite. Per-project extraction, two-pass absorb, the weekly Dream cycle, assess, deep, session-mine, relations migrate: every LLM op fires through bounded JSON-envelope subtasks against your local Ollama server. ollama ps shows the work; ollama does it.

7,472 documents
verified end-to-end
~70s
full Dream cycle
$0/sync
zero Anthropic calls
Who it's for

Built for developers who use AI tools every day.

For developers, the expensive half of the job isn't deciding — it's rebuilding the context you already had. scribe automates that half: if your Claude Code or Codex history is already full of decisions, fixes, and library evaluations, it keeps them from evaporating between sessions.

Heavy AI user

You live in Claude Code and Codex.

Your agents keep re-deriving the same answers because each session starts from zero. scribe gives them durable memory.

  • Handshake into CLAUDE.md + AGENTS.md
  • 3×/day session mining via ccrider + Codex rollouts
  • Drop files written back from any project
Multi-project dev

You solve the same problem twice.

One cross-project KB means Friday's repo can pull Monday's fix. Typed edges keep the graph honest as patterns evolve.

  • Auto-discovery across every git repo you've opened
  • Entity-first fan-out — no buried summaries
  • Typed relations: supersedes, contradicts, specializes
Local & private

You want zero API spend.

Run the entire pipeline locally on Ollama. Plain markdown on your own git remote. No SaaS, no cloud sync, no vendor lock-in.

  • One line of YAML flips to 100% Ollama
  • Push to your own GitHub, Gitea, or Forgejo
  • Open in Obsidian, VS Code, vim, or mdbook
In practice

What it actually feels like.

Two real loops from the maintainer's normal use — concrete, not marketing.

Cross-project memory

Evaluated a Phoenix translation library for one app. Months later, started a different Phoenix project with the same problem.

scribe had already absorbed the verdict from the prior project's session — DB-backed Gettext with a LiveView admin UX, weighed against standard .po files and managed services. When Claude Code opened the new repo and asked the KB for translation options, the existing "skip" verdict surfaced first with the reasoning attached. No re-research; the agent cited the prior decision and moved on. The whole loop was invisible — the only thing the maintainer noticed was that the new project skipped the comparison shopping the first one did.

tools/kanta.md · verdict: skip · surfaced via qmd query "phoenix translation library"
Solved twice, written once

Fixed an Oban idempotency bug in project A. Months later the same shape appeared in project B.

The fix from project A — an idempotency-key strategy for an external-call worker — got captured automatically when the post-fix session was mined into the KB. When the same race showed up in a different Phoenix app months later, the agent grepped the KB before guessing, found the prior pattern, and proposed the exact same shape with the prior trade-offs already weighed. The second fix took fifteen minutes instead of an afternoon.

solutions/oban-external-call-worker-idempotency.md · linked from solutions/fly-io-oban-cron-multi-node-double-fire.md
FAQ

Common questions.

How is scribe different from RAG, Obsidian, or claude-memory-compiler?
RAG stores chunks with no curation layer. Obsidian and Notion expect you to write the notes yourself. claude-memory-compiler runs an LLM call on every Claude Code session — one user burned $115 in 20 minutes (issue #3). scribe sits between them: it watches your work and writes the notes for you, but uses BM25 keyword density to skip boilerplate sessions before any LLM call, so cheap sessions cost nothing.
Does scribe require an Anthropic API key?
No. Every LLM op in scribe — per-project extraction, absorb (contextualize, atomic facts, pass-2), dream, assess, deep, session-mine, relations migrate — runs end-to-end against a local Ollama server. There is no remaining claude -p callsite in a normal scribe sync. A single line in scribe.yaml flips the whole pipeline. Per-op overrides still work if you want to keep some passes on Anthropic.
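The "cross-op default plus per-op override" rule is easy to picture in Go. This is a sketch of the lookup logic only, assuming a flattened view of the config; the `Config` struct and `modelFor` method are hypothetical and do not reflect how scribe parses scribe.yaml.

```go
package main

import "fmt"

// Config is a hypothetical, flattened view of the llm and ops settings.
type Config struct {
	Provider string
	Default  string            // cross-op default model
	Ops      map[string]string // op name -> model override
}

// modelFor returns the override for op if one is set, else the default.
func (c Config) modelFor(op string) string {
	if m, ok := c.Ops[op]; ok && m != "" {
		return m
	}
	return c.Default
}

func main() {
	cfg := Config{
		Provider: "ollama",
		Default:  "gemma3:12b",
		Ops: map[string]string{
			"contextualize": "qwen3:30b-a3b",
			"pass2":         "qwen3:30b-a3b",
		},
	}
	for _, op := range []string{"contextualize", "pass2", "dream"} {
		fmt.Printf("%-14s -> %s\n", op, cfg.modelFor(op))
	}
}
```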
What does it cost to run?
Zero on the local-mode path (Ollama) for the entire pipeline: every LLM op in a normal scribe sync, including per-project extraction and the weekly Dream cycle, runs locally. On the Anthropic-hosted path, contextualize costs roughly $0.0001 per article via Claude Haiku; project extraction, pass-2, and dream use Sonnet at standard prices. The triage pre-filter and density scoring never call an LLM, so most session-mining work is free regardless of backend.
Does scribe work on Linux?
Yes. macOS gets LaunchAgents via scribe cron install; Linux gets paste-ready crontab lines from the same command. The fsnotify watcher (scribe watch) is not cron-friendly on either OS — run it under launchd KeepAlive on macOS or systemd-user on Linux. The iMessage capture step is macOS-only because it reads chat.db; everything else is portable.
Where does scribe store the knowledge base?
In a plain git repo of markdown files at whatever path you pass to scribe init. Push it to your own GitHub, Gitea, or Forgejo — there's no SaaS account, no cloud sync, no vendor lock-in. Open it in Obsidian, VS Code, vim, or mdbook.
What does the cron schedule look like?
  • Hourly: KB auto-commit
  • Every 2 hours: scan git repos for new decisions and patterns
  • 3×/day: mine Claude Code sessions via ccrider, plus Codex CLI sessions in that same pass when opted in
  • Every 30 minutes: drain queued URLs
  • Every 4 hours: pull self-iMessaged links
  • Weekly: Dream cycle on Sunday at 02:00 for memory consolidation
  • Continuous: an fsnotify watcher on the ccrider DB for near-real-time session extraction
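For readers who think in cron expressions, here is a Go sketch that prints crontab-style lines for four of those cadences. It is an illustration, not the output of scribe cron install: the exact lines that command emits are not shown on this page. The 3×/day and 4-hour entries are left out because their exact hours and flags are not stated, and pairing each cadence with a bare subcommand is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// job pairs a standard five-field cron spec with a command.
type job struct {
	Spec, Command string
}

// schedule covers only the cadences whose timing is fully stated in the text.
var schedule = []job{
	{"0 * * * *", "scribe commit"},     // hourly auto-commit
	{"0 */2 * * *", "scribe sync"},     // every 2 hours: project extraction
	{"*/30 * * * *", "scribe capture"}, // every 30 minutes: queued URLs
	{"0 2 * * 0", "scribe dream"},      // Sundays at 02:00: Dream cycle
}

// crontab renders jobs as one "spec command" line each.
func crontab(jobs []job) string {
	var b strings.Builder
	for _, j := range jobs {
		fmt.Fprintf(&b, "%s %s\n", j.Spec, j.Command)
	}
	return b.String()
}

func main() {
	fmt.Print(crontab(schedule))
}
```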
Is scribe an alternative to RAG for a personal knowledge base?
Yes. scribe is a compiled knowledge base, not a retrieval pipeline — it writes curated markdown articles into a git repo instead of chunking documents into a vector database, so there are no embeddings to maintain and no vector DB to run. Most lookups are plain-text BM25 matches, which is cheaper and more predictable than vector recall, and the curated wiki stays small enough for an agent to read whole.
How is scribe different from Code Insights, AnythingLLM, or Obsidian?
Code Insights turns your AI coding sessions into an analytics dashboard in a local SQLite database; scribe turns them — plus your git repos and self-sent URLs — into a portable markdown wiki in git that your agents read back before they decide. AnythingLLM is a RAG chat app that needs a vector database and documents you upload; scribe needs neither. Obsidian is a manual notes tool you type into yourself — scribe writes the notes for you.
Is scribe an AnythingLLM alternative?
Yes. scribe is an AnythingLLM alternative for people who want an LLM wiki instead of a RAG server: it's a compiled knowledge base — plain markdown in git, no vector database, no server to run — where AnythingLLM is a RAG chat app built around a vector store and documents you upload. The concrete difference: scribe auto-captures knowledge from your Claude Code and Codex coding sessions into portable markdown your agents read back before they decide, instead of you uploading files to chat with.
Does scribe build a knowledge base from my Claude Code and Codex sessions automatically?
Yes. On cron, scribe mines your Claude Code sessions via ccrider's FTS5 index and your Codex CLI rollouts, scores each session with BM25 keyword density to skip boilerplate before any LLM call, then runs a two-pass absorb that fans dense sessions out into entity-first wiki articles. You set it up once with scribe init and scribe cron install, and the knowledge base grows on its own.
Is scribe local-first, and does it work without an API key?
Yes. The entire pipeline can run 100% locally against an Ollama server with no Anthropic API key — a single line in scribe.yaml flips every LLM op (extraction, absorb, dream, session-mine) to local. Your knowledge base is a plain git repo of markdown on your own machine, with no SaaS account and no cloud sync.
Does scribe have full-text (BM25) search, and does it run on cron?
Yes to both. The knowledge base is indexed by qmd for BM25 keyword search and semantic vector search, and because it's plain markdown you can also grep it from any terminal or query it from inside your agent. The whole pipeline runs unattended on macOS LaunchAgents or Linux cron.

Install in 60 seconds.

One brew install, one scribe init, and your tools start writing the notes for you.