Brain treats your vault as a typed knowledge graph, not a bag of chunks. Documents have types (note, decision, domain, reflection, observation), links have types (semantic, tag-overlap, derived-from), and every retrieval result is uplifted by its position in that graph. This page covers the operations you use to build, maintain, and grade the graph.
Link generation
brain link proposes links automatically from two signals:
- Tag overlap — documents sharing 2+ tags are candidate links.
- Semantic similarity — documents with cosine similarity above threshold are candidate links.
Proposed links are stored in the brain_links table with a link type and a confidence score, and can also be materialized back into the markdown files as wiki-style [[document]] references (opt-in via --apply). Obsidian users get free bidirectional navigation; git users get a diff of which links were added.
Link generation is the single biggest driver of retrieval quality over time. A 500-doc vault with dense linking beats a 5,000-doc vault with no graph structure.
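To make the two signals concrete, here is a minimal sketch of candidate generation. It is not Brain's implementation: the function name `propose_links`, the input shape, and the 0.8 similarity threshold are all assumptions made for illustration (the page does not state the actual threshold).

```python
import math
from itertools import combinations


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def propose_links(docs, sim_threshold=0.8, min_shared_tags=2):
    """Return (doc_a, doc_b, link_type) candidates.

    docs: {doc_id: {"tags": set of str, "embedding": list of float}}
    sim_threshold is a hypothetical value chosen for this sketch.
    """
    proposals = []
    for a, b in combinations(sorted(docs), 2):
        # Signal 1: documents sharing 2+ tags.
        if len(docs[a]["tags"] & docs[b]["tags"]) >= min_shared_tags:
            proposals.append((a, b, "tag-overlap"))
        # Signal 2: embedding similarity above the threshold.
        if cosine(docs[a]["embedding"], docs[b]["embedding"]) > sim_threshold:
            proposals.append((a, b, "semantic"))
    return proposals
```

A pair can be proposed by both signals, in which case this sketch emits one candidate per link type. The all-pairs loop is quadratic, so a real vault would likely need an approximate nearest-neighbour index for the semantic signal.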
Link boost in retrieval
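As a rough illustration of the pass this section covers (+0.15 per linked hit, capped at +0.45 per document), the following sketch applies a link boost to a set of base scores. The function name, the `top_n` default, the direction in which links are followed, and the choice to let a linked document outside the candidate set enter with a base score of 0 are all assumptions, not documented behaviour.

```python
def apply_link_boost(scores, links, top_n=10, boost=0.15, cap=0.45):
    """Re-rank results after a link-boost pass.

    scores: {doc_id: base retrieval score}
    links:  {doc_id: set of doc_ids it links to}
    """
    # Only the top-N hits by base score hand out boosts.
    top_hits = sorted(scores, key=scores.get, reverse=True)[:top_n]

    uplift = {}
    for hit in top_hits:
        for neighbor in links.get(hit, ()):
            # Each linked hit adds `boost`, never exceeding `cap` in total.
            uplift[neighbor] = min(cap, uplift.get(neighbor, 0.0) + boost)

    boosted = dict(scores)
    for doc, extra in uplift.items():
        # Assumption: documents missing from `scores` start from 0.0.
        boosted[doc] = boosted.get(doc, 0.0) + extra

    return sorted(boosted.items(), key=lambda kv: kv[1], reverse=True)
```

With these defaults a document reaches the cap once three top hits link to it; a fourth linked hit adds nothing further.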
At query time, the retrieval pipeline runs a link-boost pass over the top-N results. For each hit, it uplifts any linked document’s score by +0.15, capped at +0.45 total per document. This means a weak match that’s linked to a strong match gets pulled into the result set, mimicking how a human researcher follows citations. See Retrieval for the scoring detail.
Graph analysis
brain graph reports the structural health of the graph:
- Orphans — documents with zero in-links. Candidates for deletion or re-linking.
- Connected components — isolated subgraphs. Often indicates unrelated topics should be split into separate brains.
- Link-degree distribution — hub-and-spoke vs. mesh structure.
- Authority scores — PageRank-style scoring over the link graph.
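The first two metrics in the list above can be computed with a few lines of plain Python. This is a sketch under assumptions, not the brain graph implementation: `graph_report` and its input shape are hypothetical, links are treated as undirected when finding components, and degree distribution and PageRank-style authority are left out for brevity.

```python
from collections import defaultdict


def graph_report(doc_ids, links):
    """Report orphans and connected components.

    doc_ids: iterable of document ids
    links:   list of (source, target) pairs; both ends must be in doc_ids
    """
    in_degree = {d: 0 for d in doc_ids}
    neighbors = defaultdict(set)
    for src, dst in links:
        in_degree[dst] += 1
        # Components ignore link direction.
        neighbors[src].add(dst)
        neighbors[dst].add(src)

    # Orphans: documents that nothing links to.
    orphans = sorted(d for d, n in in_degree.items() if n == 0)

    # Connected components via iterative depth-first search.
    seen, components = set(), []
    for start in in_degree:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(sorted(component))

    return {"orphans": orphans, "components": components}
```

Note that an orphan is not necessarily isolated: a document with outbound links but no in-links still belongs to a larger component.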
Deduplication
brain dedup finds near-duplicate documents — the copy-paste-from-Slack pattern that silently bloats vector stores:
- Groups of documents with ≥85% semantic similarity.
- For each group: canonical pick (most-linked, highest-authority), and the others flagged for merge or deletion.
- Audit trail written to _reflections/dedup-<date>.md so you can review before applying.
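The grouping and canonical-pick steps above might look like the following sketch. It assumes embeddings are already available, uses in-link count alone as the "most-linked" criterion, and groups transitively with a small union-find; the function name and all of these choices are assumptions rather than documented behaviour of brain dedup.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_duplicate_groups(embeddings, in_link_counts, threshold=0.85):
    """embeddings: {doc_id: vector}; in_link_counts: {doc_id: int}."""
    parent = {d: d for d in embeddings}

    def find(d):
        # Union-find root lookup with path halving.
        while parent[d] != d:
            parent[d] = parent[parent[d]]
            d = parent[d]
        return d

    ids = sorted(embeddings)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if cosine(embeddings[a], embeddings[b]) >= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for d in ids:
        groups.setdefault(find(d), []).append(d)

    report = []
    for members in groups.values():
        if len(members) < 2:
            continue  # not a duplicate group
        # Canonical pick: the most-linked member (assumed tie-break: first by id).
        canonical = max(members, key=lambda d: in_link_counts.get(d, 0))
        report.append({
            "canonical": canonical,
            "flagged": [d for d in members if d != canonical],
        })
    return report
```

Transitive grouping means A and C can land in the same group through B even if A and C themselves fall below the threshold, which is one reason to read the audit file before applying merges.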
Health checks
brain health lints the vault and reports structural issues:
- Missing embeddings (content changed but not re-embedded).
- Stale chunks (chunk config changed since ingest).
- Broken [[links]] (target doc removed or renamed).
- Frontmatter contract violations (e.g., document declares confidence: high but has no sources).
- Empty or under-populated domains.
Run brain health --fix to apply the auto-fixable subset (re-embed, re-chunk, repair obvious link typos).
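Two of the checks above, broken [[links]] and the confidence/sources contract, are simple enough to sketch. This is an illustration only: `lint_document` is a hypothetical helper, frontmatter is assumed to be parsed into a dict already, and the `sources` key name is an assumption (the page only says a high-confidence document must have sources).

```python
import re

# Captures the target of [[target]], [[target|label]] or [[target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)")


def lint_document(name, frontmatter, body, known_docs):
    """Return a list of human-readable issues for one document."""
    issues = []

    # Broken links: the target is not a known document name.
    for target in WIKILINK.findall(body):
        target = target.strip()
        if target not in known_docs:
            issues.append(f"{name}: broken link [[{target}]]")

    # Frontmatter contract: high confidence requires at least one source.
    if frontmatter.get("confidence") == "high" and not frontmatter.get("sources"):
        issues.append(f"{name}: confidence is high but no sources are listed")

    return issues
```

Checks like missing embeddings and stale chunks depend on ingest state that this sketch does not model.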
Quality scoring
brain score returns a 4-dimension quality grade (A–F):
| Dimension | What it measures |
|---|---|
| Content | Coverage (domain breadth), depth (avg. doc length, citations), freshness (staleness of top docs) |
| Structure | Link density, orphan percentage, domain balance |
| Retrieval | Sweep-grade against the seeded question set — does search actually find the right docs? |
| Hygiene | Dedup debt, health-check pass rate, frontmatter-contract compliance |
A B- on retrieval alongside an A on content is telling you: “your content is great, but your retrieval config doesn’t find it.” Run brain sweep next.
Decisions
brain decisions surfaces, edits, and applies strategy decisions recorded over time.
Decisions are stored as files in _decisions/, so they are auditable, diff-able, and git-committable. brain auto-kb writes to this directory automatically; you can also edit the files by hand and run brain decisions apply to re-materialize the config.
Forgetting
brain forget removes documents with a full audit trail:
- Removed from brain_documents, brain_chunks, and brain_links.
- Logged to _reflections/forget-<date>.md with the reason and the content hash (so you can re-ingest if you change your mind).
- Their outbound links re-scored against remaining docs.
A weekly graph-ops loop
For a brain that’s used actively but not via auto-kb:
Monday: brain health + fix auto-fixable
brain health --fix --brain my-brain: keeps embeddings and chunks current.
Wednesday: link pass
brain link --brain my-brain --apply: integrates new content accumulated since the last link run.
Friday: dedup + forget review
brain dedup --brain my-brain --threshold 0.85: read the audit, then apply the merges you agree with.
Or run brain auto-kb --rounds 1 once a week instead.
What’s next
Closed-loop intelligence
The one-command autonomous version of everything on this page.
Retrieval
How the graph operations above lift retrieval quality at query time.