# WIKI_SCHEMA.md — Moon's Wiki Maintenance Rules

This document defines how Moon maintains Bo's personal knowledge base.
Bo reads; Moon writes. Bo sources and asks questions; Moon compiles, cross-references, and maintains.

---

## Directory Layout

```
raw/
  papers/        ← arXiv/bioRxiv PDFs, abstracts, or .md summaries (IMMUTABLE — Moon reads, never modifies)
  articles/      ← web articles clipped to .md
  assets/        ← images, figures referenced by raw sources

wiki/
  index.md       ← master catalog of all wiki pages (Moon updates on every ingest)
  log.md         ← append-only chronological record (Moon appends, never edits past entries)
  overview.md    ← high-level synthesis of the entire knowledge base
  concepts/      ← key ideas, methods, techniques (e.g., perturbation-biology.md, aav-gene-therapy.md)
  entities/      ← people, models, tools, datasets, companies, organisms
  papers/        ← one .md per paper: summary, methods, key findings, connections
  questions/     ← open questions Bo has asked; Moon files answers here
  outputs/       ← analyses, comparisons, slide outlines Moon generates on request
```
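
A minimal sketch of scaffolding this layout, assuming Python is available. The function name `scaffold` and the constant names are hypothetical; only the directory and file names come from the tree above. Existing files are left untouched.

```python
from pathlib import Path

# Directory and file names are taken from the layout above.
RAW_DIRS = ["papers", "articles", "assets"]
WIKI_DIRS = ["concepts", "entities", "papers", "questions", "outputs"]
WIKI_FILES = ["index.md", "log.md", "overview.md"]


def scaffold(root: Path) -> None:
    """Create the raw/ and wiki/ trees under root (hypothetical helper)."""
    for name in RAW_DIRS:
        (root / "raw" / name).mkdir(parents=True, exist_ok=True)
    for name in WIKI_DIRS:
        (root / "wiki" / name).mkdir(parents=True, exist_ok=True)
    for name in WIKI_FILES:
        path = root / "wiki" / name
        # Never overwrite an existing index, log, or overview.
        if not path.exists():
            path.touch()
```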

---

## Wiki Page Format

### Concept page (`concepts/SLUG.md`)
```markdown
# [Concept Name]

**One-line definition:** ...

## What it is
...

## Why it matters
...

## Key methods / variants
...

## Connections
- [[related concept]]
- [[paper that introduced this]]
- [[entity that uses this]]

## Open questions
...

## Sources
- [Paper title](../papers/slug.md) — one-line note
```

### Paper page (`papers/SLUG.md`)
```markdown
# [Paper Title]

**Authors:** ...  
**Year:** ...  
**Venue:** ...  
**DOI/arXiv:** ...  
**Raw source:** `../../raw/papers/filename.md`

## One-sentence summary
...

## Key contribution
...

## Methods
...

## Key findings / numbers
...

## Limitations
...

## Connections
- [[concept it introduces or extends]]
- [[related paper]]
- [[entity: tool/model/dataset used]]

## Bo's notes
_(Moon files Bo's questions or reactions here)_
```

### Entity page (`entities/SLUG.md`)
```markdown
# [Entity Name]

**Type:** person | model | tool | dataset | company | organism  
**Affiliation / context:** ...

## Description
...

## Relevance to Bo's work
...

## Connections
- [[papers by or about this entity]]
- [[concepts this entity is associated with]]
```

---

## Operations

### INGEST
Triggered when Bo drops a source into `raw/` or shares a URL/paper.

1. Read the source
2. Discuss key takeaways with Bo if needed (or proceed autonomously for routine ingest)
3. Create `wiki/papers/SLUG.md` (or `concepts/` / `entities/` as appropriate)
4. Update `wiki/index.md` — add entry with link + one-line summary
5. Update `wiki/overview.md` — revise if the source meaningfully changes the synthesis
6. Update 3–15 related existing wiki pages (add connections, revise claims the new source affects)
7. Append to `wiki/log.md`:
   ```
   ## [YYYY-MM-DD] ingest | [Source Title]
   Pages created: X. Pages updated: Y. Notes: ...
   ```
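
Step 7 could be sketched as follows. This is an illustration under assumptions, not a required implementation: `format_log_entry` and `append_log` are hypothetical names. Opening the file in append mode keeps past entries untouched, which matches the append-only rule for `log.md`.

```python
from datetime import date
from pathlib import Path


def format_log_entry(entry_type: str, title: str, body: str, day: date) -> str:
    """Build one log entry in the '## [YYYY-MM-DD] type | title' format."""
    return f"## [{day.isoformat()}] {entry_type} | {title}\n{body}\n"


def append_log(log_path: Path, entry: str) -> None:
    """Append an entry to the log; mode 'a' never rewrites earlier content."""
    with log_path.open("a", encoding="utf-8") as f:
        # A leading blank line keeps entries visually separated.
        f.write("\n" + entry)
```

For example, `format_log_entry("ingest", "Example Paper", "Pages created: 1. Pages updated: 4. Notes: none.", date.today())` produces an entry ready to pass to `append_log`; the title and counts here are placeholders.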

### QUERY
Triggered when Bo asks a question about the wiki domain.

1. Read `wiki/index.md` to find relevant pages
2. Read those pages
3. Synthesize answer with citations to wiki pages
4. If the answer is non-trivial → write it to `wiki/outputs/YYYY-MM-DD-slug.md` and add a row for it to `wiki/index.md`
5. Append a `query` entry to `wiki/log.md`
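
One way to derive the output filename in step 4, as a sketch. `slugify` and `output_path` are hypothetical helpers, and the slug rule (ASCII letters and digits only, everything else collapsed to a hyphen) is an assumption; non-ASCII characters are dropped.

```python
import re
from datetime import date
from pathlib import PurePosixPath


def slugify(text: str) -> str:
    """Lowercase the text and collapse every non-alphanumeric run to one hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def output_path(question: str, day: date) -> PurePosixPath:
    """Return wiki/outputs/YYYY-MM-DD-slug.md for a question asked on the given day."""
    return PurePosixPath("wiki/outputs") / f"{day.isoformat()}-{slugify(question)}.md"


print(output_path("How does Perturb-seq work?", date(2025, 3, 4)))
# prints wiki/outputs/2025-03-04-how-does-perturb-seq-work.md
```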

### LINT
Triggered periodically or on request ("health check the wiki").

Look for:
- Contradictions between pages (flag in a lint report)
- Stale claims superseded by newer sources
- Orphan pages (no inbound links from other pages)
- Concepts mentioned in multiple pages but lacking their own concept page
- Missing cross-references
- Data gaps fillable by web search
- New article/question candidates

Write lint report to `wiki/outputs/YYYY-MM-DD-lint.md`.
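
The orphan-page check is the most mechanical item on this list and could be sketched as below. Assumptions: only relative Markdown links of the form `[text](path.md)` are followed (not `[[wikilinks]]`), and the top-level `index.md`, `log.md`, and `overview.md` are skipped both as link sources and as orphan candidates, since the index links to every page by design. `find_orphans` is a hypothetical name.

```python
import re
from pathlib import Path
from typing import List, Set

# Matches the target of [text](some/path.md) or [text](some/path.md#anchor).
LINK_RE = re.compile(r"\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")


def find_orphans(wiki_root: Path) -> List[str]:
    """Return pages in wiki subdirectories that no other such page links to."""
    wiki_root = wiki_root.resolve()
    # Assumption: top-level files (index, log, overview) are not ordinary pages.
    pages = {p for p in wiki_root.rglob("*.md") if p.parent != wiki_root}
    linked: Set[Path] = set()
    for page in pages:
        text = page.read_text(encoding="utf-8")
        for target in LINK_RE.findall(text):
            resolved = (page.parent / target).resolve()
            if resolved != page:  # a self-link is not an inbound link
                linked.add(resolved)
    return sorted(p.relative_to(wiki_root).as_posix() for p in pages - linked)
```

The returned paths are relative to `wiki/`, so they can be pasted directly into the lint report.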

---

## Conventions

- All filenames: lowercase, hyphen-separated (e.g., `perturbation-biology.md`)
- Internal links use relative paths: `../concepts/perturbation-biology.md`
- Cross-references are listed under `## Connections` on every page
- Bo's reactions/questions stored under `## Bo's notes` in the relevant paper page
- `log.md` entries: always prefixed `## [YYYY-MM-DD] <type> | <title>` for greppability
- `index.md`: one row per page, format `| [Title](path) | type | one-line summary |`
- `overview.md`: Moon rewrites this in full after every 5 ingests or on request; smaller revisions happen during INGEST (step 5)
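
Several of these conventions are easy to check mechanically. The sketch below is illustrative only: the helper names are hypothetical, the regular expressions are one reading of the rules above, and escaping `|` inside a summary is an added assumption to keep the table row intact.

```python
import re
from typing import Optional, Tuple

FILENAME_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*\.md$")
LOG_HEADER_RE = re.compile(r"^## \[(\d{4}-\d{2}-\d{2})\] (\w+) \| (.+)$")


def is_valid_filename(name: str) -> bool:
    """True for lowercase, hyphen-separated .md filenames."""
    return FILENAME_RE.match(name) is not None


def index_row(title: str, path: str, page_type: str, summary: str) -> str:
    """Format one index.md row; pipes in the summary are escaped (assumption)."""
    safe_summary = summary.replace("|", "\\|")
    return f"| [{title}]({path}) | {page_type} | {safe_summary} |"


def parse_log_header(line: str) -> Optional[Tuple[str, str, str]]:
    """Split a log header into (date, type, title), or None if it does not match."""
    m = LOG_HEADER_RE.match(line)
    return (m.group(1), m.group(2), m.group(3)) if m else None
```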

---

## Current Focus Domain

**Primary:** Virtual Cell, perturbation biology, single-cell genomics, AI for drug discovery  
**Secondary:** Gene therapy, longevity biology, multi-agent AI systems

Prioritize sources in these domains. When ingesting outside these domains, note the connection explicitly.
