feat: Introduce pull-based cross-genome reference mechanism

Matteo Cherubini 2026-06-09 19:43:47 +02:00
parent 5ad338c5bf
commit 8fb0890622
2 changed files with 29 additions and 13 deletions

@@ -955,18 +955,25 @@ Pages have a `last_updated` field in frontmatter. During lint passes:
The agent proposes re-validation but does not change `maturity` without new source evidence.
### Cross-genome lint
### Cross-genome references
A manual, monthly operation. Not automated in CI/CD — the token cost and coordination
complexity are not justified at this scale.
Cross-domain knowledge moves by **pull, never push**: the genome you are working in draws
material _in_; nothing is ever written into another genome. There are **no cross-genome
wikilinks** — submodule pointers make relative paths brittle.
1. Operator initiates a master-repo agent session
2. Agent uses `qmd search "<concept>"` across the multi-genome index to find:
- Concepts defined in 2+ genomes with potentially conflicting definitions
- Entities referenced cross-genome without canonical cross-genome wikilinks
- Concepts in genome-X that should link to genome-Y
3. Agent reports findings — does not modify files
4. For each finding: create conflict note in the genome where resolution belongs
When the working genome needs a concept that lives elsewhere, the **navigation skill** handles
it in the same two-phase shape as ingest:
1. A deterministic collector clones the relevant genomes **read-only at HEAD** (fresh — never the
pinned submodule state) and assembles a dossier of excerpts with provenance.
2. A semantic pass reads only that dossier; the skill then deposits **one** abstract, non-private
raw into the working genome at `raw/articles/crossgen-<topic>-<date>.md`.
3. That raw goes through the working genome's normal ingest → PR → human gate, like any source.
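The deposit step in the list above can be sketched as a small path builder. This is an illustration only: the helper name `crossgen_raw_path`, the slug rules, and the ISO `YYYY-MM-DD` date format are assumptions, since the spec fixes only the `raw/articles/crossgen-<topic>-<date>.md` shape.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the deposit path for a pulled-in raw.
# Slug rules and the ISO date format are assumptions, not part of the spec.
crossgen_raw_path() {
  local topic="$1"
  local date="${2:-$(date +%F)}"   # default to today's date, YYYY-MM-DD
  local slug
  # Lowercase, collapse runs of non-alphanumerics into one hyphen, trim the edges.
  slug=$(printf '%s' "$topic" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  printf 'raw/articles/crossgen-%s-%s.md\n' "$slug" "$date"
}
```

Called as `crossgen_raw_path "Rate Limiting" 2026-06-09`, the helper yields a path relative to the working genome root, so the raw always lands in this genome and never in the one it was read from.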
Which genomes may be read as **sources** is gated by a per-genome `cross_source: yes|no` flag: a
confidential genome (e.g. a client file) is marked `no` and is never read as a source — the wall
is structural, not a matter of the agent's discipline. The master `AGENTS.md` holds the full
boundary contract.
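A minimal sketch of how a collector might enforce the `cross_source` gate before cloning anything. Where the flag is stored is not stated here, so the function takes the path of whatever file carries it; the function name `may_read_as_source` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical gate: succeed only when the genome explicitly opts in.
# The file that holds the flag is an assumption; pass its path as $1.
may_read_as_source() {
  local config="$1"
  [[ -f "$config" ]] || return 1   # no flag file: fail closed
  local value
  # Take the first well-formed `cross_source: yes|no` line, ignore anything else.
  value=$(sed -nE 's/^cross_source:[[:space:]]*(yes|no)[[:space:]]*$/\1/p' "$config" | head -n 1)
  [[ "$value" == "yes" ]]
}
```

The sketch fails closed: a missing file, a missing flag, or a malformed value is treated the same as `no`, so a genome becomes readable as a source only through an explicit `yes`.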
---
@@ -1025,7 +1032,7 @@ and keep the wiki atomically navigable.
### Linking conventions
- **Intra-genome:** `[[folder/file]]` — Obsidian wikilinks only.
- **Cross-genome:** NOT supported via wikilink. Submodule pointers make relative paths brittle. When a concept belongs to another genome, use the navigation skill to emit a raw stub into that genome's `raw/articles/` directory so its local ingest pipeline can process it.
- **Cross-genome:** NOT supported via wikilink — submodule pointers make relative paths brittle. When the working genome needs a concept that lives elsewhere, the navigation skill **pulls it in** as one abstract raw under _this_ genome's `raw/articles/`, which then goes through normal ingest. See [Cross-genome references](#cross-genome-references).
- **External:** `[text](https://...)` — standard Markdown.
### Log format

@@ -190,12 +190,21 @@ check_broken_links() {
local links
links=$(grep -oE '\[\[[^\]]+' "$file" 2>/dev/null | sed 's/^\[\[//' | cut -d'|' -f1)
for link in $links; do
# Cross-genome links (../other-genome/…) are not resolvable from a single
# genome checkout and are skipped — they would always fall
# through the two-level lookup and produce non-actionable warnings.
while IFS= read -r link; do
[[ -z "$link" ]] && continue
if [[ "$link" == ../* ]]; then
continue
fi
local target="$link"
[[ "$target" != *.md ]] && target="${target}.md"
if [[ ! -f "${base_dir}/${target}" && ! -f "${base_dir}/../${target}" ]]; then
warn "Potential broken link: [[$link]] in $file"
fi
done
done <<< "$links"
}
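The switch from `for link in $links` to `while IFS= read -r link` matters when a link target contains a space. The standalone sketch below uses made-up link targets to show the difference, and is not part of the script itself.

```shell
#!/usr/bin/env bash
# Illustrative input: one link target per line, the first containing a space.
links=$'concepts/rate limiting\n../other-genome/entities/acme\nentities/acme'

count_for=0
for link in $links; do          # unquoted expansion splits on spaces and newlines
  count_for=$((count_for + 1))
done

count_lines=0
count_checked=0
while IFS= read -r link; do     # reads exactly one line per iteration
  [[ -z "$link" ]] && continue
  count_lines=$((count_lines + 1))
  [[ "$link" == ../* ]] && continue   # cross-genome path: skipped, as in the diff
  count_checked=$((count_checked + 1))
done <<< "$links"

echo "for iterations: $count_for, lines read: $count_lines, links checked: $count_checked"
```

The `for` loop sees four words because `concepts/rate limiting` is split in two, while the `while read` loop sees the three real targets and checks two of them after skipping the `../` path.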