From c0aff1ff098b21d48eda075bea540d219586db0c Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 14:56:13 +0200
Subject: [PATCH 1/9] chore: standardize initial branch name to 'main' in setup scripts

---
 scripts/setup-genomes.sh | 3 +++
 scripts/setup-master.sh  | 2 ++
 2 files changed, 5 insertions(+)

diff --git a/scripts/setup-genomes.sh b/scripts/setup-genomes.sh
index c6c7975..b30af18 100644
--- a/scripts/setup-genomes.sh
+++ b/scripts/setup-genomes.sh
@@ -48,6 +48,9 @@ for entry in "${GENOMES[@]}"; do
     # Initial genome push
     git add .
     git commit -m "feat: initial scaffold and git-crypt init for ${GENOME_NAME}"
+
+    git branch -M main
+
     git push -u origin main
 
     # Key export and instructions
diff --git a/scripts/setup-master.sh b/scripts/setup-master.sh
index 181c01e..66fe82a 100644
--- a/scripts/setup-master.sh
+++ b/scripts/setup-master.sh
@@ -37,5 +37,7 @@ scaffold_master "."
 git add .
 git commit -m "chore: initialize master scaffold" || info "No changes to commit in master."
 
+git branch -M main
+
 # 3. Initial Push
 git push -u origin main

From e8dea9c8bcbf00433af539cdd3db69f2581bf34c Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 15:27:58 +0200
Subject: [PATCH 2/9] feat: introduce two-phase Ingest process with manifest

---
 templates/agents-genome.md | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/templates/agents-genome.md b/templates/agents-genome.md
index 9c12ba7..df4c38c 100644
--- a/templates/agents-genome.md
+++ b/templates/agents-genome.md
@@ -89,18 +89,23 @@ Execute in this order before any file operation:
 
 _Triggered by new file in `raw/`._
 
+**Phase 1 — Semantic Pass (Agent Skill)**
 1. Read source once.
 2. Create `wiki/sources/.md` — summary + key points.
 3. Per entity (person, tool, org): create or update `wiki/entities/.md`.
 4. Per concept (pattern, theory, decision): create or update `wiki/concepts/.md`.
 5. Check each touched page for contradictions → apply §Conflict if found.
-6. Append entry to `wiki/index.md` (bottom of relevant section).
-7. Append log entry: `INGEST | `.
-8. Run scoped lint on pages created or modified in this session. Report issues in PR description. Do not auto-fix.
-9. Commit on `feat/ai-ingest-`. Open PR using `templates/pr-description.md`.
+6. **Final action:** Write `.ingest-manifest.json` at the genome root.
+7. **STOP.** Do not proceed to index, log, lint, commit, or PR — these are Phase 2.
+
+**Phase 2 — Deterministic Post-Processing (`run-ingest.sh`)**
+_Executed automatically by the orchestrator after Phase 1._
+8. Append entry to `wiki/index.md` (bottom of relevant section).
+9. Append log entry: `INGEST | `.
+10. Run scoped lint on pages created or modified in this session. Report issues in PR description. Do not auto-fix.
+11. Commit on `feat/ai-ingest-`. Open PR using `templates/pr-description.md`.
 
 _Private source_ (`PRIVATE_CONTEXT: enabled` required):
-
 - All output → `wiki/private/.md` only.
 - PR title: `[PRIVATE] ingest: `.
 

From 41afbf3548650eca7209095faaf82f1237b6022a Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 15:27:58 +0200
Subject: [PATCH 3/9] docs: clarify general Ingest rules reflect automated steps

---
 templates/agents-genome.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/templates/agents-genome.md b/templates/agents-genome.md
index df4c38c..f9446d9 100644
--- a/templates/agents-genome.md
+++ b/templates/agents-genome.md
@@ -47,12 +47,12 @@ Session end or return to `disabled`: remind operator to run `git-crypt lock` on
 
 1. `raw/` is read-only. Never create, modify, or delete files in `raw/`.
 2. `wiki/` is agent-owned. Create, update, and maintain all wiki pages here.
-3. Every operation → one log entry appended to `wiki/log.md` (§Log).
-4. Every new page → one entry appended to `wiki/index.md` (§Index).
+3. Every operation → one log entry appended to `wiki/log.md` (§Log) (automated via manifest during Ingest).
+4. Every new page → one entry appended to `wiki/index.md` (§Index) (automated via manifest during Ingest).
 5. Never commit to `main`. Branch per task; PR required; no self-merge.
 6. Contradict, don't overwrite. New evidence contradicts existing claim → §Conflict.
 7. Never commit plaintext to any path marked for encryption in `.gitattributes`.
-8. Every PR must use `templates/pr-description.md`. Do not omit the tabular summary.
+8. Every PR must use `templates/pr-description.md`. Do not omit the tabular summary (automated via run-ingest.sh during Ingest).
 
 ### NEVER
 

From f3e57c63459846635816fafab00e251379a48e64 Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 15:27:58 +0200
Subject: [PATCH 4/9] docs: add specific notes for skill-mode automated index and log entries

---
 templates/agents-genome.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/templates/agents-genome.md b/templates/agents-genome.md
index f9446d9..be09625 100644
--- a/templates/agents-genome.md
+++ b/templates/agents-genome.md
@@ -171,6 +171,8 @@ private: true | false
 
 ### Index entries
 
+> **Skill mode:** auto-generated by `run-ingest.sh` from manifest. Below applies to manual workflows only.
+
 Append at bottom of relevant section in `wiki/index.md`:
 
 ```
@@ -181,6 +183,8 @@ Never reorder. Alphabetical sort is handled by the pre-commit hook.
 
 ### Log entries
 
+> **Skill mode:** auto-generated by `run-ingest.sh` from manifest. Below applies to manual workflows only.
+
 Append one entry per operation to `wiki/log.md`:
 
 ```markdown

From ae07a676d0961b7f706596d4edc423e00ddbc866 Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 17:02:58 +0200
Subject: [PATCH 5/9] refactor: Rename 'knowledge-genome-setup' to 'knowledge-genome-orchestrator'

---
 lib/git-crypt.sh                     | 2 +-
 registry.sh                          | 2 +-
 skills/ingest/scripts/scoped-lint.sh | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/git-crypt.sh b/lib/git-crypt.sh
index f35a342..7972620 100644
--- a/lib/git-crypt.sh
+++ b/lib/git-crypt.sh
@@ -55,7 +55,7 @@ gcrypt_verify() {
 #
 # USAGE:
 # source lib/git-crypt.sh
-# cd ~/knowledge-genome-setup/genome-dev
+# cd ~/knowledge-genome-orchestrator/genome-dev
 # gcrypt_rotate_key "genome-dev"
 #
 # REQUIRES:
diff --git a/registry.sh b/registry.sh
index 5ae3173..596c462 100644
--- a/registry.sh
+++ b/registry.sh
@@ -12,7 +12,7 @@ _REGISTRY_LOADED=1
 PROJECT_ROOT="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 
 # Dynamic paths
-WORK_DIR="${HOME}/knowledge-genome-setup"
+WORK_DIR="${HOME}/knowledge-genome-orchestrator"
 KEYS_DIR="${WORK_DIR}/keys"
 TEMPLATES_DIR="${PROJECT_ROOT}/templates"
 LIB_DIR="${PROJECT_ROOT}/lib"
diff --git a/skills/ingest/scripts/scoped-lint.sh b/skills/ingest/scripts/scoped-lint.sh
index ded50a1..5fb12e9 100644
--- a/skills/ingest/scripts/scoped-lint.sh
+++ b/skills/ingest/scripts/scoped-lint.sh
@@ -4,7 +4,7 @@
 # Run the framework's validation on ONLY the files touched this session.
 # Reuses lib/lint.sh + lib/output.sh — same checks as `make lint`, scoped.
 #
-# KG_LIB_DIR=/opt/knowledge-genome-setup/lib \
+# KG_LIB_DIR=/opt/knowledge-genome-orchestrator/lib \
 # scoped-lint.sh wiki/sources/x.md wiki/entities/y.md
 #
 # Exits non-zero if any hard error is found, so the agent notices.

From abbf7362d9b7ffe6b4efd86f29750330b7caccfa Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 17:02:58 +0200
Subject: [PATCH 6/9] feat(ingest): Clarify Python3 dependency and agent sorting roles

---
 lib/deps.sh                | 4 ++++
 templates/agents-genome.md | 2 +-
 templates/wiki-index.md    | 4 ++--
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/lib/deps.sh b/lib/deps.sh
index 5e46a86..841eefd 100644
--- a/lib/deps.sh
+++ b/lib/deps.sh
@@ -27,6 +27,10 @@ check_deps() {
     if ! command -v bw &>/dev/null; then
         warn "Optional tool 'bw' (Bitwarden CLI) not found. Vaultwarden integration will be manual."
     fi
+
+    if ! command -v python3 &>/dev/null; then
+        warn "Optional tool 'python3' not found. Needed for 'make test' and the ingest skill (index-append.py), not for setup."
+    fi
 }
 
 check_git_identity() {
diff --git a/templates/agents-genome.md b/templates/agents-genome.md
index be09625..dc1a43b 100644
--- a/templates/agents-genome.md
+++ b/templates/agents-genome.md
@@ -179,7 +179,7 @@ Append at bottom of relevant section in `wiki/index.md`:
 - [[folder/slug]] — One-line summary. `maturity: draft`
 ```
 
-Never reorder. Alphabetical sort is handled by the pre-commit hook.
+Never reorder. Alphabetical sorting is handled by the post-processor (index-append.py); the pre-commit hook only enforces the security policy.
 
 ### Log entries
 
diff --git a/templates/wiki-index.md b/templates/wiki-index.md
index 809aaaa..c9456f9 100644
--- a/templates/wiki-index.md
+++ b/templates/wiki-index.md
@@ -12,9 +12,9 @@ private: false
 **[AGENT INSTRUCTION]** This is the primary navigation file.
 Read it first on every session before accessing individual pages.
 Append new entries at the bottom of the relevant section — do not reorder or rewrite sections.
-Alphabetical sorting is handled automatically by the pre-commit hook.
+Alphabetical sorting is handled by the post-processor (index-append.py); the pre-commit hook only enforces the security policy.
 Update `last_updated` in the YAML frontmatter on every edit.
 
-Entry format: `- [[folder/slug]] — One-line summary. \`maturity: \``
+Entry format: `- [[folder/slug]] — One-line summary. \`maturity: <value>\``
 
 ---

From 2c38d04d7fddc246af48c2a0c79ed33a28fd89dc Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 17:02:58 +0200
Subject: [PATCH 7/9] feat(cross-genome): Implement pull-based navigation skill and policy

---
 README.md                  | 11 ++----
 templates/agents-master.md | 73 ++++++++++++++++++++++++++++++++------
 2 files changed, 65 insertions(+), 19 deletions(-)

diff --git a/README.md b/README.md
index c0521a3..a5c5df6 100644
--- a/README.md
+++ b/README.md
@@ -1020,14 +1020,9 @@ and keep the wiki atomically navigable.
 
 ### Linking conventions
 
-| Type | Format |
-| ---------------------- | ------------------------------------------- |
-| Internal (same genome) | `[[folder/slug]]` — Obsidian wikilinks only |
-| Cross-genome | `[[../genome-target/wiki/folder/slug]]` |
-| External | `[text](https://url)` — standard Markdown |
-
-Never use `[text](relative/path)` for internal references. Obsidian wikilinks are
-bidirectional and appear in the graph view.
+- **Intra-genome:** `[[folder/file]]` — Obsidian wikilinks only.
+- **Cross-genome:** NOT supported via wikilink. Submodule pointers make relative paths brittle. When a concept belongs to another genome, use the navigation skill to emit a raw stub into that genome's `raw/articles/` directory so its local ingest pipeline can process it.
+- **External:** `[text](https://...)` — standard Markdown.
 
 ### Log format
 
diff --git a/templates/agents-master.md b/templates/agents-master.md
index 8183773..127a5ef 100644
--- a/templates/agents-master.md
+++ b/templates/agents-master.md
@@ -9,7 +9,7 @@
 | Remote | `{{FORGEJO_URL}}/{{FORGEJO_USER}}/{{MASTER_REPO}}` |
 
 **Role:** Cross-genome coordinator for the Knowledge Genome network.
-**Metrics:** no cross-genome boundary violations · submodule pointers current · cross-genome wikilinks valid · no private data outside local network.
+**Metrics:** no cross-genome boundary violations · submodule pointers current · cross-genome discoveries routed to target raw/ · zero stale submodule-relative wikilinks.
 
 ---
 
@@ -50,7 +50,7 @@ Genome-level operations are governed by the genome's `AGENTS.md`, not this file.
 
 1. Operate within ONE genome at a time. No atomic commits across multiple genomes.
 2. `core-karpathy` is read-only. Never commit to it.
-3. Cross-genome references use relative wikilinks only: `[[../genome-target/wiki/folder/page]]`.
+3. Cross-genome references are NEVER expressed as wikilinks. When a concept belongs to another genome, use the navigation skill to emit a raw stub into that genome's `raw/articles/` and let its own ingest pipeline handle it asynchronously.
 4. Never commit to `main` in any genome. PRs required; no self-merge.
 5. Per-genome `AGENTS.md` governs all wiki operations within that genome. This file governs boundaries only.
 
@@ -59,6 +59,7 @@ Genome-level operations are governed by the genome's `AGENTS.md`, not this file.
 - Load multiple `wiki/index.md` files simultaneously for cross-genome comparison — use qmd.
 - Run `git-crypt`, `bw`, or Vaultwarden commands — host responsibility.
 - Modify files in more than one genome in the same operation.
+- Create cross-genome wikilinks (e.g., `[[../genome-*/wiki/...]]`). All cross-domain connections must be routed via the navigation skill as raw stubs.
 - Modify `core-karpathy` in any way.
 
 ### ASK FIRST
@@ -79,14 +80,64 @@ Genome-level operations are governed by the genome's `AGENTS.md`, not this file.
 
 ---
 
-## Cross-Genome Lint
+## Cross-Genome Pull (Navigation Skill)
 
-_Manual, monthly — requires operator initiation. Not automated._
+Cross-genome knowledge moves by **pull, never push**: the genome you are working in draws material *in*; nothing is ever written into another genome. The cross-genome reading is performed by a deterministic collector **outside any agent's context**, so the agent still operates within ONE genome (Immutable Rule 1 holds). The `cross_source` registry flag decides which genomes may be read as sources.
 
-1. Use `qmd search ""` to find pages covering the same concept across genomes.
-2. Identify:
-   - Concepts defined in 2+ genomes with potentially conflicting definitions.
-   - Entities referenced across genomes without a canonical cross-genome wikilink.
-   - Concepts in genome-X that should link to genome-Y but don't.
-3. Report findings. Do not modify any files.
-4. For each finding: create a conflict note in the genome where resolution belongs, following that genome's §Conflict procedure.
+### How it works
+
+Three actors, mirroring the ingest two-phase split:
+
+1. **Collector** (`collect-crossgen.sh`, deterministic, agent-free). Clones each genome flagged `cross_source: yes` **read-only at its remote HEAD** — a disposable checkout, for freshness; never the pinned submodule state. Reads each `wiki/index.md` plus the relevant pages and assembles a **dossier of excerpts with provenance** (source genome, page, date/commit). Writes nothing to any source genome.
+2. **Synthesis** (agent, navigation skill, `read`/`edit` only). Reads **only the dossier** — a single artifact inside the working genome's context — then the skill deposits **one** abstract, non-private raw into the working genome at `raw/articles/crossgen-<topic>-<<YYYY-MM-DD>.md`, and STOPS.
+3. **Target ingest.** The working genome's own standard pipeline processes that raw → PR → human gate. Same gate as any other source.
+
+### When to pull
+
+Pull is initiated deliberately (operator- or context-driven, never on a timer). Produce a crossgen raw ONLY when all three hold:
+
+1. **Ownership elsewhere.** The concept, entity, or pattern is defined and maintained in another genome, and you need it framed for the working domain.
+2. **Structural relevance.** It influences decisions, patterns, or entities here — not a casual mention.
+3. **No fresh local coverage.** `qmd search "<concept>"` in the working genome returns nothing, or only a stub that needs enrichment.
+
+If in doubt, do NOT pull. A missed cross-reference is cheaper than crossgen spam.
+
+### Boundaries (enforced by the master)
+
+- **Sources are restricted to `cross_source: yes` genomes.** A genome flagged `no` (e.g., a client / confidential file) is NEVER read as a source — the collector skips it physically. The wall decides what may flow; it does not rely on the agent's discipline.
+- **Sources are read-only, at HEAD.** No write, commit, branch, or PR in any genome other than the one being worked on.
+- **NEVER `git submodule update --remote`.** Read other genomes via disposable read-only clones — never by moving this master's submodule pointers (that is ASK FIRST).
+- **NEVER read `*/private/*`.** The skill runs `PRIVATE_CONTEXT: disabled` and `private/` is an encrypted blob; even on an unlocked host, private paths are off-limits.
+- Confidential / client genomes are normally isolated from cross-genome pulls entirely (operator policy). Whatever genome a pull runs into, the output raw must be abstract and non-private.
+
+### Output raw (the only artifact written)
+
+**Path (in the working genome):** `raw/articles/crossgen-<topic>-<<YYYY-MM-DD>.md`
+Plain text. No YAML frontmatter (raw is immutable input). **No wikilinks of any kind** — never a `[[../genome-*/...]]` path.
+
+```markdown
+> Cross-genome pull | Into: genome-<working> | Sources: genome-<a> (wiki/concepts/x.md), genome-<b> (wiki/entities/y.md) | HEAD: <short-sha…> | Date: YYYY-MM-DD
+```
+
+# <Topic> (synthesized from other genomes)
+
+## What the source genomes say
+[Abstract, faithful synthesis of the relevant material. Plain text, no private data, no wikilinks.]
+
+## Relevance to this genome
+[Why it matters in the working domain; textual references to existing local entities, if any.]
+
+## Suggested local action
+[Semantic hint for this genome's ingest: e.g., create/update wiki/concepts/<concept>.md, map local relationships.]
+
+---
+
+- Each pull writes a **new, dated** crossgen file — never overwrite or edit an existing raw (raw is immutable). Deduplication happens later, at the **wiki** level: the working genome's normal ingest reconciles against existing pages via its §Conflict procedure.
+
+- The raw is processed by the working genome's standard ingest as an ordinary `raw/articles/` source — no special path.
+
+- The collector and the raw deposit are the **deterministic** side of the skill; the agent only synthesizes content. Agents never create, modify, or delete files in any `raw/` directly.
+
+---
+
+That closes the remaining audit items for `agents-master.md`. The file is now fully pull-oriented and consistent with the dossier.

From 95debf532c84ec8e611e9e428e3d1e50f8f827e5 Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 17:06:35 +0200
Subject: [PATCH 8/9] docs: Improve clarity and correct markdown in agents-master.md

---
 templates/agents-master.md | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/templates/agents-master.md b/templates/agents-master.md
index 127a5ef..3cf2c52 100644
--- a/templates/agents-master.md
+++ b/templates/agents-master.md
@@ -89,7 +89,7 @@ Cross-genome knowledge moves by **pull, never push**: the genome you are working
 Three actors, mirroring the ingest two-phase split:
 
 1. **Collector** (`collect-crossgen.sh`, deterministic, agent-free). Clones each genome flagged `cross_source: yes` **read-only at its remote HEAD** — a disposable checkout, for freshness; never the pinned submodule state. Reads each `wiki/index.md` plus the relevant pages and assembles a **dossier of excerpts with provenance** (source genome, page, date/commit). Writes nothing to any source genome.
-2. **Synthesis** (agent, navigation skill, `read`/`edit` only). Reads **only the dossier** — a single artifact inside the working genome's context — then the skill deposits **one** abstract, non-private raw into the working genome at `raw/articles/crossgen-<topic>-<<YYYY-MM-DD>.md`, and STOPS.
+2. **Synthesis** (agent, navigation skill, `read`/`edit` only). Reads **only the dossier** — a single artifact inside the working genome's context — then the skill deposits **one** abstract, non-private raw into the working genome at `raw/articles/crossgen--.md`, and STOPS.
 3. **Target ingest.** The working genome's own standard pipeline processes that raw → PR → human gate. Same gate as any other source.
 
 ### When to pull
@@ -98,7 +98,7 @@ Pull is initiated deliberately (operator- or context-driven, never on a timer).
 
 1. **Ownership elsewhere.** The concept, entity, or pattern is defined and maintained in another genome, and you need it framed for the working domain.
 2. **Structural relevance.** It influences decisions, patterns, or entities here — not a casual mention.
-3. **No fresh local coverage.** `qmd search "<concept>"` in the working genome returns nothing, or only a stub that needs enrichment.
+3. **No fresh local coverage.** `qmd search ""` in the working genome returns nothing, or only a stub that needs enrichment.
 
 If in doubt, do NOT pull. A missed cross-reference is cheaper than crossgen spam.
 
@@ -112,14 +112,13 @@ If in doubt, do NOT pull. A missed cross-reference is cheaper than crossgen spam
 
 ### Output raw (the only artifact written)
 
-**Path (in the working genome):** `raw/articles/crossgen-<topic>-<<YYYY-MM-DD>.md`
+**Path (in the working genome):** `raw/articles/crossgen--.md`
 Plain text. No YAML frontmatter (raw is immutable input). **No wikilinks of any kind** — never a `[[../genome-*/...]]` path.
 
 ```markdown
-> Cross-genome pull | Into: genome-<working> | Sources: genome-<a> (wiki/concepts/x.md), genome-<b> (wiki/entities/y.md) | HEAD: <short-sha…> | Date: YYYY-MM-DD
-```
+> Cross-genome pull | Into: genome- | Sources: genome- (wiki/concepts/x.md), genome- (wiki/entities/y.md) | HEAD: | Date: YYYY-MM-DD
 
-# <Topic> (synthesized from other genomes)
+# (synthesized from other genomes)
 
 ## What the source genomes say
 [Abstract, faithful synthesis of the relevant material. Plain text, no private data, no wikilinks.]
@@ -128,16 +127,11 @@ Plain text. No YAML frontmatter (raw is immutable input). **No wikilinks of any
 [Why it matters in the working domain; textual references to existing local entities, if any.]
 
 ## Suggested local action
-[Semantic hint for this genome's ingest: e.g., create/update wiki/concepts/<concept>.md, map local relationships.]
+[Semantic hint for this genome's ingest: e.g., create/update wiki/concepts/.md, map local relationships.]
+```
 
----
+**Rules:**
 
 - Each pull writes a **new, dated** crossgen file — never overwrite or edit an existing raw (raw is immutable). Deduplication happens later, at the **wiki** level: the working genome's normal ingest reconciles against existing pages via its §Conflict procedure.
-
 - The raw is processed by the working genome's standard ingest as an ordinary `raw/articles/` source — no special path.
-
 - The collector and the raw deposit are the **deterministic** side of the skill; the agent only synthesizes content. Agents never create, modify, or delete files in any `raw/` directly.
-
----
-
-That closes the remaining audit items for `agents-master.md`. The file is now fully pull-oriented and consistent with the dossier.

From 17c853d519334a78969b57383d5fb3390420e63f Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Tue, 9 Jun 2026 18:19:26 +0200
Subject: [PATCH 9/9] Version update

---
 Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Makefile b/Makefile
index 2ca3ee8..3d2ea4b 100644
--- a/Makefile
+++ b/Makefile
@@ -1,5 +1,5 @@
 # =============================================================================
-# Knowledge Genome - Makefile v. 1.1.3
+# Knowledge Genome - Makefile v. 1.1.4
 # Orchestrates the setup and management of the knowledge base.
 # =============================================================================
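
Review note on PATCH 7/9 (a sketch only, not part of the series): the two deterministic rules in the pull policy, "sources are restricted to `cross_source: yes` genomes" and "each pull writes a new, dated crossgen file, never overwrite", might look roughly like the helpers below. The function names, the `name|flag` registry line format, and the argument order are assumptions made for illustration; the series does not show `collect-crossgen.sh` or the registry layout, so the real code may differ.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not taken from collect-crossgen.sh.
set -euo pipefail

# Print only the genomes flagged cross_source: yes.
# Assumed input on stdin: one "genome-name|yes" or "genome-name|no" per line.
list_cross_sources() {
    local name flag
    while IFS='|' read -r name flag; do
        if [ "$flag" = "yes" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Build the dated raw path for a pull into the working genome.
# Fails instead of reusing a path that already exists (raw is immutable).
crossgen_raw_path() {
    local genome_root="$1" topic="$2" date="$3"
    local path="${genome_root}/raw/articles/crossgen-${topic}-${date}.md"
    if [ -e "$path" ]; then
        printf 'refusing to overwrite existing raw: %s\n' "$path" >&2
        return 1
    fi
    printf '%s\n' "$path"
}
```

Filtering before any clone keeps the "wall" outside the agent, as the Boundaries section describes, and the existence check makes the never-overwrite rule a hard failure rather than a convention.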