From 5491e807e062a8d9211adf25a5fadb98c0c287d8 Mon Sep 17 00:00:00 2001
From: Matteo Cherubini
Date: Fri, 19 Jun 2026 05:47:21 +0200
Subject: [PATCH] docs: Clarify ingest pipeline roles and automation

---
 README.md | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d5d16d3..71715eb 100644
--- a/README.md
+++ b/README.md
@@ -175,6 +175,11 @@ knowledge-genome-orchestrator/ ← This repository (setup tooling)
 > The `skills/ingest/` directory is version-controlled here but **deployed** to the AI
 > node (vm101) under `~/.pi/agent/skills/ingest`. The agent (`pi`) does only semantic work
 > and writes a manifest; `run-ingest.sh` does the mechanical steps. See [Workflows → Ingest](#ingest).
+>
+> ingest-semantic.py: one schema-constrained call to local model, returns JSON. run-ingest.sh: index/log/lint/PR.
+> Semantic JSON extraction → deterministic wiki conform + manifest.
+>
+> cp skills/ingest/\* ~/.pi/agent/skills/ingest/ after make setup. Updated via git pull on laptop, pushed to vm101 via SSH in n8n flow.
 
 ---
 
@@ -1062,7 +1067,7 @@ grep "^## \[" wiki/log.md | grep "CONFLICT" # All conflicts
 grep "^## \[2026-05" wiki/log.md # Entries from a specific month
 ```
 
-The orchestrator always injects only `tail -n 20 wiki/log.md` into agent context.
+ingest-semantic.py receives source text + existing entity/concept names (from index) as prompt context. The LLM never loads the full log.
 
 ---
 
@@ -1122,6 +1127,8 @@ Note: `.obsidian/` is in `.gitignore`. Workspace and plugin settings are local
 
 ### n8n automation
 
+n8n → SSH → ingest-semantic.py → run-ingest.sh .
+
 n8n (running on the storage node) can automate the ingest pipeline:
 
 1. Forgejo webhook fires on push to a genome's `raw/` directory
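
Review note: the patch says `ingest-semantic.py` receives the source text plus existing entity/concept names from the index and returns JSON, without ever loading the full log. A minimal sketch of that contract might look like the following. The patch does not show the script, so everything here is an assumption for illustration: the function names `build_prompt` and `parse_semantic_reply`, the key names in `REQUIRED_KEYS`, and the prompt wording are all hypothetical, and the call to the local model is left out entirely.

```python
import json

# Hypothetical key names; the real schema used by ingest-semantic.py is not shown in the patch.
REQUIRED_KEYS = ("entities", "concepts", "summary")


def build_prompt(source_text, known_entities, known_concepts):
    """Assemble prompt context from the source text and the names already in the index.

    Note that no log content is passed in, matching the statement that the
    LLM never loads the full log.
    """
    return "\n".join([
        "Extract entities and concepts from the source below.",
        "Reuse these existing names where they apply.",
        "Known entities: " + ", ".join(sorted(known_entities)),
        "Known concepts: " + ", ".join(sorted(known_concepts)),
        "Return a single JSON object with keys: " + ", ".join(REQUIRED_KEYS),
        "--- SOURCE ---",
        source_text,
    ])


def parse_semantic_reply(reply):
    """Parse the model reply and reject anything that misses a required key."""
    data = json.loads(reply)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError("reply is missing keys: " + ", ".join(missing))
    return data


# Example with a hand-written reply standing in for the model output.
prompt = build_prompt("Forgejo hosts the genome repos.", {"Forgejo"}, {"ingest"})
reply = '{"entities": ["Forgejo"], "concepts": ["ingest"], "summary": "Forgejo hosts repos."}'
semantic = parse_semantic_reply(reply)
```

Validating the reply before handing it on keeps the mechanical half (`run-ingest.sh`) deterministic: it would only ever see JSON that already has the expected shape.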