Compare commits

2 commits

3 changed files with 197 additions and 71 deletions

@ -1,22 +1,25 @@
# ============================================================================= # =============================================================================
# Knowledge Genome - Makefile v. 1.0.0 # Knowledge Genome - Makefile v. 1.1.0
# Orchestrates the setup and management of the knowledge base. # Orchestrates the setup and management of the knowledge base.
# ============================================================================= # =============================================================================
include globals.env include globals.env
export $(shell grep -v '^[#[:space:]]' globals.env | sed 's/=.*//') export $(shell grep -v '^[#[:space:]]' globals.env | sed 's/=.*//')
.PHONY: setup add-genome status lint lock doctor sync help .PHONY: setup add-genome status lint lock doctor sync test verify-structure sync-structure help
help: help:
@echo "Available commands:" @echo "Available commands:"
@echo " make setup - Full system initialization" @echo " make setup - Full system initialization"
@echo " make add-genome - Register and scaffold a new genome" @echo " make add-genome - Register and scaffold a new genome [LINKED=owner/repo]"
@echo " make status - Check submodule and encryption status" @echo " make status - Check submodule and encryption status"
@echo " make lint - Verify schema, privacy flags, and metadata" @echo " make lint - Verify schema, privacy flags, and metadata"
@echo " make lock - Lock all encrypted files across all genomes" @echo " make verify-structure - Report directory drift across all genomes"
@echo " make doctor - Verify all required tools are installed" @echo " make sync-structure - Create any missing canonical dirs (safe)"
@echo " make sync - Sync submodules and report unpushed commits" @echo " make test - Run the bats test suite (no LLM/GPU needed)"
@echo " make lock - Lock all encrypted files across all genomes"
@echo " make doctor - Verify all required tools are installed"
@echo " make sync - Sync submodules and report unpushed commits"
lint: lint:
@bash scripts/lint-genomes.sh @bash scripts/lint-genomes.sh
@ -27,16 +30,26 @@ setup:
add-genome: add-genome:
@if [ -z "$(NAME)" ] || [ -z "$(DESC)" ]; then \ @if [ -z "$(NAME)" ] || [ -z "$(DESC)" ]; then \
echo "Error: NAME and DESC are required."; \ echo "Error: NAME and DESC are required."; \
echo "Usage: make add-genome NAME=my-genome DESC='My description'"; \ echo "Usage: make add-genome NAME=my-genome DESC='My description' [LINKED=owner/project-repo]"; \
exit 1; \ exit 1; \
fi fi
@bash scripts/add-genome.sh "$(NAME)" "$(DESC)" @bash scripts/add-genome.sh "$(NAME)" "$(DESC)" "$(LINKED)"
status: status:
@echo "--- Master Status ---" @echo "--- Master Status ---"
@git submodule status @git submodule status
@echo "--- Encryption Status (First 10 files) ---" @echo "--- Encryption Status (per genome) ---"
@git-crypt status | head -n 10 @git submodule foreach 'git-crypt status 2>/dev/null | head -n 10 || true'
verify-structure:
@bash scripts/verify-genomes.sh
sync-structure:
@bash scripts/verify-genomes.sh --sync
test:
@command -v bats >/dev/null 2>&1 || { echo " MISSING: bats (sudo apt install bats)"; exit 1; }
@bats tests/
doctor: doctor:
@echo "Checking required tools..." @echo "Checking required tools..."
@ -45,6 +58,7 @@ doctor:
@command -v curl >/dev/null 2>&1 || { echo " MISSING: curl"; exit 1; } @command -v curl >/dev/null 2>&1 || { echo " MISSING: curl"; exit 1; }
@command -v jq >/dev/null 2>&1 || { echo " MISSING: jq"; exit 1; } @command -v jq >/dev/null 2>&1 || { echo " MISSING: jq"; exit 1; }
@command -v bw >/dev/null 2>&1 || echo " OPTIONAL: bw (Bitwarden CLI) not found — key injection will be manual." @command -v bw >/dev/null 2>&1 || echo " OPTIONAL: bw (Bitwarden CLI) not found — key injection will be manual."
@command -v python3 >/dev/null 2>&1 || echo " OPTIONAL: python3 not found — needed for 'make test' and the ingest skill (index-append.py), not for setup."
@echo "System ready." @echo "System ready."
sync: sync:

228
README.md
@ -19,16 +19,17 @@ and a human-in-the-loop Git Flow for quality control.
5. [Configuration](#configuration) 5. [Configuration](#configuration)
6. [Quick Start](#quick-start) 6. [Quick Start](#quick-start)
7. [Makefile Reference](#makefile-reference) 7. [Makefile Reference](#makefile-reference)
8. [Genome Lifecycle](#genome-lifecycle) 8. [Testing](#testing)
9. [Security Model](#security-model) 9. [Genome Lifecycle](#genome-lifecycle)
10. [Key Management](#key-management) 10. [Security Model](#security-model)
11. [Agent Sessions](#agent-sessions) 11. [Key Management](#key-management)
12. [Workflows](#workflows) 12. [Agent Sessions](#agent-sessions)
13. [Knowledge Quality](#knowledge-quality) 13. [Workflows](#workflows)
14. [Knowledge Schema](#knowledge-schema) 14. [Knowledge Quality](#knowledge-quality)
15. [Collaboration Model](#collaboration-model) 15. [Knowledge Schema](#knowledge-schema)
16. [Optional Extensions](#optional-extensions) 16. [Collaboration Model](#collaboration-model)
17. [Troubleshooting](#troubleshooting) 17. [Optional Extensions](#optional-extensions)
18. [Troubleshooting](#troubleshooting)
--- ---
@ -110,10 +111,18 @@ genome-{name}/
| Wiki | `wiki/` | LLM | Agent creates, updates, cross-links, maintains. | | Wiki | `wiki/` | LLM | Agent creates, updates, cross-links, maintains. |
| Schema | `AGENTS.md` | Human + LLM | Co-evolved contract defining structure and workflows. | | Schema | `AGENTS.md` | Human + LLM | Co-evolved contract defining structure and workflows. |
### Linked projects (optional)
A genome can optionally declare a **linked project repository** — a separate repo where
the knowledge in that genome is meant to be applied (e.g. `genome-dev` linked to an app
repo). The link is recorded as a third field in the registry and rendered into the
genome's `AGENTS.md` (`## Linked Project`). A genome with no link is _knowledge-only_ and
behaves exactly as before. See [Configuration](#configuration).
### Framework structure ### Framework structure
```text ```text
knowledge-genome-setup/ ← This repository (setup tooling) knowledge-genome-orchestrator/ ← This repository (setup tooling)
├── globals.env ← Static KEY=VALUE config (Make-includable) ├── globals.env ← Static KEY=VALUE config (Make-includable)
├── registry.sh ← Bash-only: GENOMES array + dynamic paths ├── registry.sh ← Bash-only: GENOMES array + dynamic paths
├── Makefile ← Entry point for all operations ├── Makefile ← Entry point for all operations
@ -121,6 +130,7 @@ knowledge-genome-setup/ ← This repository (setup tooling)
│ ├── output.sh ← Terminal helpers (colors, log levels) │ ├── output.sh ← Terminal helpers (colors, log levels)
│ ├── deps.sh ← Dependency validation │ ├── deps.sh ← Dependency validation
│ ├── scaffold.sh ← Template rendering engine │ ├── scaffold.sh ← Template rendering engine
│ ├── structure.sh ← Canonical genome layout (single source of truth)
│ ├── lint.sh ← Per-file validation functions │ ├── lint.sh ← Per-file validation functions
│ └── git-crypt.sh ← git-crypt lifecycle (init, export, verify, rotate) │ └── git-crypt.sh ← git-crypt lifecycle (init, export, verify, rotate)
├── providers/ ├── providers/
@ -131,18 +141,41 @@ knowledge-genome-setup/ ← This repository (setup tooling)
│ ├── setup-master.sh ← Master repo initialisation │ ├── setup-master.sh ← Master repo initialisation
│ ├── setup-genomes.sh ← Genome provisioning loop │ ├── setup-genomes.sh ← Genome provisioning loop
│ ├── add-genome.sh ← Add a single new genome │ ├── add-genome.sh ← Add a single new genome
│ └── lint-genomes.sh ← Quality control across all genomes │ ├── lint-genomes.sh ← Quality control across all genomes
└── templates/ │ └── verify-genomes.sh ← Structure verify / --sync across all genomes
├── agents-genome.md ← Per-genome agent contract template ├── templates/
├── agents-master.md ← Master coordination schema template │ ├── agents-genome.md ← Per-genome agent contract template
├── wiki-index.md ← Index template (rendered per genome) │ ├── agents-master.md ← Master coordination schema template
├── wiki-log.md ← Log template (rendered per genome) │ ├── readme-master.md ← Master repo README template
├── pr-description.md ← PR review checklist template │ ├── wiki-index.md ← Index template (rendered per genome)
├── pre-commit.sh ← Security hook template │ ├── wiki-log.md ← Log template (rendered per genome)
├── gitattributes ← Git encryption rules template │ ├── pr-description.md ← PR review checklist template
└── gitignore ← Git ignore template │ ├── pre-commit.sh ← Security hook template
│ ├── gitattributes ← Git encryption rules template
│ └── gitignore ← Git ignore template
├── skills/
│ └── ingest/ ← pi skill: deployed to the AI node (vm101)
│ ├── SKILL.md ← Semantic-only contract (read/edit, emits manifest)
│ ├── references/ ← On-demand reference docs for the agent
│ └── scripts/ ← Deterministic post-processor (runs outside the agent)
│ ├── run-ingest.sh ← Orchestrator: consumes the manifest, emits one JSON line
│ ├── slug.sh ← Slug normalisation
│ ├── index-append.py ← Sorted insert into wiki/index.md + last_updated bump
│ ├── log-append.sh ← Append a wiki/log.md entry
│ ├── scoped-lint.sh ← Lint only the pages touched this run (reuses lib/lint.sh)
│ └── open-pr.sh ← Branch / commit / push / open PR (DRY_RUN seam for tests)
└── tests/ ← bats suite — deterministic, no LLM/GPU (see Testing)
├── helpers.bash
├── scripts.bats
├── lint.bats
├── structure.bats
└── run-ingest.bats
``` ```
> The `skills/ingest/` directory is version-controlled here but **deployed** to the AI
> node (vm101) under `~/.pi/agent/skills/ingest`. The agent (`pi`) does only semantic work
> and writes a manifest; `run-ingest.sh` does the mechanical steps. See [Workflows → Ingest](#ingest).
--- ---
## System Requirements ## System Requirements
@ -156,7 +189,9 @@ All tools (git-crypt, bw, qmd) have native Linux binaries.
All scripts are compatible with macOS. Requirements: All scripts are compatible with macOS. Requirements:
- bash 3.2+ (macOS default) — fully supported. All `bash 4+` constructs removed. - bash 3.2+ (macOS default) — supported for the **setup scripts** (`make` targets, scaffolding).
The `ingest` skill uses bash 4+ constructs (`mapfile`), but it is deployed and run on the
Linux AI node, not on the macOS setup machine — so this is not a constraint in practice.
- GNU coreutils not required — BSD variants of `date`, `grep`, `sed` all handled. - GNU coreutils not required — BSD variants of `date`, `grep`, `sed` all handled.
- `git-crypt`: install via Homebrew — `brew install git-crypt` - `git-crypt`: install via Homebrew — `brew install git-crypt`
- `jq`, `curl`: pre-installed or via Homebrew - `jq`, `curl`: pre-installed or via Homebrew
@ -195,6 +230,11 @@ The system is designed for a homelab architecture:
> the index, and the log tail is a cost. This is why all agent files are token-optimised > the index, and the log tail is a cost. This is why all agent files are token-optimised
> and sessions are kept to one source at a time. > and sessions are kept to one source at a time.
> **Reference deployment:** the table above is a target profile, not a hard requirement.
> The current setup runs a single 16GB GPU (RTX 5060 Ti) with a ~9B model for interactive
> ingest, and offloads heavy/async synthesis to a cloud model. Smaller models work — they
> just make the "one source per session" discipline and the token budget matter more.
--- ---
## Prerequisites ## Prerequisites
@ -285,14 +325,17 @@ resolution. Never included by Make.
```bash ```bash
# Dynamic paths (resolved at source time) # Dynamic paths (resolved at source time)
WORK_DIR="${HOME}/knowledge-genome-setup" WORK_DIR="${HOME}/knowledge-genome-orchestrator"
KEYS_DIR="${WORK_DIR}/keys" KEYS_DIR="${WORK_DIR}/keys"
# Genome registry — format: "name|description" # Genome registry — format: "name|description|linked_repo"
# The third field is OPTIONAL:
# - leave it empty → knowledge-only genome (no linked project)
# - owner/repo → genome is linked to that project repository (rendered into AGENTS.md)
GENOMES=( GENOMES=(
"genome-dev|Web development, TUI, Angular, software architecture" "genome-dev|Web development, TUI, Angular, software architecture|myorg/my-app"
"genome-finance|Personal finance, investments, market analysis" "genome-finance|Personal finance, investments, market analysis|"
"genome-homelab|Infrastructure, network configs, architecture logs" "genome-homelab|Infrastructure, network configs, architecture logs|"
) )
``` ```
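The three-field entry format can be split with a plain `read`. This is a minimal sketch, not the code in `registry.sh` or `scripts/add-genome.sh` (neither is shown in this diff). The function name `parse_genome` and the tab-separated output are assumptions made for illustration. An empty or absent third field falls back to `none`, matching the `{{LINKED_PROJECT}}` placeholder example.

```bash
# Hypothetical helper: split one "name|description|linked_repo" registry entry.
parse_genome() {
  local name desc linked
  # IFS applies to this read only; a missing or empty third field leaves $linked empty.
  IFS='|' read -r name desc linked <<EOF
$1
EOF
  printf '%s\t%s\t%s\n' "$name" "$desc" "${linked:-none}"
}

# In registry.sh the entries would come from the GENOMES array.
for entry in \
  "genome-dev|Web development, TUI, Angular, software architecture|myorg/my-app" \
  "genome-finance|Personal finance, investments, market analysis|"
do
  parse_genome "$entry"
done
```

Because the fallback is applied at parse time, a two-field entry without the trailing `|` would parse the same way as a knowledge-only genome.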
@ -315,8 +358,8 @@ export GITHUB_TOKEN="your_github_token"
```bash ```bash
# 1. Clone the setup framework # 1. Clone the setup framework
git clone <setup-repo-url> knowledge-genome-setup git clone <setup-repo-url> knowledge-genome-orchestrator
cd knowledge-genome-setup cd knowledge-genome-orchestrator
# 2. Configure your environment # 2. Configure your environment
cp globals.env.example globals.env # edit with your values cp globals.env.example globals.env # edit with your values
@ -358,16 +401,19 @@ After setup completes:
## Makefile Reference ## Makefile Reference
| Target | Description | | Target | Description |
| --------------------------------- | ------------------------------------------------------------------------------ | | ----------------------------------------------------- | ------------------------------------------------------------------------------------- |
| `make setup` | Full system initialisation — master repo + all genomes in `registry.sh` | | `make setup` | Full system initialisation — master repo + all genomes in `registry.sh` |
| `make add-genome NAME=x DESC="y"` | Scaffold and register a single new genome | | `make add-genome NAME=x DESC="y" [LINKED=owner/repo]` | Scaffold and register a single new genome (optional linked project) |
| `make lint` | Run quality checks across all genomes (schema, privacy, decay, page size) | | `make lint` | Run quality checks across all genomes (schema, privacy, decay, page size) |
| `make status` | Show submodule status and first 10 git-crypt encryption states | | `make verify-structure` | Report directory drift of each genome vs the canonical layout (`lib/structure.sh`) |
| `make lock` | Lock all encrypted repos (master + all genome submodules) | | `make sync-structure` | Create any missing canonical directories across all genomes (safe, idempotent) |
| `make doctor` | Verify required tools: git, git-crypt, curl, jq; warn if bw missing | | `make test` | Run the bats test suite (deterministic; no LLM/GPU/network) — see [Testing](#testing) |
| `make sync` | `git submodule update --init --recursive` + report unpushed commits per genome | | `make status` | Show submodule status and per-genome git-crypt encryption state |
| `make help` | Print all available targets | | `make lock` | Lock all encrypted repos (master + all genome submodules) |
| `make doctor` | Verify required tools: git, git-crypt, curl, jq; warn if bw missing |
| `make sync` | `git submodule update --init --recursive` + report unpushed commits per genome |
| `make help` | Print all available targets |
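
To illustrate what `verify-structure` and `sync-structure` are described as doing, here is a minimal sketch of a drift check. It is not `scripts/verify-genomes.sh`, which this diff does not show. The directory list and the function name `check_structure` are assumptions; the real canonical layout lives in `lib/structure.sh`.

```bash
# Hypothetical canonical layout, for illustration only.
CANONICAL_DIRS="raw wiki/sources wiki/entities wiki/concepts wiki/queries"

# check_structure <genome-dir> [--sync]
# Report mode: prints each missing directory and returns 1 if any is missing.
# Sync mode:   creates missing directories and returns 0; existing ones are untouched.
check_structure() {
  local genome="$1" mode="${2:-}" missing=0 dir
  for dir in $CANONICAL_DIRS; do
    if [ ! -d "$genome/$dir" ]; then
      if [ "$mode" = "--sync" ]; then
        mkdir -p "$genome/$dir"
        echo "created: $dir"
      else
        echo "missing: $dir"
        missing=1
      fi
    fi
  done
  return "$missing"
}
```

Sync mode only ever adds directories, which is what makes it safe to run repeatedly: a second run finds nothing missing and does nothing.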
### Examples ### Examples
@ -378,6 +424,12 @@ make doctor
# Add a new genome after initial setup # Add a new genome after initial setup
make add-genome NAME=genome-research DESC="Academic papers and deep research" make add-genome NAME=genome-research DESC="Academic papers and deep research"
# Add a genome linked to a project repository
make add-genome NAME=genome-dev DESC="Web development" LINKED=myorg/my-app
# Check every genome against the canonical directory layout
make verify-structure
# Run full lint pass (bash deterministic checks) # Run full lint pass (bash deterministic checks)
make lint make lint
@ -390,6 +442,38 @@ make lock
--- ---
## Testing
The mechanical layer (slug, index, log, lint, structure, the ingest orchestrator) is
covered by a [bats](https://github.com/bats-core/bats-core) suite. The tests are
**deterministic and have zero dependency on the LLM, the GPU, or the network** — they
simulate the agent's output with fixtures and exercise the scripts directly, so they run
anywhere git + bash live (laptop, CI, a git hook). They are **not** meant to run on the AI
node or via n8n.
```bash
sudo apt install bats # once
make test # or: bats tests/
```
| File | Covers |
| ----------------- | ------------------------------------------------------------------------------ |
| `scripts.bats` | `slug.sh`, `log-append.sh`, `index-append.py` (insert, sort, bump, idempotent) |
| `lint.bats` | `lib/lint.sh` validators + `scoped-lint.sh` |
| `structure.bats` | `lib/structure.sh` report / sync |
| `run-ingest.bats` | `run-ingest.sh` end-to-end (DRY_RUN, local bare remote) — needs `jq` |
Each test builds its own throwaway genome with a local bare remote, configured to ignore
the operator's global git settings (signing, global hooks) so the suite is hermetic. The
`run-ingest` tests auto-`skip` if `jq` is absent. If you change the canonical layout in
`lib/structure.sh`, update `FIXTURE_DIRS` in `tests/helpers.bash` to match.
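
One way to get that kind of isolation is to point git at an empty home directory before creating the repositories. The sketch below is an assumption about how such a fixture could be built, not the contents of `tests/helpers.bash` (not shown in this diff); the function name `make_fixture` and the directory names are hypothetical.

```bash
# Hypothetical fixture builder: a throwaway genome plus a local bare remote.
make_fixture() {
  local root="$1"
  # Use an empty home so the operator's global config (signing, hooks path) is never read.
  export HOME="$root/home"
  export XDG_CONFIG_HOME="$root/home/.config"
  export GIT_CONFIG_NOSYSTEM=1
  mkdir -p "$HOME"

  git init --quiet --bare "$root/remote.git"
  git init --quiet "$root/genome"
  git -C "$root/genome" config user.name "Test"
  git -C "$root/genome" config user.email "test@example.invalid"
  git -C "$root/genome" config commit.gpgsign false
  git -C "$root/genome" remote add origin "$root/remote.git"
}

# Usage (assumes git is installed):
#   tmp="$(mktemp -d)"; make_fixture "$tmp"
```

Pushing to a bare repository on the local filesystem exercises the same git code paths as a real remote without touching the network.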
> Why this matters: the only non-deterministic part of the system is the model. Pinning
> the mechanical layer with tests means that when an ingest misbehaves, you know it's the
> model or the prompt — not the plumbing.
---
## Genome Lifecycle ## Genome Lifecycle
### Initial setup ### Initial setup
@ -431,6 +515,7 @@ template files:
| `{{GENOME_NAME}}` | registry.sh | `genome-dev` | | `{{GENOME_NAME}}` | registry.sh | `genome-dev` |
| `{{GENOME_NAME_UPPER}}` | derived | `GENOME-DEV` | | `{{GENOME_NAME_UPPER}}` | derived | `GENOME-DEV` |
| `{{GENOME_DESC}}` | registry.sh | `Web development...` | | `{{GENOME_DESC}}` | registry.sh | `Web development...` |
| `{{LINKED_PROJECT}}` | registry.sh | `myorg/my-app` (or `none`) |
| `{{FORGEJO_URL}}` | globals.env | `https://git.yourserver.com` | | `{{FORGEJO_URL}}` | globals.env | `https://git.yourserver.com` |
| `{{FORGEJO_USER}}` | globals.env | `yourusername` | | `{{FORGEJO_USER}}` | globals.env | `yourusername` |
| `{{VAULTWARDEN_URL}}` | globals.env | `https://vault.yourserver.com` | | `{{VAULTWARDEN_URL}}` | globals.env | `https://vault.yourserver.com` |
@ -593,9 +678,9 @@ git clone https://git.yourserver.com/yourusername/genome-dev.git
If a key is lost or compromised: If a key is lost or compromised:
```bash ```bash
# From the knowledge-genome-setup/ directory # From the knowledge-genome-orchestrator/ directory
source lib/git-crypt.sh source lib/git-crypt.sh
cd ~/knowledge-genome-setup/genome-dev cd ~/knowledge-genome-orchestrator/genome-dev
gcrypt_rotate_key "genome-dev" gcrypt_rotate_key "genome-dev"
``` ```
@ -643,7 +728,8 @@ The agent executes in this order at the start of every session:
1. Read `wiki/index.md` — primary catalog of all pages and maturity 1. Read `wiki/index.md` — primary catalog of all pages and maturity
2. Read last 20 log entries (injected by orchestrator — does NOT open `wiki/log.md` directly) 2. Read last 20 log entries (injected by orchestrator — does NOT open `wiki/log.md` directly)
3. For tasks involving related pages: `qmd search "<query>"` before opening any files 3. For tasks involving related pages: if the optional `qmd` extension is installed,
`qmd search "<query>"` before opening files; otherwise navigate from `wiki/index.md`
4. Operate on individual files — never scan entire directories 4. Operate on individual files — never scan entire directories
### One source per session ### One source per session
@ -668,7 +754,7 @@ For Forgejo webhook → automated ingest:
2. n8n receives webhook, identifies new files 2. n8n receives webhook, identifies new files
3. n8n starts one agent session per new file (sequential, not parallel) 3. n8n starts one agent session per new file (sequential, not parallel)
4. Each session: inject `tail -n 20 wiki/log.md` + `PRIVATE_CONTEXT` state + source path 4. Each session: inject `tail -n 20 wiki/log.md` + `PRIVATE_CONTEXT` state + source path
5. Agent ingest workflow runs, opens PR 5. Phase 1 agent (`/skill:ingest`) writes the manifest; Phase 2 `run-ingest.sh` opens the PR
6. Human reviews and merges PR 6. Human reviews and merges PR
--- ---
@ -677,17 +763,39 @@ For Forgejo webhook → automated ingest:
### Ingest ### Ingest
Triggered by a new file in `raw/` (manual or via webhook). Triggered by a new file in `raw/` (manual or via webhook). Ingest is split into two
phases so that the small local model spends its limited context only on judgement, and
all the deterministic bookkeeping happens outside the model's loop.
1. Read source once **Phase 1 — agent (semantic only).** The `ingest` skill gives the agent read/edit tools
2. Create `wiki/sources/<slug>.md` — summary and key points only (no shell). It:
3. Per entity (person, tool, organisation): create or update `wiki/entities/<name>.md`
4. Per concept (pattern, theory, decision): create or update `wiki/concepts/<name>.md` 1. Reads the source once
5. Check each touched page for contradictions → apply Conflict Resolution if found 2. Creates `wiki/sources/<slug>.md` — summary and key points
6. Append entry to `wiki/index.md` (bottom of relevant section — do not reorder) 3. Per entity (person, tool, organisation): creates or updates `wiki/entities/<name>.md`
7. Append log entry: `INGEST | <slug>` 4. Per concept (pattern, theory, decision): creates or updates `wiki/concepts/<name>.md`
8. Run scoped lint on pages created or modified in this session; report in PR 5. Checks each touched page for contradictions → applies Conflict Resolution if found
9. Commit on `feat/ai-ingest-<slug>`; open PR using `templates/pr-description.md` 6. Writes `.ingest-manifest.json` (the list of pages it created/modified, the model name,
a one-line reasoning, the PR summary, and any contradictions) — then **stops**
**Phase 2 — `run-ingest.sh` (deterministic, outside the agent).** The post-processor
consumes the manifest and does the mechanical work the model must not waste context on:
7. Inserts each page into the correct `wiki/index.md` section **in alphabetical order**
(`index-append.py`) and bumps the index `last_updated`
8. Appends the `INGEST | <slug>` entry to `wiki/log.md`
9. Runs scoped lint on exactly the pages touched this run (`scoped-lint.sh`, reusing
`lib/lint.sh`)
10. Commits on `feat/ai-ingest-<slug>` and opens the PR using `templates/pr-description.md`
11. Emits a single compact JSON line (status, slug, PR url, lint_clean, conflict) for n8n
The agent never runs git, never edits the index/log mechanically, and never lints — those
are deterministic and tested (see [Testing](#testing)). Invocation on the AI node:
```bash
pi --mode json -p "/skill:ingest raw/articles/<file>.md" # phase 1 → writes manifest
run-ingest.sh <genome> # phase 2 → index/log/lint/PR
```
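
Step 10 commits on `feat/ai-ingest-<slug>`, but the slug rule itself lives in `slug.sh`, which this diff does not show. The following is a minimal sketch of how a branch name could be derived from the source path, assuming the slug is the file name without its extension, lowercased, with runs of other characters collapsed to a hyphen. The function name `slugify` and the example path are hypothetical.

```bash
# Hypothetical slug rule, for illustration only.
slugify() {
  local base
  base="${1##*/}"      # strip the directory part
  base="${base%.*}"    # drop the file extension
  printf '%s\n' "$base" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}

source_path="raw/articles/My Article_v2.md"   # example input, not from the source
branch="feat/ai-ingest-$(slugify "$source_path")"
echo "$branch"
```

Keeping this step in a script rather than in the prompt means the same source path always maps to the same branch name, regardless of which model ran Phase 1.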
For private sources (`PRIVATE_CONTEXT: enabled` required): For private sources (`PRIVATE_CONTEXT: enabled` required):
@ -698,7 +806,8 @@ For private sources (`PRIVATE_CONTEXT: enabled` required):
Triggered by an operator question. Triggered by an operator question.
1. `qmd search "<query>"` → identify candidate pages 1. `qmd search "<query>"` (if the optional qmd extension is installed) → identify
candidate pages; otherwise start from `wiki/index.md`
2. Read candidate pages directly (qmd already returns file paths — no intermediate index lookup) 2. Read candidate pages directly (qmd already returns file paths — no intermediate index lookup)
3. Synthesise answer with `[[wikilink]]` citations 3. Synthesise answer with `[[wikilink]]` citations
4. If answer is non-trivial: save as `wiki/queries/<slug>.md` and append to index 4. If answer is non-trivial: save as `wiki/queries/<slug>.md` and append to index
@ -974,7 +1083,8 @@ n8n (running on the storage node) can automate the ingest pipeline:
2. n8n flow identifies new files 2. n8n flow identifies new files
3. For each new file: starts one agent session (sequential — never parallel) 3. For each new file: starts one agent session (sequential — never parallel)
4. Each session receives: `tail -n 20 wiki/log.md` + `PRIVATE_CONTEXT` state + source path 4. Each session receives: `tail -n 20 wiki/log.md` + `PRIVATE_CONTEXT` state + source path
5. Agent runs ingest workflow and opens PR 5. Phase 1 — agent runs `/skill:ingest` (semantic → writes manifest); Phase 2 —
`run-ingest.sh` does index/log/lint and opens the PR, returning one JSON line to n8n
6. Human reviews the PR 6. Human reviews the PR
Key constraint: one source per session, sessions sequential. Key constraint: one source per session, sessions sequential.
@ -984,11 +1094,13 @@ Never batch multiple sources into one agent session.
If the AI compute node has an Intel NPU (e.g. Core Ultra series): If the AI compute node has an Intel NPU (e.g. Core Ultra series):
- Background tasks (embedding updates, index refresh) → Intel NPU via OpenVINO - Background/auxiliary tasks (OCR of `raw/assets/`, async summarisation, or qmd
re-indexing **if** the optional qmd extension is in use) → Intel NPU via OpenVINO
- Active reasoning sessions (ingest, query, synthesis) → GPU - Active reasoning sessions (ingest, query, synthesis) → GPU
This keeps the GPU's KV cache free for interactive work and reduces power consumption Note: the core system has no embedding pipeline (see [Core Philosophy](#core-philosophy)),
for background operations. so there is nothing to embed here — the NPU is only for auxiliary work. This keeps the
GPU's KV cache free for interactive sessions and lowers power draw for background jobs.
--- ---