Merge branch 'feat/vm101-ingest-wrapper'
commit 9eeb340de4
3 changed files with 139 additions and 0 deletions
60
deploy/vm101/README.md
Normal file
@@ -0,0 +1,60 @@
# deploy/vm101

System artifacts deployed to **vm101** (the GPU ingest node). The repo is the
source of truth; the deployed copies live in `/usr/local/bin/`. Edit here, then
run `sudo ./install.sh` on vm101 to push changes.

## Contents

- `n8n-pi-wrap`: forced-command wrapper that fronts every n8n→vm101 SSH call.
- `install.sh`: installs the wrapper(s) into `/usr/local/bin` (idempotent).

## n8n-pi-wrap

The only entry point for the `n8n-runner` identity onto vm101. n8n never gets a
shell here: whatever it sends arrives as `SSH_ORIGINAL_COMMAND`, and a `case`
whitelist decides what runs. Anything outside the whitelist is denied and logged.
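
For orientation, a forced command of this kind is commonly wired up through an `authorized_keys` entry for the `n8n-runner` key. The sketch below only prints such an entry. It assumes OpenSSH's `command=` and `restrict` key options are the mechanism in use (this commit does not show the actual entry), the helper name `forced_command_entry` is illustrative, and the public key is a placeholder.

```bash
#!/bin/bash
# Sketch only: print an authorized_keys line that pins a key to the wrapper.
# Assumption: the forced command is set via authorized_keys, not sshd_config.
set -eu

# forced_command_entry <wrapper_path> <public_key>
# `command=` makes sshd run the wrapper instead of whatever the client requests;
# `restrict` turns off port/agent/X11 forwarding and PTY allocation for this key.
forced_command_entry() {
  local wrapper="$1" pubkey="$2"
  printf 'command="%s",restrict %s\n' "$wrapper" "$pubkey"
}

# Placeholder key material, not a real key.
forced_command_entry /usr/local/bin/n8n-pi-wrap "ssh-ed25519 AAAAC3...placeholder n8n-runner"
```

With an entry like this, sshd never executes the client's requested command directly; it hands that string to the wrapper as `SSH_ORIGINAL_COMMAND`.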

Allowed commands:

| Command | What it does |
|---|---|
| `pi run` | one-shot prompt via stdin (proof-of-life / health) |
| `pi ingest <genome> <raw_path>` | the real two-phase ingest (below) |
| `ollama list` / `ollama ps` | model introspection |

### The two-phase ingest

`pi ingest` performs a clean start, runs two phases, then stops:

1. **Clean start**: `git fetch origin && git switch <INGEST_BASE> && git reset --hard origin/<INGEST_BASE>`.
   Destroys only vm101's *scratch* checkout (never a shared branch, never a
   force-push). This determinism is by design.
2. **Semantic**: `skills/ingest/scripts/ingest-semantic.py <genome> <raw_path>`
   drives `pi` to WRITE `wiki/*` pages + `.ingest-manifest.json`.
   NOTE: this is the script, NOT `pi -p "/skill:ingest ..."` (that form makes the
   model reply in chat and write nothing, the classic "manifest not found" trap).
3. **Mechanical**: `skills/ingest/scripts/run-ingest.sh <genome>` validates the
   manifest, then runs index/log/scoped-lint/commit on `feat/ai-ingest-<slug>` and opens
   a PR onto `<INGEST_BASE>`. Emits one JSON line `{status,slug,pr_url,...}`.

The PR then waits for the human gate. Each session ingests one raw file, and sessions run sequentially.
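
The control flow of the steps above can be sketched as follows. This is a simplified illustration, not the wrapper itself: `clean_start`, `semantic`, and `mechanical` are hypothetical stand-ins for the git commands and the two scripts, and the JSON payloads are trimmed-down examples rather than the real output.

```bash
#!/bin/bash
# Sketch of the control flow only. The three functions below are hypothetical
# stand-ins: the real steps are git and the scripts under skills/ingest/scripts/.
set -eu

clean_start() { :; }   # stand-in for fetch / switch / reset --hard
semantic()    { :; }   # stand-in for ingest-semantic.py
mechanical()  { echo '{"status":"ok","slug":"example-slug"}'; }   # stand-in for run-ingest.sh

# run_ingest <genome> <raw_path>: run the steps in order, stop at the first
# failure, and leave exactly one JSON line on stdout either way.
run_ingest() {
  local genome="$1" raw_path="$2"
  if ! clean_start; then
    echo '{"status":"error","stage":"clean-start"}'; return 1
  fi
  # Semantic output is noisy, so it is kept off stdout.
  if ! semantic "$genome" "$raw_path" >/dev/null 2>&1; then
    echo '{"status":"error","stage":"semantic"}'; return 1
  fi
  mechanical "$genome"
}

run_ingest example-genome raw/articles/foo.md
```

Keeping stdout to a single JSON line means the caller can parse the result without filtering model chatter out of the stream.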

### Input hardening

Both inputs come from `SSH_ORIGINAL_COMMAND`, which the caller controls, so both are validated:

- `genome`: lowercase kebab-case, matching `^[a-z0-9-]+$`.
- `raw_path`: must be under `raw/`, no `..` traversal, restricted charset
  `[A-Za-z0-9._/-]`, and the file must exist. Rejected paths return a JSON error.
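
The rules above can be expressed as two small predicates using only `case` globs. This is a minimal sketch: the function names `valid_genome` and `valid_raw_path` are illustrative, and the file-existence check is left out because it depends on the genome checkout.

```bash
#!/bin/bash
# Sketch of the two checks as standalone functions (names are illustrative).
set -eu

# valid_genome <name>: non-empty, only lowercase letters, digits and hyphens.
valid_genome() {
  case "$1" in
    ""|*[!a-z0-9-]*) return 1 ;;
  esac
}

# valid_raw_path <path>: under raw/, no ".." or "//", restricted charset.
# The existence check ([ -f ... ]) is omitted here.
valid_raw_path() {
  case "$1" in raw/*) ;; *) return 1 ;; esac
  case "$1" in *..*|*//*) return 1 ;; esac
  case "$1" in *[!A-Za-z0-9._/-]*) return 1 ;; esac
}

# Try a few candidate paths and report the verdict for each.
for p in raw/articles/foo.md raw/../etc/passwd /etc/passwd 'raw/a b.md'; do
  if valid_raw_path "$p"; then echo "accept: $p"; else echo "reject: $p"; fi
done
```

Each check rejects on a negated character class rather than trying to enumerate bad inputs, so anything not explicitly allowed fails closed.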

Config (`INGEST_BASE`, `GENOMES_ROOT`, `INGEST_MODEL`, Forgejo token) is sourced
from `~/.config/knowledge-genome.env` (0600, owner-only).
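
The wrapper in this commit sources the file without checking its mode. If an explicit guard is wanted, something like the following could work. It is a sketch under assumptions: `load_private_env` is a hypothetical helper, `stat -c` is the GNU coreutils form (Linux), and the demo uses a throwaway file rather than the real config.

```bash
#!/bin/bash
# Sketch: refuse to source a config file unless it is mode 600.
# Assumptions: GNU coreutils `stat -c` (Linux); the function name is illustrative.
set -eu

# load_private_env <file>: export every variable the file assigns, but only
# if the file exists and its permission bits are exactly 600.
load_private_env() {
  local file="$1" mode
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  mode="$(stat -c '%a' "$file")"
  [ "$mode" = "600" ] || { echo "refusing $file: mode $mode, want 600" >&2; return 1; }
  set -a   # auto-export everything the file assigns
  . "$file"
  set +a
}

# Demo against a throwaway file, not the real ~/.config/knowledge-genome.env.
demo="$(mktemp)"
printf 'INGEST_BASE=develop\n' >"$demo"
chmod 600 "$demo"
load_private_env "$demo" && echo "INGEST_BASE=${INGEST_BASE}"
rm -f "$demo"
```

The check matters mostly because the file holds the Forgejo token; a group- or world-readable copy would expose it to other local users.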

## Install / update

```bash
# on vm101
cd ~/knowledge-genome-orchestrator/deploy/vm101
sudo ./install.sh
```
8
deploy/vm101/install.sh
Executable file
@@ -0,0 +1,8 @@
#!/bin/bash
# deploy/vm101/install.sh: install vm101 wrappers from repo -> /usr/local/bin (idempotent).
# Run ON vm101 with sudo:  sudo ./install.sh
set -euo pipefail
here="$(cd "$(dirname "$0")" && pwd)"
install -m 0755 "${here}/n8n-pi-wrap" /usr/local/bin/n8n-pi-wrap
echo "installed: /usr/local/bin/n8n-pi-wrap"
bash -n /usr/local/bin/n8n-pi-wrap && echo "syntax: ok"
71
deploy/vm101/n8n-pi-wrap
Executable file
@@ -0,0 +1,71 @@
#!/bin/bash
set -eu
cmd="${SSH_ORIGINAL_COMMAND:-}"
case "$cmd" in
  "pi run")
    logger -t n8n-pi-wrap "ok: pi run (prompt via stdin)"
    prompt=$(cat)
    exec /usr/local/bin/pi --no-tools --mode json -p "$prompt" </dev/null
    ;;
  "pi ingest "*)
    # Strict positional parse: EXACTLY `pi ingest <genome> <raw_path>` (two tokens).
    rest="${cmd#pi ingest }"
    genome="${rest%% *}"
    raw_path="${rest#* }"
    # reject: missing second token, or any extra token (a space left in raw_path)
    if [ "$genome" = "$rest" ] || [ -z "$raw_path" ] || [ "$raw_path" != "${raw_path#* }" ]; then
      echo '{"status":"error","reason":"usage: pi ingest <genome> <raw_path>"}'; exit 1
    fi
    # genome slug: kebab lowercase only
    case "$genome" in ""|*[!a-z0-9-]*) echo '{"status":"error","reason":"invalid genome name"}'; exit 1;; esac
    # raw_path whitelist: MUST live under raw/, no traversal, restricted charset.
    #   - must start with "raw/"   - no ".." anywhere   - no absolute path / leading slash
    #   - allowed chars: [A-Za-z0-9._/-]  (kebab slugs + subdirs like raw/articles/foo.md)
    case "$raw_path" in
      raw/*) : ;;
      *) echo '{"status":"error","reason":"raw_path must be under raw/"}'; exit 1;;
    esac
    case "$raw_path" in
      *..*|*//*) echo '{"status":"error","reason":"raw_path traversal"}'; exit 1;;
    esac
    case "$raw_path" in
      *[!A-Za-z0-9._/-]*) echo '{"status":"error","reason":"raw_path illegal chars"}'; exit 1;;
    esac

    logger -t n8n-pi-wrap "ok: pi ingest ${genome} ${raw_path}"
    set -a; . "${HOME}/.config/knowledge-genome.env"; set +a
    cd "${GENOMES_ROOT}/${genome}" || { echo '{"status":"error","reason":"unknown genome"}'; exit 1; }

    # The raw file must actually exist under the genome's raw/ dir.
    [ -f "$raw_path" ] || { echo '{"status":"error","reason":"raw file not found"}'; exit 1; }

    # Clean start on the configured base (INGEST_BASE, e.g. develop; falls back to main), pinned to
    # the remote. Destroys only vm101's scratch checkout (never a shared branch, never a force-push);
    # this is by design. The trailing `||` is required: under `set -e`, a failure of any command
    # before the final `&&` does not abort the script, so the ingest would run on a stale checkout.
    git fetch -q origin \
      && git switch -q "${INGEST_BASE:-main}" 2>/dev/null \
      && git reset -q --hard "origin/${INGEST_BASE:-main}" \
      || { echo '{"status":"error","stage":"clean-start","reason":"git fetch/switch/reset failed"}'; exit 1; }

    # SEMANTIC step: dedicated script drives pi to WRITE wiki pages + manifest.
    # (NOT `pi -p "/skill:ingest ..."`, which makes the model reply in chat and write nothing.)
    log="$(mktemp -t pi-ingest.XXXXXX.log)"
    "${HOME}/.pi/agent/skills/ingest/scripts/ingest-semantic.py" "${genome}" "${raw_path}" \
      >"$log" 2>&1 \
      || { echo "{\"status\":\"error\",\"stage\":\"semantic\",\"reason\":\"ingest-semantic failed\",\"log\":\"${log}\"}"; exit 1; }

    # MECHANICAL step: validate manifest -> index/log/scoped-lint/commit/PR -> 1 JSON line
    exec "${HOME}/.pi/agent/skills/ingest/scripts/run-ingest.sh" "${genome}"
    ;;
  "ollama list")
    logger -t n8n-pi-wrap "ok: ollama list"
    exec /usr/local/bin/ollama list
    ;;
  "ollama ps")
    logger -t n8n-pi-wrap "ok: ollama ps"
    exec /usr/local/bin/ollama ps
    ;;
  *)
    logger -t n8n-pi-wrap "denied: ${cmd:-<empty>}"
    echo "unauthorized command" >&2
    exit 1
    ;;
esac