knowledge-genome-orchestrator/skills/ingest/SKILL.md

name: ingest
description: Semantic pass of a single raw source into the current genome's wiki: read the source, write sources/entities/concepts, handle contradictions, then emit a manifest and STOP. Use when a new file lands in raw/. Does NOT do git, log, index, lint, or PRs (a post-processor handles those), and does NOT handle private sources or project repos.
license: see repository
compatibility: Runs inside one genome checkout (cwd = genome root). Tools needed: read, edit only. NO bash, NO git. The deterministic steps (index, log, scoped lint, PR) run AFTER you exit, via run-ingest.sh. PRIVATE_CONTEXT must be disabled.
allowed-tools: read edit
metadata:
  framework: knowledge-genome
  phase: 1-ingest-semantic

Ingest — semantic pass

You run inside ONE genome checkout. AGENTS.md (already in your context) is the authoritative contract. Your job is the semantic pass only: read the source, write the wiki pages, handle contradictions. You do not touch git, the log, the index, the linter, or PRs — a post-processor (run-ingest.sh) does all of that after you stop, from the manifest you leave behind. This keeps your context clean and your turns few, which matters on a small local model.

Argument: the relative path of the single raw source to ingest (e.g. raw/articles/foo.md). Process only this one.

Pre-flight — stop the session if any check fails

  1. Refuse if the argument path is under any private/ directory.
  2. Refuse if PRIVATE_CONTEXT is not disabled.
  3. Confirm the file exists under raw/.
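The three checks above can be read as a single guard function. The following is only a sketch of that logic, not something you run (you have no bash): the function name, its parameters, and the idea of passing PRIVATE_CONTEXT in as a boolean are all assumptions made for illustration.

```python
from pathlib import Path, PurePosixPath


def preflight(source_arg: str, genome_root: Path, private_context_disabled: bool) -> None:
    """Hypothetical guard: raise ValueError on the first failed check."""
    parts = PurePosixPath(source_arg).parts

    # Check 1: refuse anything under a private/ directory, at any depth.
    if "private" in parts[:-1]:
        raise ValueError(f"refusing private source: {source_arg}")

    # Check 2: refuse unless PRIVATE_CONTEXT is disabled.
    # (How that setting is exposed is an assumption; here it is a plain bool.)
    if not private_context_disabled:
        raise ValueError("PRIVATE_CONTEXT must be disabled")

    # Check 3: the argument must be a relative path under raw/ that exists.
    if not parts or parts[0] != "raw" or ".." in parts:
        raise ValueError(f"not under raw/: {source_arg}")
    if not (genome_root / source_arg).is_file():
        raise ValueError(f"source not found: {source_arg}")
```

The checks run in the order listed, so a private path is refused before anything on disk is touched.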

Semantic work (your only job)

  1. Read the source once.
  2. Write wiki/sources/<kebab-slug>.md — faithful summary + key points, with the required frontmatter (type: source, domain: <genome>, maturity: draft, last_updated: <today>, private: false, sensible tags).
  3. For each entity (person, tool, org) → create or update wiki/entities/<kebab-name>.md.
  4. For each concept (pattern, theory, decision) → create or update wiki/concepts/<kebab-name>.md.
  5. On a real contradiction with an existing claim, follow AGENTS.md §Conflict: create wiki/queries/conflict-<concept>-<YYYY-MM-DD>.md. Never overwrite the existing page.
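As a rough illustration of steps 2 and 5, the sketch below renders the required frontmatter for a source page and builds the conflict-page path. The `---` delimiters, the bracketed tag list, and the ISO date format for last_updated are assumptions (AGENTS.md is the authority on the exact layout), and both function names are hypothetical.

```python
from datetime import date


def source_frontmatter(genome: str, tags: list[str], today: date) -> str:
    """Render the frontmatter fields listed in step 2 (layout is assumed)."""
    lines = [
        "---",
        "type: source",
        f"domain: {genome}",
        "maturity: draft",
        f"last_updated: {today.isoformat()}",  # assumed format: YYYY-MM-DD
        "private: false",
        "tags: [" + ", ".join(tags) + "]",
        "---",
    ]
    return "\n".join(lines) + "\n"


def conflict_page_path(concept_slug: str, today: date) -> str:
    """Build wiki/queries/conflict-<concept>-<YYYY-MM-DD>.md from step 5."""
    return f"wiki/queries/conflict-{concept_slug}-{today.isoformat()}.md"
```

For example, `conflict_page_path("caching", date(2024, 1, 2))` gives `wiki/queries/conflict-caching-2024-01-02.md`; the existing concept page stays untouched.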

Name files in kebab-case and pick stable names. Read wiki/index.md (and the specific pages it points to) to decide create-vs-update and to spot contradictions. Do not scan whole directories.
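One possible reading of "kebab-case, stable names" is sketched below: lowercase ASCII, with runs of anything else collapsed to a single hyphen, so the same entity name always maps to the same file. The function name and the exact normalisation rules are assumptions, not something AGENTS.md is known to specify.

```python
import re
import unicodedata


def kebab_slug(name: str) -> str:
    """Hypothetical slug rule: lowercase ASCII words joined by single hyphens."""
    # Strip accents so "Café" and "Cafe" land on the same file name.
    ascii_name = (
        unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")
    if not slug:
        raise ValueError(f"cannot derive a slug from {name!r}")
    return slug
```

Under this rule, "Acme Corp." becomes `acme-corp`, so the entity page would be wiki/entities/acme-corp.md on every ingest that mentions it.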

Finish: write the manifest, then STOP

As your final action, write .ingest-manifest.json at the genome root (NOT under wiki/) describing exactly what you did. Then stop: do not commit, lint, append to the log or index, or open a PR.

{
  "raw_source": "raw/articles/foo.md",
  "reasoning": "One sentence for the log: what changed and why.",
  "pr_summary": "One or two sentences describing this ingest for the PR.",
  "contradictions": "None   (or: 1 conflict file created — <concept>)",
  "pages": [
    {
      "path": "wiki/sources/foo.md",
      "summary": "One-line index summary.",
      "maturity": "draft",
      "status": "created"
    },
    {
      "path": "wiki/entities/acme.md",
      "summary": "Acme — vendor.",
      "status": "modified"
    }
  ]
}

Manifest rules:

  • List every page you created or modified, with status created or modified.
  • summary is the one-line index description (≈12 words max). For conflict pages the summary is ignored — the index lists conflicts by slug only.
  • maturity is required only on created pages (it seeds the new index entry). It is ignored for modified pages, so omit it there.
  • Do NOT add a model field — the orchestrator records which model produced this run; you cannot know your own model name reliably, so do not guess one.
  • Do not invent a run_id, branch, commit, or PR — those belong to the post-processor.
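The rules above can be checked mechanically. This is a minimal sketch of such a check, not the post-processor's actual validation: the function name is hypothetical, treating the five top-level fields from the example as required is an assumption, and so are the exact key names used for the forbidden fields.

```python
def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the manifest passes."""
    problems = []

    # Assumed required: the top-level fields shown in the example manifest.
    for key in ("raw_source", "reasoning", "pr_summary", "contradictions", "pages"):
        if key not in manifest:
            problems.append(f"missing field: {key}")

    # These belong to the orchestrator or post-processor (key names assumed).
    for key in ("model", "run_id", "branch", "commit", "pr"):
        if key in manifest:
            problems.append(f"forbidden field: {key}")

    for page in manifest.get("pages", []):
        path = page.get("path", "<no path>")
        status = page.get("status")
        if status not in ("created", "modified"):
            problems.append(f"{path}: status must be created or modified")
        # maturity seeds the new index entry, so only created pages carry it.
        if status == "created" and "maturity" not in page:
            problems.append(f"{path}: created page needs maturity")
        if status == "modified" and "maturity" in page:
            problems.append(f"{path}: omit maturity on modified pages")

    return problems
```

Run against the example manifest shown earlier, this returns an empty list; adding a "model" key or dropping "maturity" from the created page would each produce one violation.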

One source per session. After writing the manifest, stop.