Merge branch 'release/1.1.5' into main
commit 2f3173e178
8 changed files with 96 additions and 35 deletions
2
Makefile
|
|
@@ -1,5 +1,5 @@
|
|||
# =============================================================================
|
||||
# Knowledge Genome - Makefile v. 1.1.4
|
||||
# Knowledge Genome - Makefile v. 1.1.5
|
||||
# Orchestrates the setup and management of the knowledge base.
|
||||
# =============================================================================
|
||||
|
||||
37
README.md
|
|
@@ -190,8 +190,9 @@ All tools (git-crypt, bw, qmd) have native Linux binaries.
|
|||
All scripts are compatible with macOS. Requirements:
|
||||
|
||||
- bash 3.2+ (macOS default) — supported for the **setup scripts** (`make` targets, scaffolding).
|
||||
The `ingest` skill uses bash 4+ constructs (`mapfile`), but it is deployed and run on the
|
||||
Linux AI node, not on the macOS setup machine — so this is not a constraint in practice.
|
||||
Two things need bash 4+: the `ingest` skill (`mapfile`), which runs on the Linux AI node (not a
|
||||
constraint on the macOS setup machine); and `gcrypt_rotate_key` (`compgen -G`), which **does**
|
||||
run on the laptop. For key rotation on macOS, use Homebrew bash (`brew install bash`).
|
||||
- GNU coreutils not required — BSD variants of `date`, `grep`, `sed` all handled.
|
||||
- `git-crypt`: install via Homebrew — `brew install git-crypt`
|
||||
- `jq`, `curl`: pre-installed or via Homebrew
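
The bash 4+ requirement for key rotation can be checked up front. The sketch below is a minimal guard, assuming a hypothetical helper named `bash_major_ok` (it is not part of the repository); it only inspects the running shell's major version via `BASH_VERSINFO`.

```bash
#!/usr/bin/env bash
# Sketch only: `bash_major_ok` is a hypothetical helper, not an existing function.

# Succeeds when the given major version (default: the running shell's) is >= 4.
bash_major_ok() {
  local major="${1:-${BASH_VERSINFO[0]}}"
  [[ "$major" -ge 4 ]]
}

if bash_major_ok; then
  echo "bash ${BASH_VERSION}: OK for gcrypt_rotate_key"
else
  # Stock macOS bash 3.2 lands here.
  echo "bash ${BASH_VERSION}: too old for key rotation, try: brew install bash" >&2
fi
```

Placing a check like this at the top of a rotation wrapper would fail early instead of partway through a key change.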
|
||||
|
|
@@ -695,6 +696,9 @@ cd ~/knowledge-genome-orchestrator/genome-dev
|
|||
gcrypt_rotate_key "genome-dev"
|
||||
```
|
||||
|
||||
> **macOS:** `gcrypt_rotate_key` uses `compgen -G` (bash 4+). The stock macOS bash 3.2 is not
|
||||
> enough — run rotation under Homebrew bash (`brew install bash`).
|
||||
|
||||
`gcrypt_rotate_key` performs:
|
||||
|
||||
1. Unlocks repo with existing key
|
||||
|
|
@@ -951,18 +955,25 @@ Pages have a `last_updated` field in frontmatter. During lint passes:
|
|||
|
||||
The agent proposes re-validation but does not change `maturity` without new source evidence.
|
||||
|
||||
### Cross-genome lint
|
||||
### Cross-genome references
|
||||
|
||||
A manual, monthly operation. Not automated in CI/CD — the token cost and coordination
|
||||
complexity are not justified at this scale.
|
||||
Cross-domain knowledge moves by **pull, never push**: the genome you are working in draws
|
||||
material _in_; nothing is ever written into another genome. There are **no cross-genome
|
||||
wikilinks** — submodule pointers make relative paths brittle.
|
||||
|
||||
1. Operator initiates a master-repo agent session
|
||||
2. Agent uses `qmd search "<concept>"` across the multi-genome index to find:
|
||||
- Concepts defined in 2+ genomes with potentially conflicting definitions
|
||||
- Entities referenced cross-genome without canonical cross-genome wikilinks
|
||||
- Concepts in genome-X that should link to genome-Y
|
||||
3. Agent reports findings — does not modify files
|
||||
4. For each finding: create conflict note in the genome where resolution belongs
|
||||
When the working genome needs a concept that lives elsewhere, the **navigation skill** handles
|
||||
it in the same two-phase shape as ingest:
|
||||
|
||||
1. A deterministic collector clones the relevant genomes **read-only at HEAD** (fresh — never the
|
||||
pinned submodule state) and assembles a dossier of excerpts with provenance.
|
||||
2. A semantic pass reads only that dossier; the skill then deposits **one** abstract, non-private
|
||||
raw into the working genome at `raw/articles/crossgen-<topic>-<date>.md`.
|
||||
3. That raw goes through the working genome's normal ingest → PR → human gate, like any source.
|
||||
|
||||
Which genomes may be read as **sources** is gated by a per-genome `cross_source: yes|no` flag: a
|
||||
confidential genome (e.g. a client file) is marked `no` and is never read as a source — the wall
|
||||
is structural, not a matter of the agent's discipline. The master `AGENTS.md` holds the full
|
||||
boundary contract.
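
The `cross_source` gate and the raw naming rule described above could look roughly like this. It is a sketch under stated assumptions: the text does not say where the flag is stored, so the helper takes the path of a file containing a `cross_source: yes|no` line; treating a missing file or flag as `no` is a design choice of this sketch; the ISO date format and both function names (`cross_source_allowed`, `crossgen_raw_path`) are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: helper names, flag file location and date format are assumptions.

# Succeeds only when the flag file explicitly says `cross_source: yes`.
# A missing file or missing flag counts as "no" (fail closed).
cross_source_allowed() {
  local flag_file="$1"
  [[ -f "$flag_file" ]] || return 1
  grep -qE '^cross_source:[[:space:]]*yes[[:space:]]*$' "$flag_file"
}

# Builds the single deposit path inside the WORKING genome.
crossgen_raw_path() {
  local topic="$1" date="$2"
  printf 'raw/articles/crossgen-%s-%s.md\n' "$topic" "$date"
}

crossgen_raw_path "caching" "2025-01-31"
# prints: raw/articles/crossgen-caching-2025-01-31.md
```

Because the collector would call the gate before cloning anything, a confidential genome is never read, which matches the "structural wall" described above.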
|
||||
|
||||
---
|
||||
|
||||
|
|
@@ -1021,7 +1032,7 @@ and keep the wiki atomically navigable.
|
|||
### Linking conventions
|
||||
|
||||
- **Intra-genome:** `[[folder/file]]` — Obsidian wikilinks only.
|
||||
- **Cross-genome:** NOT supported via wikilink. Submodule pointers make relative paths brittle. When a concept belongs to another genome, use the navigation skill to emit a raw stub into that genome's `raw/articles/` directory so its local ingest pipeline can process it.
|
||||
- **Cross-genome:** NOT supported via wikilink — submodule pointers make relative paths brittle. When the working genome needs a concept that lives elsewhere, the navigation skill **pulls it in** as one abstract raw under _this_ genome's `raw/articles/`, which then goes through normal ingest. See [Cross-genome references](#cross-genome-references).
|
||||
- **External:** `[text](https://...)` — standard Markdown.
|
||||
|
||||
### Log format
|
||||
@@ -21,18 +21,29 @@ gcrypt_export_key() {
|
|||
|
||||
gcrypt_verify() {
|
||||
local genome_name="$1"
|
||||
local key_path="${KEYS_DIR}/${genome_name}.key"
|
||||
|
||||
info "Verifying git-crypt status for ${genome_name}..."
|
||||
git-crypt lock
|
||||
info "Verifying git-crypt configuration for ${genome_name}..."
|
||||
|
||||
if file "raw/private/.gitkeep" 2>/dev/null | grep -q "data"; then
|
||||
success "Encryption verified: private/ directory is protected."
|
||||
# `git-crypt status` reports the CONFIGURED status (from `.gitattributes`), not the
|
||||
# lock/unlock status of the working tree. Encrypted lines have their labels right-aligned
|
||||
# (with leading whitespace), so you CANNOT anchor on `^encrypted`.
|
||||
# We filter by private/ and distinguish “encrypted” from “not encrypted” without
|
||||
# relying on exact spacing.
|
||||
local status_out encrypted_count not_encrypted_count
|
||||
status_out=$(git-crypt status 2>/dev/null || true)
|
||||
encrypted_count=$(printf '%s\n' "$status_out" | grep 'private/' | grep -cE '^[[:space:]]*encrypted:' || true)
|
||||
not_encrypted_count=$(printf '%s\n' "$status_out" | grep 'private/' | grep -cE '^not encrypted:' || true)
|
||||
|
||||
if [[ "$encrypted_count" -gt 0 ]]; then
|
||||
success "Encryption configured: ${encrypted_count} private file(s) under git-crypt."
|
||||
if [[ "$not_encrypted_count" -gt 0 ]]; then
|
||||
warn "${not_encrypted_count} file(s) under private/ are NOT covered by the git-crypt filter — check .gitattributes (leak risk)."
|
||||
fi
|
||||
elif [[ "$not_encrypted_count" -gt 0 ]]; then
|
||||
warn "private/ files exist but none are covered by the git-crypt filter — check the .gitattributes filter (leak risk)."
|
||||
else
|
||||
warn "Encryption check inconclusive. Run 'git-crypt status' manually."
|
||||
info "No private/ files present yet — nothing to verify."
|
||||
fi
|
||||
|
||||
[[ -f "$key_path" ]] && git-crypt unlock "$key_path"
|
||||
}
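
The counting logic in the new `gcrypt_verify` can be tried against canned text instead of a real repository. The sample below is illustrative only: the file names are made up, and the line layout follows the comment above (the `encrypted:` label carries leading whitespace, `not encrypted:` starts at column 0). `count_private` is a hypothetical helper that wraps the same two-stage `grep`.

```bash
#!/usr/bin/env bash
# Sketch only: sample_status is made-up text, not captured git-crypt output.
sample_status=$(printf '%s\n' \
  '    encrypted: raw/private/notes.md' \
  '    encrypted: wiki/private/synthesis.md' \
  'not encrypted: raw/private/leak.md' \
  'not encrypted: README.md')

# $1 = status text, $2 = extended regex for the label.
# `|| true` keeps a zero count from being treated as a failure.
count_private() {
  printf '%s\n' "$1" | grep 'private/' | grep -cE "$2" || true
}

encrypted_count=$(count_private "$sample_status" '^[[:space:]]*encrypted:')
not_encrypted_count=$(count_private "$sample_status" '^not encrypted:')
naive_count=$(count_private "$sample_status" '^encrypted:')

echo "encrypted=${encrypted_count} not_encrypted=${not_encrypted_count} naive=${naive_count}"
# prints: encrypted=2 not_encrypted=1 naive=0
```

The `naive=0` result shows why anchoring on `^encrypted` would miss every protected file.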
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
|
|
@@ -107,6 +118,8 @@ gcrypt_rotate_key() {
|
|||
|
||||
# 5. Re-stage private files so they are committed encrypted with the new key
|
||||
local staged=0
|
||||
# compgen -G requires bash 4+ for reliable glob expansion. macOS stock
|
||||
# bash is 3.2; use Homebrew bash (already recommended in README) for rotation.
|
||||
if compgen -G "raw/private/*" > /dev/null 2>&1; then
|
||||
git add raw/private/
|
||||
staged=1
|
||||
27
lib/lint.sh
|
|
@@ -23,7 +23,7 @@ lint_markdown_file() {
|
|||
|
||||
# 1. Check frontmatter delimiters
|
||||
if [[ $(head -n 1 "$file") != "---" ]]; then
|
||||
warn "Missing frontmatter start (---) in: $file"
|
||||
error "Missing frontmatter start (---) in: $file"
|
||||
errors=$((errors + 1))
|
||||
fi
|
||||
|
||||
|
|
@@ -31,14 +31,14 @@ lint_markdown_file() {
|
|||
local mandatory_fields=("title:" "type:" "domain:" "maturity:" "last_updated:")
|
||||
for field in "${mandatory_fields[@]}"; do
|
||||
if ! grep -q "^${field}" "$file"; then
|
||||
warn "Missing mandatory field '${field}' in: $file"
|
||||
error "Missing mandatory field '${field}' in: $file"
|
||||
errors=$((errors + 1))
|
||||
fi
|
||||
done
|
||||
|
||||
# 3. Check domain matches genome name
|
||||
if grep -q "^domain:" "$file" && ! grep -q "^domain: ${genome_name}" "$file"; then
|
||||
warn "Domain mismatch in $file (expected '${genome_name}')"
|
||||
error "Domain mismatch in $file (expected '${genome_name}')"
|
||||
errors=$((errors + 1))
|
||||
fi
|
||||
|
||||
|
|
@@ -70,8 +70,8 @@ check_valid_type() {
|
|||
done
|
||||
|
||||
if [[ $valid -eq 0 ]]; then
|
||||
warn "Invalid type value '${type_value}' in: $file"
|
||||
warn " Valid types: ${VALID_TYPES[*]}"
|
||||
error "Invalid type value '${type_value}' in: $file"
|
||||
error " Valid types: ${VALID_TYPES[*]}"
|
||||
return 1
|
||||
fi
|
||||
|
||||
|
|
@@ -144,8 +144,8 @@ check_knowledge_decay() {
|
|||
esac
|
||||
|
||||
if [[ $days_old -gt $threshold ]]; then
|
||||
warn "STALE: $file"
|
||||
warn " maturity: ${maturity} | last_updated: ${last_updated} | ${days_old} days ago (threshold: ${threshold})"
|
||||
error "STALE: $file"
|
||||
error " maturity: ${maturity} | last_updated: ${last_updated} | ${days_old} days ago (threshold: ${threshold})"
|
||||
return 1
|
||||
fi
|
||||
|
||||
|
|
@@ -190,12 +190,21 @@ check_broken_links() {
|
|||
local links
|
||||
links=$(grep -oE '\[\[[^\]]+' "$file" 2>/dev/null | sed 's/^\[\[//' | cut -d'|' -f1)
|
||||
|
||||
for link in $links; do
|
||||
# Cross-genome links (../other-genome/…) are not resolvable from a single
|
||||
# genome checkout and are skipped — they would always fall
|
||||
# through the two-level lookup and produce non-actionable warnings.
|
||||
while IFS= read -r link; do
|
||||
[[ -z "$link" ]] && continue
|
||||
|
||||
if [[ "$link" == ../* ]]; then
|
||||
continue
|
||||
fi
|
||||
|
||||
local target="$link"
|
||||
[[ "$target" != *.md ]] && target="${target}.md"
|
||||
|
||||
if [[ ! -f "${base_dir}/${target}" && ! -f "${base_dir}/../${target}" ]]; then
|
||||
warn "Potential broken link: [[$link]] in $file"
|
||||
fi
|
||||
done
|
||||
done <<< "$links"
|
||||
}
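
The switch from `for link in $links` to `while IFS= read -r link` changes how link text is tokenized, and the new `../*` guard drops cross-genome targets. The sketch below isolates both effects on made-up link names (none of them come from the repository).

```bash
#!/usr/bin/env bash
# Sketch only: the three link targets are invented for illustration.
links=$'concepts/event sourcing\n../genome-ops/runbook\nentities/kafka'

# Old shape: unquoted expansion splits on spaces as well as newlines.
for_count=0
for link in $links; do
  for_count=$((for_count + 1))
done

# New shape: one iteration per line, cross-genome links skipped.
checked=()
while IFS= read -r link; do
  [[ -z "$link" ]] && continue
  [[ "$link" == ../* ]] && continue
  checked+=("$link")
done <<< "$links"

echo "for-loop iterations: ${for_count}"
# prints: for-loop iterations: 4
echo "while-read checked: ${#checked[@]}"
# prints: while-read checked: 2
```

With the old loop, a link containing a space would be checked as two separate, nonexistent targets.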
|
||||
@@ -9,6 +9,14 @@
|
|||
# structure check can never drift apart.
|
||||
# =============================================================================
|
||||
|
||||
# NOTE — Return-code smell
|
||||
# Several functions in this file (and in lint.sh) use the return code as a
|
||||
# numeric counter (e.g. return $missing). This is a known smell: exit codes
|
||||
# wrap at 256 and conflate "count of problems" with "exit status". At the
|
||||
# current scale (<10 problems per run) the wrap-around risk is zero, so we
|
||||
# accept it pragmatically. If counts ever grow, switch to stdout counters
|
||||
# or dedicated global variables.
|
||||
|
||||
# Canonical directories every genome must have.
|
||||
# raw/* are input buckets (collaborator-writable); wiki/* is the agent-owned,
|
||||
# contract-bound layout the lint, the index sections and the ingest skill depend on.
|
||||
|
|
@@ -43,6 +51,7 @@ structure_report() {
|
|||
info "extra (not in canon): ${d}"
|
||||
done < <(find "${base}/raw" "${base}/wiki" -mindepth 1 -type d 2>/dev/null)
|
||||
|
||||
# NOTE: return $missing is a smell — see header. Kept for compatibility.
|
||||
return $missing
|
||||
}
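
The "stdout counter" alternative mentioned in the header note could take the following shape. This is a sketch, not the refactor itself: `count_missing_dirs` and `wraps` are hypothetical names, and the directory list is passed in rather than read from the canonical list in this file.

```bash
#!/usr/bin/env bash
# Sketch only: function names are hypothetical.

# Demonstrates the wrap-around: bash keeps only the low 8 bits of a return value.
wraps() { return 256; }
wraps
echo "status after 'return 256': $?"
# prints: status after 'return 256': 0

# Count travels on stdout; the exit status only says clean (0) or not clean (1).
count_missing_dirs() {
  local base="$1"; shift
  local missing=0 d
  for d in "$@"; do
    [[ -d "${base}/${d}" ]] || missing=$((missing + 1))
  done
  printf '%s\n' "$missing"
  [[ "$missing" -eq 0 ]]
}

tmp=$(mktemp -d)
mkdir -p "${tmp}/raw/articles"
if missing=$(count_missing_dirs "$tmp" raw/articles wiki/queries); then
  echo "structure clean"
else
  echo "${missing} canonical dir(s) missing"
fi
# prints: 1 canonical dir(s) missing
rm -rf "$tmp"
```

Callers that today read `$?` as a count would need to capture stdout instead, which is why the note keeps the current behavior for compatibility.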
|
||||
|
||||
@@ -24,6 +24,15 @@ step "Adding New Genome: ${GENOME_NAME}"
|
|||
# Build a 3-field registry entry (linked_repo may be empty)
|
||||
GENOMES=("${GENOME_NAME}|${GENOME_DESC}|${GENOME_LINKED}")
|
||||
|
||||
# NOTE — Maintenance smell
|
||||
# We source setup-genomes.sh as a library/orchestrator hybrid. This works because:
|
||||
# - registry.sh is guarded against double-source (idempotent guard)
|
||||
# - setup-genomes.sh checks WORK_DIR before re-sourcing registry.sh
|
||||
# - GENOMES is built locally just before the source, so it is not clobbered
|
||||
# However, sourcing an orchestration script as a library makes the control flow
|
||||
# harder to trace. If this grows, refactor into a shared function (e.g. setup_one_genome)
|
||||
# called by both add-genome.sh and setup-genomes.sh.
|
||||
|
||||
source "scripts/setup-genomes.sh"
|
||||
|
||||
success "Genome '${GENOME_NAME}' added and linked successfully!"
|
||||
@@ -59,10 +59,14 @@ all_paths=( "${created_paths[@]}" "${modified_paths[@]}" )
|
|||
|
||||
conflict_label=""
|
||||
|
||||
# NOTE: no rollback. Steps below mutate the working tree in order (index → log → commit).
|
||||
# All are idempotent on re-run EXCEPT log-append (append-only). If a step fails midway,
|
||||
# nothing is committed (open-pr is the only committer) — the operator re-runs, or inspects
|
||||
# wiki/ if log-append already wrote a line. The manifest is removed only on full success.
|
||||
# NOTE: No rollback. The steps below modify the working tree in order (index → log → commit).
|
||||
# All steps are idempotent on re-run EXCEPT log-append (append-only). If a step fails midway,
|
||||
# nothing is committed (open-pr is the only committer) — the operator re-runs, or checks
|
||||
# wiki/ if log-append has already written a line. The manifest is removed only upon full success.
|
||||
# log-append is not idempotent: a re-run after a post-log failure produces
|
||||
# duplicate lines. This is accepted by design (append-only ledger, no rollback). If this
|
||||
# becomes a nuisance later, add a dedup check on run_id in log-append.sh
|
||||
# (grep for run_id before appending). Manual recovery: grep for run_id in wiki/log.md.
|
||||
|
||||
# --- 1. index entries (created pages only), inserted in order ---
|
||||
while IFS=$'\t' read -r path summary maturity; do
|
||||
|
|
@@ -76,6 +80,7 @@ while IFS=$'\t' read -r path summary maturity; do
|
|||
queries)
|
||||
if [[ "$link" == queries/conflict-* ]]; then section="Conflicts"; conflict_label="CONFLICT"
|
||||
else section="Queries"; fi ;;
|
||||
# private/ is not routed here — ingest is public-only. Add when private ingest is built.
|
||||
*) section="Sources" ;;
|
||||
esac
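
The dedup check suggested in the log-append note (grep for `run_id` before appending) might be sketched as follows. Everything here is an assumption for illustration: the helper name `log_append_once`, its argument order, and the sample log line format are not taken from `log-append.sh`.

```bash
#!/usr/bin/env bash
# Sketch only: helper name, arguments and log line format are hypothetical.

# Appends $3 to log file $1 unless a line mentioning run id $2 is already there.
log_append_once() {
  local log_file="$1" run_id="$2" line="$3"
  # -F matches the run id as a fixed string, never as a regex.
  if [[ -f "$log_file" ]] && grep -qF -- "$run_id" "$log_file"; then
    return 0   # already logged by an earlier attempt: skip
  fi
  printf '%s\n' "$line" >> "$log_file"
}

demo_log=$(mktemp)
log_append_once "$demo_log" "run-0001" "run-0001 | ingest | 3 pages"
log_append_once "$demo_log" "run-0001" "run-0001 | ingest | 3 pages"  # re-run after a failure
grep -c '' "$demo_log"
# prints: 1
```

The same `grep -F` on `wiki/log.md` also covers the manual recovery path mentioned in the note.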
|
||||
|
||||
@@ -54,6 +54,11 @@ private: false
|
|||
|
||||
## Conflicts Pending Review (`wiki/queries/conflict-*.md`)
|
||||
*slugs only.*
|
||||
|
||||
|
||||
## Private Synthesis (`wiki/private/`)
|
||||
*Restricted access. Requires PRIVATE_CONTEXT: enabled and unlocked repo.*
|
||||
*List slug names ONLY. Do not append summaries — prevents metadata leakage.*
|
||||
EOF
|
||||
|
||||
cat > "${g}/wiki/log.md" <<'EOF'
|
||||