docs: Add comprehensive README for Knowledge Genome System

2026-05-08 21:09:42 +02:00 · 11b1245e98
commit 11b1245e98
parent 4c6f8259af
1 changed file with 200 additions and 1 deletion
--- a/README.md
+++ b/README.md
@@ -1,2 +1,201 @@
-# knowledge-genome-orchestrator
+# Knowledge Genome System

+> A distributed, modular, and secure personal knowledge base architecture.
+
+The **Knowledge Genome System** is a framework for managing personal knowledge with a "Master-Genome" architecture. It follows the LLM-Wiki pattern (Karpathy-style) and adds a security layer for sensitive data plus automated quality control.
+
+---
+
+# Architecture
+
+This project is structured as a **Master Orchestrator** that manages multiple independent **Genomes** as Git submodules.
+
+## Core Components
+
+### Master Repository
+
+Contains:
+
+* Orchestration scripts
+* Global configuration (`config.env`)
+* Security templates
+
+### Genomes
+
+Individual specialized repositories (e.g. `genome-dev`, `genome-finance`) that act as standalone units of knowledge.
+
+### Security Layers
+
+#### Physical Security
+
+`git-crypt` encrypts `private/` directories at rest.
+
+#### Logical Security
+
+YAML frontmatter (`private: true`) marks documents that AI agents must skip during public sessions, so sensitive data is not leaked.
+
+#### Validation Layer
+
+A custom linting engine ensures metadata consistency.
+
+---
+
+# Quick Start
+
+## Prerequisites
+
+Required dependencies:
+
+* `git`
+* `git-crypt`
+* `curl`
+* `jq`
+
+Optional:
+
+* `bw` (Bitwarden CLI) — used for runtime key injection
+
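A minimal sketch of how the prerequisites above could be verified before running `make setup`. The function name `check_deps` is hypothetical and is not taken from the project's scripts.

```bash
# Report every command that is not on PATH.
# Returns non-zero if at least one command is missing.
check_deps() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing dependency: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_deps git git-crypt curl jq` names each missing tool on stderr and returns a non-zero status if any is absent.
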
+---
+
+## Initialization
+
+```bash
+# 1. Clone the master repository
+git clone <master-repo-url> master-knowledge-genome && cd master-knowledge-genome
+
+# 2. Run the full setup
+#    (checks dependencies, creates master scaffold,
+#    initializes genomes)
+make setup
+```
+
+# Management Commands
+
+The system is controlled through a centralized Makefile.
+
+| Command           | Description                                                    |
+| ----------------- | -------------------------------------------------------------- |
+| `make setup`      | Full system initialization (Master + Registry Genomes).        |
+| `make add-genome` | Scaffolds and registers a new genome (requires NAME and DESC). |
+| `make lint`       | Runs the validation suite across all genomes.                  |
+| `make status`     | Checks Git status and encryption state for all submodules.     |
+
+# Validation & Linting (`make lint`)
+
+The built-in linter ensures that the knowledge base remains machine-readable and secure.
+
+It automatically validates:
+
+## Frontmatter Integrity
+
+Every `.md` file must contain valid YAML headers.
+
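A rough sketch of a frontmatter presence check, assuming the header is delimited by `---` lines at the top of the file. It does not parse the YAML itself, and the name `has_frontmatter` is hypothetical; the project's linter may work differently.

```bash
# Succeed only if the file opens with a '---' line
# and a later '---' line closes the header block.
has_frontmatter() {
  awk '
    NR == 1 { if ($0 != "---") exit 1; next }
    $0 == "---" { closed = 1; exit 0 }
    END { exit (closed ? 0 : 1) }
  ' "$1"
}
```

A call such as `has_frontmatter wiki/public/note.md || echo "no frontmatter"` would flag a file without a header; the path is illustrative only.
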
+## Domain Consistency
+
+Ensures that a file's domain metadata matches its parent genome.
+
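One way the domain check could be expressed, assuming the expected domain is the genome's name (for example `genome-dev`) and that `domain:` is a plain, unquoted top-level key. Both helper names are hypothetical.

```bash
# Print the value of a top-level frontmatter key (first match only).
frontmatter_value() {
  awk -v key="$2" '
    NR == 1 && $0 == "---" { in_fm = 1; next }
    in_fm && $0 == "---"   { exit }
    in_fm && index($0, key ":") == 1 {
      value = substr($0, length(key) + 2)
      gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding whitespace
      print value
      exit
    }
  ' "$1"
}

# Succeed when the file's domain equals the expected genome name.
check_domain() {
  [ "$(frontmatter_value "$1" domain)" = "$2" ]
}
```

Only lines inside the header are considered, so a `domain:` string in the document body does not affect the result.
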
+## Privacy Leak Detection
+
+This is the most critical validation step.
+
+Verifies that any file located in a `/private/` directory contains the flag:
+
+```yaml
+private: true
+```
+
+This prevents accidental exposure during AI sessions.
+
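The rule above could be checked along these lines. This is a sketch under the assumption that the flag appears as a plain `private: true` line inside the frontmatter; `find_privacy_leaks` is a hypothetical name.

```bash
# List Markdown files under any private/ directory whose
# frontmatter does not contain the line `private: true`.
# Paths are printed relative to the given root.
find_privacy_leaks() (
  cd "$1" || exit 1
  find . -type f -name '*.md' -path '*/private/*' | sort |
  while IFS= read -r file; do
    if ! awk '
      NR == 1 && $0 == "---" { in_fm = 1; next }
      in_fm && $0 == "---"   { exit }
      in_fm && /^private:[ \t]*true[ \t]*$/ { found = 1; exit }
      END { exit (found ? 0 : 1) }
    ' "$file"; then
      echo "$file"
    fi
  done
)
```

Empty output means every file in the private layer carries the flag; any printed path is a candidate leak.
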
+## Broken Wiki-Links
+
+Detects dead `[[internal-links]]`.
+
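A possible dead-link check, assuming that `[[target]]` resolves to a file named `target.md` anywhere inside the same genome. The README does not state the resolution rule, so that convention and the name `find_broken_links` are assumptions; link aliases and heading anchors are not handled.

```bash
# Print "file: [[target]]" for each wiki-link with no matching page.
find_broken_links() (
  cd "$1" || exit 1
  find . -type f -name '*.md' | sort | while IFS= read -r file; do
    grep -o '\[\[[^]]*]]' "$file" | sort -u | while IFS= read -r link; do
      target=${link#??}     # drop the leading [[
      target=${target%??}   # drop the trailing ]]
      if [ -z "$(find . -type f -name "$target.md" | head -n 1)" ]; then
        echo "$file: [[$target]]"
      fi
    done
  done
)
```
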
+# Security Model
+
+## Hybrid Privacy Architecture
+
+Each genome is divided into two layers.
+
+### Public Layer
+
+Directories:
+
+```text
+raw/public/
+wiki/public/
+```
+
+Characteristics:
+
+* Plaintext
+* Shareable with collaborators
+
+### Private Layer
+
+Directories:
+
+```text
+raw/private/
+wiki/private/
+```
+
+Characteristics:
+
+* Encrypted using AES-256 via `git-crypt`
+
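git-crypt decides which files to encrypt from `filter=git-crypt` rules in `.gitattributes`. The sketch below writes rules that would cover the two private directories listed above; it assumes the project's scaffolding has not already created them, and the helper name `write_private_attributes` is hypothetical.

```bash
# Append git-crypt filter rules for the private layer
# to the .gitattributes file of the given genome directory.
write_private_attributes() {
  cat >> "$1/.gitattributes" <<'EOF'
raw/private/** filter=git-crypt diff=git-crypt
wiki/private/** filter=git-crypt diff=git-crypt
EOF
}
```

The rules need to be committed before any private file is added, since git-crypt only encrypts files that match a rule at the time they are staged.
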
+## Runtime Key Injection
+
+To keep the AI environment secure, encryption keys are never stored on the VM disk.
+
+Instead, the system uses Bitwarden (`bw`) / Vaultwarden for runtime injection.
+
+### Example
+
+```bash
+# Unlock a genome using a key stored in Vaultwarden
+git-crypt unlock <(
+  bw get notes "genome-dev key" \
+    --session "$BW_SESSION" | base64 -d
+)
+```
+
+# Genome Schema
+
+All wiki documents follow a strict schema to support AI ingestion.
+
+## YAML Frontmatter Schema
+
+```yaml
+---
+title: "Document Title"
+type: entity            # one of: entity | concept | source | log
+domain: genome-name     # must match the parent genome
+private: false          # true for every file under a private/ directory
+last_updated: YYYY-MM-DD
+---
+```
+
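As an illustration of the schema, a small generator could emit a header for a new document. The function name `new_document` and its argument order are hypothetical; only the field names and the allowed `type` values come from the schema above.

```bash
# Print a frontmatter header for a new document.
# Arguments: title, type, domain, private (true or false).
new_document() {
  case "$2" in
    entity|concept|source|log) ;;
    *) echo "invalid type: $2" >&2; return 1 ;;
  esac
  cat <<EOF
---
title: "$1"
type: $2
domain: $3
private: $4
last_updated: $(date +%Y-%m-%d)
---
EOF
}
```

For example, `new_document "Bash Notes" concept genome-dev false` prints a header with today's date, while an unknown type is rejected with a non-zero status.
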
+# Agent Interaction
+
+When starting a session with an AI agent, always declare the privacy context.
+
+## Public Context
+
+```text
+PRIVATE_CONTEXT: disabled
+```
+
+Behavior:
+
+* The agent ignores all private folders.
+
+## Private Context
+
+```text
+PRIVATE_CONTEXT: enabled
+```
+
+Behavior:
+
+* The agent may read and process files in the private layer.
+* Requires the repository to be unlocked.
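
The two contexts could also be enforced by the tooling that hands files to the agent instead of relying on the declaration alone. The sketch below assumes the context value is passed as an argument; `list_agent_files` is a hypothetical helper, and any value other than `enabled` is treated as public.

```bash
# List the Markdown files an agent may read under the given context.
# Arguments: root directory, context ("enabled" or anything else).
list_agent_files() (
  cd "$1" || exit 1
  if [ "$2" = "enabled" ]; then
    find . -type f -name '*.md' | sort
  else
    # Default to the public view: skip everything under private/.
    find . -type f -name '*.md' ! -path '*/private/*' | sort
  fi
)
```

Defaulting to the public view means a mistyped context value hides private files instead of exposing them.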