docs: Add comprehensive README for Knowledge Genome System
parent 4c6f8259af
commit 11b1245e98
1 changed file with 200 additions and 1 deletion
README.md
@@ -1,2 +1,201 @@

# knowledge-genome-orchestrator
# Knowledge Genome System

> A distributed, modular, and secure personal knowledge base architecture.

The **Knowledge Genome System** is a framework designed to manage personal knowledge using a "Master-Genome" architecture. It follows the LLM-Wiki patterns (Karpathy-style) while adding a robust security layer for sensitive data and automated quality control.

---

# Architecture

This project is structured as a **Master Orchestrator** that manages multiple independent **Genomes** via Git submodules.

## Core Components

### Master Repository

Contains:

* Orchestration scripts
* Global configuration (`config.env`)
* Security templates

### Genomes

Individual specialized repositories (e.g. `genome-dev`, `genome-finance`) that act as standalone units of knowledge.
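
Because each genome is a Git submodule, the master repository's `.gitmodules` file already acts as a registry of genomes. The sketch below reads that file to list them. The function name `list_genomes` and the `genomes/` directory used in the usage line are hypothetical; the README does not say where submodules are checked out.

```bash
# Print the checkout path of every submodule declared in a .gitmodules file.
# Usage: list_genomes [path-to-.gitmodules]   (defaults to ./.gitmodules)
# Hypothetical helper, not part of the project's scripts.
list_genomes() {
  sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*//p' "${1:-.gitmodules}"
}
```

Run from the master repository root, `list_genomes` would print one line per genome, for example `genomes/genome-dev` if that is where the submodule lives.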

### Security Layers

#### Physical Security

`git-crypt` encrypts `private/` directories at rest.

#### Logical Security

YAML frontmatter (`private: true`) tells AI agents which documents to exclude from public sessions, so that sensitive data is not leaked.

#### Validation Layer

A custom linting engine ensures metadata consistency.

---

# Quick Start

## Prerequisites

Required dependencies:

* `git`
* `git-crypt`
* `curl`
* `jq`

Optional:

* `bw` (Bitwarden CLI), used for runtime key injection

---

## Initialization

```bash
# 1. Clone the master repository
git clone <master-repo-url> && cd master-knowledge-genome

# 2. Run the full setup
#    (checks dependencies, creates master scaffold,
#    initializes genomes)
make setup
```

# Management Commands

The system is controlled through a centralized Makefile.

| Command           | Description                                                        |
| ----------------- | ------------------------------------------------------------------ |
| `make setup`      | Full system initialization (Master + Registry Genomes).            |
| `make add-genome` | Scaffolds and registers a new genome (requires `NAME` and `DESC`). |
| `make lint`       | Runs the validation suite across all genomes.                      |
| `make status`     | Checks Git status and encryption state for all submodules.         |

# Validation & Linting (`make lint`)

The built-in linter ensures that the knowledge base remains machine-readable and secure.

It automatically validates:

## Frontmatter Integrity

Every `.md` file must contain a valid YAML frontmatter block.
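
As a rough illustration of this check, the sketch below tests only the delimiters: the file must open with a `---` line and contain a later `---` line that closes the block. The function name `has_frontmatter` is hypothetical, and the project's actual linter is not shown in this README; a real check would also parse the YAML between the delimiters.

```bash
# Succeed if FILE opens with a '---' line and has a later '---' line that
# closes the header. Delimiters only; the YAML itself is not parsed.
# Hypothetical helper, not the project's linter.
has_frontmatter() {
  [ "$(head -n 1 "$1")" = "---" ] &&
    tail -n +2 "$1" | grep -q '^---[[:space:]]*$'
}
```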

## Domain Consistency

Ensures that a file's `domain` metadata matches its parent genome.
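
One way to picture this rule: read the `domain` key from a file's frontmatter and compare it with the name of the genome directory. The sketch below assumes the genome's directory name is its domain name (as in `genome-dev`) and that frontmatter keys are simple top-level `key: value` lines. Both function names are hypothetical.

```bash
# Print the value of a top-level KEY from the frontmatter of FILE.
# Hypothetical helper; handles plain "key: value" lines only.
frontmatter_value() {
  awk -v key="$2" '
    NR == 1 && $0 == "---" { inside = 1; next }   # opening delimiter
    inside && $0 == "---"  { exit }               # closing delimiter
    inside && index($0, key ":") == 1 {
      value = substr($0, length(key) + 2)
      gsub(/^[ \t]+|[ \t]+$/, "", value)          # trim whitespace
      gsub(/^"|"$/, "", value)                    # drop surrounding quotes
      print value
      exit
    }
  ' "$1"
}

# Succeed if FILE's domain equals the directory name of GENOME_DIR.
domain_matches() {
  [ "$(frontmatter_value "$1" domain)" = "$(basename "$2")" ]
}
```

For example, `domain_matches genome-dev/wiki/public/page.md genome-dev` succeeds only when the page declares `domain: genome-dev`.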

## Privacy Leak Detection

This is the critical validation step. It verifies that every file located in a `private/` directory contains the flag:

```yaml
private: true
```

A private file without this flag would look public to an AI agent, so this check prevents accidental exposure during AI sessions.
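
The check described above could be sketched as follows: walk every Markdown file under a `private/` directory and report those whose frontmatter does not set `private: true`. The function name `find_unflagged_private` is hypothetical, and this is an illustration of the rule rather than the project's linter.

```bash
# Print every Markdown file under a private/ directory of ROOT whose
# frontmatter lacks a "private: true" line. Exit status is 1 if any is found.
# Sketch only: assumes file names contain no newlines.
find_unflagged_private() (
  find "$1" -type f -path '*/private/*' -name '*.md' | sort | {
    status=0
    while IFS= read -r file; do
      # Keep only the lines between the first two '---' delimiter lines.
      header=$(awk '/^---[ \t]*$/ { n++; next } n == 1' "$file")
      if ! printf '%s\n' "$header" |
           grep -Eq '^private:[[:space:]]*true[[:space:]]*$'; then
        printf '%s\n' "$file"
        status=1
      fi
    done
    exit "$status"
  }
)
```

Only the frontmatter is inspected, so a stray `private: true` in the body of a document does not mask a missing flag.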

## Broken Wiki-Links

Detects `[[internal-links]]` that point to pages that do not exist.
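
A minimal sketch of such a check, under an assumption the README does not state: a link `[[target]]` resolves to a file named `target.md` somewhere under the genome root. The sketch also treats `|` and `#` inside a link as the start of an alias or anchor, which is common wiki-link syntax but not confirmed here. The function name `find_broken_links` is hypothetical, and `grep -r`, `-o` and `--include` require GNU or BSD grep.

```bash
# Print each [[target]] found in Markdown files under ROOT that has no
# matching <target>.md file anywhere under ROOT.
# Sketch only: targets containing glob characters are not handled.
find_broken_links() (
  root=$1
  grep -rhoE --include='*.md' '\[\[[^]|#]+' "$root" |
    sed 's/^\[\[//' | sort -u |
    while IFS= read -r target; do
      if [ -z "$(find "$root" -type f -name "$target.md")" ]; then
        printf '%s\n' "$target"
      fi
    done
)
```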

# Security Model

## Hybrid Privacy Architecture

Each genome is divided into two layers.

### Public Layer

Directories:

```text
raw/public/
wiki/public/
```

Characteristics:

* Plaintext
* Shareable with collaborators

### Private Layer

Directories:

```text
raw/private/
wiki/private/
```

Characteristics:

* Encrypted using AES-256 via `git-crypt`
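
`git-crypt` decides what to encrypt from filter rules in `.gitattributes`. The README does not show the project's own rules, so the sketch below is an assumption about how the two private directories could be covered, using the `filter=git-crypt diff=git-crypt` attribute pair that git-crypt documents. The function name `write_crypt_rules` is hypothetical.

```bash
# Append git-crypt filter rules for both private directories to the
# .gitattributes file of the genome at GENOME_DIR.
# Hypothetical helper; the project's security templates may differ.
write_crypt_rules() {
  cat >> "$1/.gitattributes" <<'EOF'
raw/private/** filter=git-crypt diff=git-crypt
wiki/private/** filter=git-crypt diff=git-crypt
EOF
}
```

After rules like these are committed, `git-crypt status` lists which tracked files are set to be encrypted.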

## Runtime Key Injection

To keep the AI environment secure, encryption keys are never stored on the VM disk.

Instead, the key is fetched from Bitwarden / Vaultwarden with the `bw` CLI and injected at runtime.

### Example

```bash
# Unlock a genome using a key stored in Vaultwarden.
# The key is kept base64-encoded in a secure note and decoded on the fly.
# Process substitution <( ... ) requires bash, so the key never touches disk.
git-crypt unlock <(
  bw get notes "genome-dev key" \
    --session "$BW_SESSION" | base64 -d
)
```

# Genome Schema

All wiki documents follow a strict schema to support AI ingestion.

## YAML Frontmatter Schema

```yaml
---
title: "Document Title"
type: entity          # one of: entity | concept | source | log
domain: genome-name   # must match the parent genome
private: false        # true or false
last_updated: YYYY-MM-DD
---
```
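
Two of these fields have a closed set of legal values or a fixed shape, which makes them easy to check mechanically. The sketch below shows one way to do that; the function names `valid_type` and `valid_date` are hypothetical and are not taken from the project's linter.

```bash
# Succeed if TYPE is one of the four document types listed in the schema.
valid_type() {
  case "$1" in
    entity|concept|source|log) return 0 ;;
    *) return 1 ;;
  esac
}

# Succeed if DATE has the shape YYYY-MM-DD.
# Shape only: 2024-13-40 passes, since calendar validity is not checked.
valid_date() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{4}-[0-9]{2}-[0-9]{2}$'
}
```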

# Agent Interaction

When starting a session with an AI agent, always declare the privacy context.

## Public Context

```text
PRIVATE_CONTEXT: disabled
```

Behavior:

* The agent ignores all private folders.

## Private Context

```text
PRIVATE_CONTEXT: enabled
```

Behavior:

* The agent may read the private folders in addition to the public ones.
* Requires the repository to be unlocked first, since the private files are otherwise ciphertext.
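
The two contexts amount to a path filter over a genome's files. The sketch below illustrates that idea by listing the Markdown files an agent would be allowed to read in each context. The function name `agent_visible_files` is hypothetical, and how the project's agents actually enforce the context is not described in this README.

```bash
# List the Markdown files an agent may read under ROOT.
# CONTEXT is "enabled" or "disabled", mirroring PRIVATE_CONTEXT.
# Hypothetical helper: any other value is rejected rather than guessed.
agent_visible_files() {
  case "$2" in
    enabled)  find "$1" -type f -name '*.md' | sort ;;
    disabled) find "$1" -type f -name '*.md' ! -path '*/private/*' | sort ;;
    *) echo "unknown PRIVATE_CONTEXT value: $2" >&2; return 2 ;;
  esac
}
```

The filter is path-based only. In the enabled context the listed private files are readable only if the repository has been unlocked.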