docs: Add comprehensive README for Knowledge Genome System
parent 4c6f8259af
commit 11b1245e98
1 changed file with 200 additions and 1 deletion

README.md
@@ -1,2 +1,201 @@
# knowledge-genome-orchestrator

# Knowledge Genome System

> A distributed, modular, and secure personal knowledge base architecture.

The **Knowledge Genome System** is a framework for managing personal knowledge using a "Master-Genome" architecture. It follows the LLM-Wiki pattern (Karpathy-style) and adds a security layer for sensitive data and automated quality control.

---

# Architecture

This project is structured as a **Master Orchestrator** that manages multiple independent **Genomes** via Git submodules.

## Core Components

### Master Repository

Contains:

* Orchestration scripts
* Global configuration (`config.env`)
* Security templates

### Genomes

Individual specialized repositories (e.g. `genome-dev`, `genome-finance`) that act as standalone units of knowledge.
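
Because genomes are attached as Git submodules, the master repository's `.gitmodules` file would carry one entry per genome. The fragment below is only an illustrative sketch: the `genomes/` path prefix is an assumption, and the URLs are placeholders rather than real locations.

```ini
# Hypothetical layout: each genome is mounted under genomes/ (assumed path).
[submodule "genome-dev"]
	path = genomes/genome-dev
	url = <genome-dev-repo-url>
[submodule "genome-finance"]
	path = genomes/genome-finance
	url = <genome-finance-repo-url>
```

Keeping each genome in its own repository lets it be cloned, shared, or unlocked independently of the others.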

### Security Layers

#### Physical Security

`git-crypt` encrypts `private/` directories at rest.

#### Logical Security

YAML frontmatter (`private: true`) flags sensitive files so that AI agents do not leak them during public sessions.

#### Validation Layer

A custom linting engine ensures metadata consistency.

---

# Quick Start

## Prerequisites

Required dependencies:

* `git`
* `git-crypt`
* `curl`
* `jq`

Optional:

* `bw` (Bitwarden CLI), used for runtime key injection
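
The dependency lists above could be verified with a small shell helper before any setup work starts. This is a minimal sketch, not the project's actual check; the function names `missing_deps` and `check_deps` are hypothetical.

```bash
# Print every command from the argument list that is not found on PATH.
missing_deps() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Fail if a required tool is missing; only warn if the optional one is.
check_deps() {
  local missing
  missing=$(missing_deps git git-crypt curl jq)
  if [ -n "$missing" ]; then
    printf 'Missing required dependencies:\n%s\n' "$missing" >&2
    return 1
  fi
  # bw is optional, so its absence only produces a note.
  if [ -n "$(missing_deps bw)" ]; then
    echo 'Note: bw not found; runtime key injection is unavailable.' >&2
  fi
  return 0
}
```

Calling `check_deps` at the top of a setup script stops early with a list of what to install instead of failing halfway through.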

---

## Initialization

```bash
# 1. Clone the master repository into a known directory name
git clone <master-repo-url> master-knowledge-genome && cd master-knowledge-genome

# 2. Run the full setup
#    (checks dependencies, creates master scaffold,
#    initializes genomes)
make setup
```

# Management Commands

The system is controlled through a centralized Makefile.

| Command           | Description                                                        |
| ----------------- | ------------------------------------------------------------------ |
| `make setup`      | Full system initialization (Master + Registry Genomes).            |
| `make add-genome` | Scaffolds and registers a new genome (requires `NAME` and `DESC`). |
| `make lint`       | Runs the validation suite across all genomes.                      |
| `make status`     | Checks Git status and encryption state for all submodules.         |

# Validation & Linting (`make lint`)

The built-in linter ensures that the knowledge base remains machine-readable and secure.

It automatically validates the following.

## Frontmatter Integrity

Every `.md` file must contain a valid YAML frontmatter block.
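
A structural version of this check can be sketched in shell. The sketch below only verifies that a file opens with `---`, closes the block with `---`, and has at least one `key:` line in between; it does not parse YAML fully. The function names are hypothetical and not taken from the project's linter.

```bash
# Succeed if FILE opens with a '---' line, has a later closing '---' line,
# and contains at least one "key:" entry in between.
has_frontmatter() {
  awk '
    NR == 1 { if ($0 != "---") exit 1; next }   # block must open on line 1
    $0 == "---" { closed = 1; exit }            # closing delimiter found
    /^[A-Za-z_][A-Za-z0-9_]*:/ { keys++ }       # count top-level keys
    END { exit !(closed && keys > 0) }
  ' "$1"
}

# Report every Markdown file under DIR that fails the check.
lint_frontmatter() {
  find "$1" -type f -name '*.md' | sort | (
    status=0
    while IFS= read -r file; do
      has_frontmatter "$file" || { printf 'missing frontmatter: %s\n' "$file"; status=1; }
    done
    exit "$status"
  )
}
```

`lint_frontmatter <genome-dir>` exits non-zero when any file is reported, which makes it usable as a step in a `make` target.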

## Domain Consistency

Ensures that a file's `domain` metadata matches its parent genome.
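
One way to picture this rule is a check that compares each file's `domain` value with the name of the genome directory that contains it. The sketch below assumes that the genome's directory name is its domain name; `frontmatter_value` and `lint_domain` are hypothetical helpers, not the project's API.

```bash
# Print the value of KEY from the frontmatter block of FILE (empty if absent).
frontmatter_value() {
  awk -v key="$2" '
    NR == 1 && $0 == "---" { in_fm = 1; next }
    in_fm && $0 == "---" { exit }
    in_fm && index($0, key ":") == 1 {
      value = substr($0, length(key) + 2)
      sub(/[ \t]+#.*$/, "", value)            # drop a trailing comment
      gsub(/^[ \t"]+|[ \t"]+$/, "", value)    # trim spaces and quotes
      print value
      exit
    }
  ' "$1"
}

# Report files under GENOME_DIR whose domain differs from the directory name.
lint_domain() {
  local genome
  genome=$(basename "$1")
  find "$1" -type f -name '*.md' | sort | (
    status=0
    while IFS= read -r file; do
      domain=$(frontmatter_value "$file" domain)
      if [ "$domain" != "$genome" ]; then
        printf 'domain mismatch: %s (found "%s", expected "%s")\n' "$file" "$domain" "$genome"
        status=1
      fi
    done
    exit "$status"
  )
}
```

A file copied from one genome into another without updating its metadata would be reported by `lint_domain`.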

## Privacy Leak Detection

This is the critical validation step. It verifies that every file located in a `private/` directory contains the flag:

```yaml
private: true
```

This prevents accidental exposure during AI sessions.
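
The rule can be sketched as a scan over every Markdown file below a `private/` directory. The sketch assumes the repository is unlocked so that the frontmatter is readable; `is_marked_private` and `lint_privacy` are hypothetical names.

```bash
# Succeed if the frontmatter of FILE contains the line "private: true".
is_marked_private() {
  awk '
    NR == 1 && $0 == "---" { in_fm = 1; next }
    in_fm && $0 == "---" { exit }
    in_fm && /^private:[ \t]*true[ \t]*$/ { found = 1; exit }
    END { exit !found }
  ' "$1"
}

# Report every Markdown file under a private/ directory that lacks the flag.
# Paths are printed relative to GENOME_DIR.
lint_privacy() (
  cd "$1" || exit 2
  find . -type f -path '*/private/*' -name '*.md' | sort | (
    status=0
    while IFS= read -r file; do
      is_marked_private "$file" || { printf 'privacy leak: %s\n' "$file"; status=1; }
    done
    exit "$status"
  )
)
```

Files outside `private/` are not inspected here, since the rule only concerns files that are stored in the private layer.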

## Broken Wiki-Links

Detects dead `[[internal-links]]`, i.e. links whose target page does not exist.
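
A dead-link check could look like the sketch below. It assumes that `[[name]]` resolves to a file called `name.md` anywhere in the tree, and that `[[name|alias]]` and `[[name#section]]` point at the same page; the real resolution rules may differ. The function names are hypothetical.

```bash
# Print the target page name of every [[wiki-link]] in FILE, one per line.
wiki_links() {
  grep -o '\[\[[^]]*]]' "$1" |
    sed -e 's/^\[\[//' -e 's/]]$//' -e 's/[|#].*$//'
}

# Print "file: [[target]]" for every link under DIR without a matching page.
broken_links() (
  cd "$1" || exit 2
  find . -type f -name '*.md' | sort | while IFS= read -r file; do
    wiki_links "$file" | sort -u | while IFS= read -r target; do
      # A link resolves if any file named <target>.md exists in the tree.
      [ -n "$(find . -type f -name "$target.md")" ] ||
        printf '%s: [[%s]]\n' "$file" "$target"
    done
  done
)

# Fail if broken_links reports anything.
lint_links() {
  local report
  report=$(broken_links "$1")
  [ -z "$report" ] && return 0
  printf '%s\n' "$report"
  return 1
}
```

Each broken link is reported once per file, even if the file repeats it.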

# Security Model

## Hybrid Privacy Architecture

Each genome is divided into two layers.

### Public Layer

Directories:

```text
raw/public/
wiki/public/
```

Characteristics:

* Plaintext
* Shareable with collaborators

### Private Layer

Directories:

```text
raw/private/
wiki/private/
```

Characteristics:

* Encrypted using AES-256 via `git-crypt`
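
`git-crypt` decides which files to encrypt from filter rules in `.gitattributes`. A genome's `.gitattributes` would therefore need rules along the following lines; the exact patterns are an assumption based on the directory names above, not a copy of the project's security templates.

```text
# Assumed patterns: encrypt everything under the two private directories.
raw/private/** filter=git-crypt diff=git-crypt
wiki/private/** filter=git-crypt diff=git-crypt
```

The `.gitattributes` file itself has to stay unencrypted, so the patterns must not match it.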

## Runtime Key Injection

To keep the AI environment secure, encryption keys are never stored on the VM disk.

Instead, the key is fetched from Bitwarden (`bw`) or Vaultwarden and injected at runtime.

### Example

```bash
# Unlock a genome using a key stored in Vaultwarden.
# The key is expected to be saved base64-encoded in the note "genome-dev key".
git-crypt unlock <(
  bw get notes "genome-dev key" \
    --session "$BW_SESSION" | base64 -d
)
```

# Genome Schema

All wiki documents follow a strict schema to support AI ingestion.

## YAML Frontmatter Schema

```yaml
---
title: "Document Title"
type: entity            # one of: entity | concept | source | log
domain: genome-name     # name of the parent genome
private: false          # true or false
last_updated: YYYY-MM-DD
---
```
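
The schema lends itself to a field-by-field check. The following is a minimal sketch under the assumption that all five fields are mandatory, which the schema does not state explicitly; `frontmatter_value` and `validate_schema` are hypothetical helpers.

```bash
# Print the value of KEY from the frontmatter block of FILE (empty if absent).
frontmatter_value() {
  awk -v key="$2" '
    NR == 1 && $0 == "---" { in_fm = 1; next }
    in_fm && $0 == "---" { exit }
    in_fm && index($0, key ":") == 1 {
      value = substr($0, length(key) + 2)
      sub(/[ \t]+#.*$/, "", value)            # drop a trailing comment
      gsub(/^[ \t"]+|[ \t"]+$/, "", value)    # trim spaces and quotes
      print value
      exit
    }
  ' "$1"
}

# Print one message per schema violation in FILE; fail if there is any.
validate_schema() {
  local file status key
  file=$1
  status=0
  for key in title domain; do
    [ -n "$(frontmatter_value "$file" "$key")" ] || { echo "$file: missing $key"; status=1; }
  done
  case $(frontmatter_value "$file" type) in
    entity|concept|source|log) ;;
    *) echo "$file: type must be entity, concept, source, or log"; status=1 ;;
  esac
  case $(frontmatter_value "$file" private) in
    true|false) ;;
    *) echo "$file: private must be true or false"; status=1 ;;
  esac
  # Shape check only: this does not reject impossible dates such as 2024-19-39.
  case $(frontmatter_value "$file" last_updated) in
    [0-9][0-9][0-9][0-9]-[0-1][0-9]-[0-3][0-9]) ;;
    *) echo "$file: last_updated must be YYYY-MM-DD"; status=1 ;;
  esac
  return "$status"
}
```

Reporting every violation in one pass, rather than stopping at the first, keeps the fix-and-rerun loop short when many documents are imported at once.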

# Agent Interaction

When starting a session with an AI agent, always declare the privacy context.

## Public Context

```text
PRIVATE_CONTEXT: disabled
```

Behavior:

* The agent ignores all private folders.

## Private Context

```text
PRIVATE_CONTEXT: enabled
```

Behavior:

* The agent may process data from the private (encrypted) folders.
* Requires the repository to be unlocked.
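
The two contexts can be pictured as a filter over the files an agent is allowed to read. The sketch below is an assumption about how such a filter might work, not the system's actual mechanism: with the context disabled it skips `private/` folders and also drops any remaining file flagged `private: true`. The function name `agent_visible_files` is hypothetical.

```bash
# List the Markdown files under GENOME_DIR that an agent may read.
# CONTEXT is the PRIVATE_CONTEXT value: "enabled" or "disabled".
agent_visible_files() (
  cd "$1" || exit 2
  case $2 in
    enabled)
      find . -type f -name '*.md' | sort ;;
    disabled)
      # Skip private/ folders, then drop any file flagged "private: true".
      # The flag is matched anywhere in the file, which errs on the side of hiding.
      find . -type f -name '*.md' ! -path '*/private/*' | sort |
        while IFS= read -r file; do
          grep -q '^private:[[:space:]]*true' "$file" || printf '%s\n' "$file"
        done ;;
    *)
      echo 'PRIVATE_CONTEXT must be "enabled" or "disabled"' >&2
      exit 2 ;;
  esac
)
```

Rejecting any other context value avoids silently falling back to the permissive mode after a typo.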