Knowledge Genome System

A distributed, modular, and secure personal knowledge base architecture.

The Knowledge Genome System is a framework designed to manage personal knowledge using a "Master-Genome" architecture. It follows the LLM-Wiki patterns (Karpathy-style) while adding a robust security layer for sensitive data and automated quality control.


Architecture

This project is structured as a Master Orchestrator that manages multiple independent Genomes via Git Submodules.
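
To illustrate the submodule layout, the sketch below lists the genome paths registered in the master repository. It assumes each genome is recorded as a standard entry in .gitmodules; the function name list_genomes is hypothetical and is not taken from the project's scripts.

```shell
# Hypothetical helper: print the path of every submodule (genome)
# registered in a .gitmodules file (default: ./.gitmodules).
list_genomes() {
  local gitmodules="${1:-.gitmodules}"
  # Each match is printed as "submodule.<name>.path <path>"; keep the path.
  git config --file "$gitmodules" --get-regexp '^submodule\..*\.path$' |
    cut -d' ' -f2-
}
```

After a fresh clone, `git submodule update --init --recursive` is the standard Git command for fetching the registered submodules; whether `make setup` wraps it is not stated here.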

Core Components

Master Repository

Contains:

  • Orchestration scripts
  • Global configuration (config.env)
  • Security templates

Genomes

Individual specialized repositories (e.g. genome-dev, genome-finance) that act as standalone units of knowledge.

Security Layers

Physical Security

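git-crypt encrypts files under private/ directories in the repository: they are stored as ciphertext in Git history and on remotes, and appear in plaintext only in an unlocked working copy.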

Logical Security

YAML frontmatter (private: true) marks a document as sensitive so that AI agents exclude it from public sessions.

Validation Layer

A custom linting engine ensures metadata consistency.


Quick Start

Prerequisites

Required dependencies:

  • git
  • git-crypt
  • curl
  • jq

Optional:

  • bw (Bitwarden CLI) — used for runtime key injection

Initialization

# 1. Clone the master repository
git clone <master-repo-url> && cd master-knowledge-genome

# 2. Run the full setup
#    (checks dependencies, creates master scaffold,
#    initializes genomes)
make setup

Management Commands

The system is controlled through a centralized Makefile.

Command            Description
make setup         Full system initialization (Master + Registry Genomes).
make add-genome    Scaffolds and registers a new genome (requires NAME and DESC).
make lint          Runs the validation suite across all genomes.
make status        Checks Git status and encryption state for all submodules.
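
Since make add-genome requires NAME and DESC, a thin wrapper can catch a missing value before make runs. This is a sketch only: the function add_genome, the MAKE_CMD override, and the genome name genome-health are hypothetical and not part of the project.

```shell
# Hypothetical wrapper around `make add-genome`; refuses to run unless both
# required variables are provided. MAKE_CMD can be overridden for a dry run.
add_genome() {
  local name="$1"
  local desc="$2"
  if [ -z "$name" ] || [ -z "$desc" ]; then
    echo "usage: add_genome NAME DESC" >&2
    return 2
  fi
  # Variables given on the make command line override Makefile defaults.
  "${MAKE_CMD:-make}" add-genome "NAME=$name" "DESC=$desc"
}

# Example (hypothetical genome name):
#   add_genome genome-health "Notes on health and fitness"
```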

Validation & Linting (make lint)

The built-in linter ensures that the knowledge base remains machine-readable and secure.

It automatically validates:

Frontmatter Integrity

Every .md file must contain valid YAML headers.

Domain Consistency

Ensures that a file's domain metadata matches its parent genome.
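
A minimal sketch of how such a domain check could work, assuming the frontmatter uses simple one-line key: value pairs and that the genome name is known to the caller. The helper names frontmatter_value and check_domain are hypothetical; the project's own linter in lib may work differently.

```shell
# Hypothetical helper: print the value of one top-level frontmatter key.
# Assumes the file starts with a '---' line and uses 'key: value' pairs.
frontmatter_value() {
  local file="$1"
  local key="$2"
  awk -v key="$key" '
    NR == 1 && $0 != "---" { exit }   # no frontmatter block at all
    NR > 1 && $0 == "---"  { exit }   # closing delimiter reached
    NR > 1 {
      prefix = key ":"
      if (index($0, prefix) == 1) {
        value = substr($0, length(prefix) + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", value)   # trim surrounding whitespace
        gsub(/^"|"$/, "", value)             # drop optional double quotes
        print value
        exit
      }
    }
  ' "$file"
}

# Succeeds when the file's domain equals the expected genome name.
check_domain() {
  local file="$1"
  local genome="$2"
  [ "$(frontmatter_value "$file" domain)" = "$genome" ]
}
```

For example, `check_domain wiki/public/note.md genome-dev` would succeed only if the note declares domain: genome-dev.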

Privacy Leak Detection

This is the most critical check. It verifies that every file located in a /private/ directory contains the flag:

private: true

This prevents accidental exposure during AI sessions.
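
The check can be sketched as follows, assuming the flag appears as a line of its own inside a frontmatter block delimited by --- lines. The function names are hypothetical and the real linter may apply stricter rules.

```shell
# Hypothetical helper: succeed if the frontmatter of a file contains
# the line "private: true".
has_private_flag() {
  awk '
    NR == 1 && $0 != "---" { exit }   # frontmatter missing
    NR > 1 && $0 == "---"  { exit }   # block ended without the flag
    NR > 1 && /^private:[ \t]*true[ \t]*$/ { found = 1; exit }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Print every Markdown file under a private/ directory that lacks the flag.
# An empty result means no leak was detected.
find_privacy_leaks() {
  local root="${1:-.}"
  find "$root" -type f -path '*/private/*' -name '*.md' | sort |
    while IFS= read -r file; do
      has_private_flag "$file" || printf '%s\n' "$file"
    done
}
```

A caller could treat any output as a failure, for example `[ -z "$(find_privacy_leaks .)" ]`.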

Dead Link Detection

Detects dead [[internal-links]], that is, wiki links whose target document does not exist.
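
One way to detect dead links is sketched below. It assumes that [[target]] resolves to a file named target.md anywhere under the wiki directory and that any text after a | inside the brackets is a display alias; both are assumptions, and find_dead_links is a hypothetical name.

```shell
# Hypothetical check: report [[links]] whose target has no matching
# <target>.md file anywhere under the given wiki directory.
find_dead_links() {
  local wiki="${1:-.}"
  find "$wiki" -type f -name '*.md' | sort |
    while IFS= read -r file; do
      # grep -o prints each "[[target" prefix on its own line;
      # sed then strips the leading brackets.
      grep -o '\[\[[^]|]*' "$file" | sed 's/^\[\[//' |
        while IFS= read -r target; do
          [ -n "$target" ] || continue
          if [ -z "$(find "$wiki" -type f -name "$target.md" | head -n 1)" ]; then
            printf '%s: [[%s]]\n' "$file" "$target"
          fi
        done
    done
}
```

Targets that contain a path separator or glob characters are not handled by this sketch.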

Security Model

Hybrid Privacy Architecture

Each genome is divided into two layers.

Public Layer

Directories:

raw/public/
wiki/public/

Characteristics:

  • Plaintext
  • Shareable with collaborators

Private Layer

Directories:

raw/private/
wiki/private/

Characteristics:

  • Encrypted using AES-256 via git-crypt
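
git-crypt selects the files to encrypt through .gitattributes. The sketch below writes one such file for the private directories listed above; it assumes the file lives at the root of each genome repository, and write_gitattributes is a hypothetical helper name.

```shell
# Hypothetical helper: write a .gitattributes file that routes the private
# directories through git-crypt. The last line keeps .gitattributes itself
# unencrypted, as the git-crypt documentation recommends.
write_gitattributes() {
  local repo="${1:-.}"
  cat > "$repo/.gitattributes" <<'EOF'
raw/private/**  filter=git-crypt diff=git-crypt
wiki/private/** filter=git-crypt diff=git-crypt
.gitattributes  !filter !diff
EOF
}
```

Once the file is committed, `git-crypt status` shows which tracked files are marked for encryption.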

Runtime Key Injection

To keep the AI environment secure, encryption keys are never stored on the VM disk.

Instead, the system uses Bitwarden (bw) / Vaultwarden for runtime injection.

Example

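# Unlock a genome using a key stored in Vaultwarden.
# Assumes the key was exported with `git-crypt export-key` and saved
# base64-encoded in the secure note "genome-dev key". Requires bash
# (process substitution) and an unlocked vault session in BW_SESSION.
git-crypt unlock <(
  bw get notes "genome-dev key" \
    --session "$BW_SESSION" | base64 -d
)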

Genome Schema

All wiki documents follow a strict schema to support AI ingestion.

YAML Frontmatter Schema

---
title: "Document Title"
type: entity             # one of: entity | concept | source | log
domain: genome-name      # must match the parent genome
private: false           # true or false; must be true under private/
last_updated: YYYY-MM-DD
---
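
A validator for this schema might look like the sketch below. It assumes one-line key: value pairs and checks only what the schema states: the five keys, the allowed type values, the boolean flag, and the date shape. The function name validate_frontmatter and its messages are hypothetical.

```shell
# Hypothetical validator: prints one message per problem and returns
# non-zero if any problem was found.
validate_frontmatter() {
  awk '
    NR == 1 {
      if ($0 != "---") { print "missing opening ---"; bad = 1; exit }
      next
    }
    $0 == "---" { closed = 1; exit }
    {
      colon = index($0, ":")
      if (colon > 1) {
        value = substr($0, colon + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", value)
        fields[substr($0, 1, colon - 1)] = value
      }
    }
    END {
      if (NR == 0) { print "empty file"; exit 1 }
      if (bad) exit 1
      if (!closed) { print "missing closing ---"; exit 1 }
      n = split("title type domain private last_updated", required, " ")
      for (i = 1; i <= n; i++)
        if (!(required[i] in fields)) { print "missing key: " required[i]; bad = 1 }
      if (("type" in fields) && fields["type"] !~ /^(entity|concept|source|log)$/) {
        print "invalid type: " fields["type"]; bad = 1
      }
      if (("private" in fields) && fields["private"] !~ /^(true|false)$/) {
        print "invalid private flag: " fields["private"]; bad = 1
      }
      if (("last_updated" in fields) && fields["last_updated"] !~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/) {
        print "invalid last_updated: " fields["last_updated"]; bad = 1
      }
      exit (bad ? 1 : 0)
    }
  ' "$1"
}
```

This is not a YAML parser: nested values, multi-line strings, and trailing comments after a value would need a real YAML library.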

Agent Interaction

When starting a session with an AI agent, always declare the privacy context.

Public Context

PRIVATE_CONTEXT: disabled

Behavior:

  • The agent ignores all private folders.

Private Context

PRIVATE_CONTEXT: enabled

Behavior:

  • The agent may process files in private folders.
  • Requires the repository to be unlocked with git-crypt, since locked files are ciphertext.
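
The two contexts can be sketched as a file filter. Treating PRIVATE_CONTEXT as an environment variable is an assumption made for illustration (the declaration may instead be part of the agent prompt), and agent_visible_files is a hypothetical helper.

```shell
# Hypothetical helper: list the Markdown files an agent may read.
# With PRIVATE_CONTEXT=enabled every file is listed; otherwise anything
# under a private/ directory is skipped. Defaults to the public context.
agent_visible_files() {
  local root="${1:-.}"
  if [ "${PRIVATE_CONTEXT:-disabled}" = "enabled" ]; then
    find "$root" -type f -name '*.md' | sort
  else
    find "$root" -type f -name '*.md' ! -path '*/private/*' | sort
  fi
}
```

Defaulting to the public context means that a forgotten declaration hides private files rather than exposing them.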