Architecture

This page explains how Kundun-Agent is put together: the layered module structure, the single composition root that wires everything up, the SQLite storage model, and the on-disk layout under .kundun/. It is intentionally conceptual — for code-level recipes (“how do I add a new language extractor?”, “how do I add a CLI command?”) see ../../CLAUDE.md.

Kundun-Agent is local-first: it indexes a codebase, stores persistent memory, tracks tasks, runs cleanup, and serves context — all locally, backed by a single SQLite database. No project content is sent to external APIs by default.

This page covers MVP 1 (the local core). The MCP server, diagnostics, daemon, sessions, health/metrics, local HTTP API, and desktop app are not part of MVP 1 and are not described here.

The codebase is organized as a strict dependency stack. Each layer may depend only on the layers below it; nothing reaches back up.

cli        CLI commands and argument parsing (the kundun binary)
languages  per-language regex symbol extractors (php, go, ts, js, csharp, cpp, sql)
core       engines: scanner, indexer, search, memory, task, cleanup, summary
storage    SQLite connection, migrations, repositories, the data model
config     kundun.config.json loading + zod schema + defaults
utils      hashing, time, path-safety, binary detection, logging, errors, json

The flow of control runs top-down. A CLI command builds an application context, asks a core engine to do work, and the engine talks to the database through storage repositories. Cross-cutting helpers (hashing a file, checking a path is inside the project root, formatting a timestamp) live in utils and are used everywhere.

Keeping the dependencies one-directional means each engine can be reasoned about, and tested, in isolation, with the database (or an in-memory database) behind a narrow repository interface.
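The one-directional rule can be written down as a small check. This is an illustrative sketch only: `isAllowedImport` is a hypothetical helper, not something the codebase is known to contain, and it assumes that imports within the same layer are allowed.

```typescript
// Layers from top to bottom, in the order listed in the stack above.
const LAYERS = ["cli", "languages", "core", "storage", "config", "utils"] as const;
type Layer = (typeof LAYERS)[number];

// Hypothetical helper: an import is allowed when the target layer is the
// same layer (assumption) or sits below the importing layer in the stack.
function isAllowedImport(from: Layer, to: Layer): boolean {
  return LAYERS.indexOf(to) >= LAYERS.indexOf(from);
}
```

Under this rule `isAllowedImport("cli", "core")` holds, while `isAllowedImport("storage", "core")` does not, because storage sits below core.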

All wiring happens in one place: src/core/container.ts. CLI commands never construct repositories or engines by hand; they call the container, which returns a fully assembled AppContext.

createAppContext({ projectRoot }) performs a fixed sequence:

  1. Load and resolve kundun.config.json (throws a not_initialized error if the project has no config or no database yet, so the CLI can suggest kundun init).
  2. Open the SQLite database and apply connection PRAGMAs.
  3. Run any pending migrations.
  4. Mirror the authoritative schema version into the project_meta row.
  5. Build the logger.
  6. Construct every repository.

The resulting AppContext bundles the loaded config, the resolved projectRoot and kundunDir paths, the open database handle (kdb), a logger, all repos, and a close() method that releases the database (and checkpoints the WAL).
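The sequence above can be sketched with the individual steps injected as functions. Everything here is hypothetical naming for illustration (`Steps`, `SketchContext`, `createContextSketch`, and the `dbPath` field); the real `createAppContext` also builds the logger and the repositories, which this sketch leaves out. Closing the handle when setup fails is an assumption, not documented behavior.

```typescript
// Hypothetical step functions standing in for the config and storage layers.
interface Steps<Db> {
  loadConfig(projectRoot: string): { dbPath: string }; // throws if not initialized
  openDatabase(dbPath: string): Db;                    // also applies PRAGMAs
  runMigrations(db: Db): number;                       // returns the schema version
  mirrorSchemaVersion(db: Db, version: number): void;
  closeDatabase(db: Db): void;
}

interface SketchContext<Db> {
  projectRoot: string;
  config: { dbPath: string };
  kdb: Db;
  schemaVersion: number;
  close(): void;
}

// Runs the steps in a fixed order; all calls are synchronous.
function createContextSketch<Db>(projectRoot: string, steps: Steps<Db>): SketchContext<Db> {
  const config = steps.loadConfig(projectRoot);     // step 1
  const kdb = steps.openDatabase(config.dbPath);    // step 2
  try {
    const schemaVersion = steps.runMigrations(kdb); // step 3
    steps.mirrorSchemaVersion(kdb, schemaVersion);  // step 4
    // Steps 5 and 6 (logger, repositories) are omitted from this sketch.
    return {
      projectRoot,
      config,
      kdb,
      schemaVersion,
      close: () => steps.closeDatabase(kdb),
    };
  } catch (err) {
    // Assumption: release the handle if setup fails after the database opened.
    steps.closeDatabase(kdb);
    throw err;
  }
}
```

A caller would typically wrap its work in `try { ... } finally { ctx.close(); }` so the database is released even when a command fails.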

On top of the context, thin build* factories assemble each engine on demand:

buildScanner(ctx) -> ProjectScanner
buildIndexer(ctx) -> Indexer
buildSearchProvider(ctx) -> SearchProvider (FTS5 vs LIKE chosen from kdb.hasFts5)
buildMemoryEngine(ctx) -> MemoryEngine
buildTaskEngine(ctx) -> TaskEngine
buildCleanupEngine(ctx) -> CleanupEngine

Two facts shape the wiring:

  • better-sqlite3 is fully synchronous. Nothing in the container or the engines is async — there is no event loop juggling around the database.
  • FTS5 availability is detected once when the database is opened and exposed as kdb.hasFts5. The search provider and the memory engine read that flag to decide between the FTS5 path and the LIKE fallback, so the choice is made in exactly one place.
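To illustrate how a single flag can drive both search paths, here is a minimal sketch of a query builder that switches on `hasFts5`. The function names, the `ChunkQuery` shape, and the `id` and `content` columns on `file_chunks` are assumptions made for the example; only `chunks_fts` and its `chunk_id` column come from this page.

```typescript
type SearchMode = "fts5" | "like";

interface ChunkQuery {
  mode: SearchMode;
  sql: string;
  params: string[];
}

// Quote a user term as one FTS5 string so operators inside it are not parsed.
// FTS5 escapes an embedded double quote by doubling it.
function toFtsString(term: string): string {
  return `"${term.replace(/"/g, '""')}"`;
}

// Escape LIKE wildcards (and the escape character) so the term matches literally.
function escapeLike(term: string): string {
  return term.replace(/[\\%_]/g, (ch) => `\\${ch}`);
}

// Hypothetical builder: picks the FTS5 or LIKE path from the one flag.
function buildChunkQuery(hasFts5: boolean, term: string): ChunkQuery {
  if (hasFts5) {
    return {
      mode: "fts5",
      sql: "SELECT chunk_id FROM chunks_fts WHERE chunks_fts MATCH ?",
      params: [toFtsString(term)],
    };
  }
  return {
    mode: "like",
    // Column names on file_chunks are assumptions for this sketch.
    sql: "SELECT id FROM file_chunks WHERE content LIKE ? ESCAPE '\\'",
    params: [`%${escapeLike(term)}%`],
  };
}
```

Because the flag is read in one function, callers never need to know which mode is active beyond reporting `mode` in their output.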

The database is a single SQLite file (default .kundun/kundun.sqlite). It is opened with these connection PRAGMAs:

| PRAGMA | Value | Why |
| --- | --- | --- |
| journal_mode | WAL | better concurrent read/write behavior |
| foreign_keys | ON | enforce cascades; OFF by default in SQLite, so set per-connection |
| busy_timeout | 5000 | wait up to 5s on a locked database before failing |
| synchronous | NORMAL | durable enough under WAL, faster than FULL |
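As a rough sketch, the PRAGMAs could be applied right after the connection opens. The `PragmaCapable` interface and `applyConnectionPragmas` are hypothetical names; the interface is deliberately narrow, and better-sqlite3's `Database` object offers a `pragma(string)` method that fits it.

```typescript
// Minimal shape of the handle this sketch needs.
interface PragmaCapable {
  pragma(statement: string): unknown;
}

// PRAGMA names and values as listed in the table above.
const CONNECTION_PRAGMAS: ReadonlyArray<readonly [string, string]> = [
  ["journal_mode", "WAL"],
  ["foreign_keys", "ON"],
  ["busy_timeout", "5000"],
  ["synchronous", "NORMAL"],
];

// Hypothetical helper: applies every PRAGMA to a freshly opened connection.
function applyConnectionPragmas(db: PragmaCapable): void {
  for (const [name, value] of CONNECTION_PRAGMAS) {
    db.pragma(`${name} = ${value}`);
  }
}
```

Keeping the list in one constant means the per-connection settings, `foreign_keys` in particular, cannot be forgotten when a second connection is opened in tests.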

The authoritative schema version lives in a _migrations table — one row per applied migration (version, applied_at). Migrations run forward only, each inside its own transaction together with its bookkeeping insert.

After migrations run, the version is mirrored into project_meta.schema_version purely as a human-readable convenience. If the two ever disagree, _migrations is the source of truth.
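A forward-only runner of the kind described here might look like the following sketch. The `MigrationDb` interface, the `Migration` shape, and `runPendingMigrations` are hypothetical; in particular, `transaction` is assumed to run its callback atomically, and the real runner may track applied versions differently.

```typescript
interface Migration {
  version: number;
  up: string; // SQL that moves the schema forward
}

// Hypothetical narrow view of the database used by the runner.
interface MigrationDb {
  appliedVersions(): number[];                             // versions in _migrations
  exec(sql: string): void;
  recordApplied(version: number, appliedAt: string): void; // insert into _migrations
  transaction<T>(fn: () => T): T;                          // runs fn atomically
}

// Applies every migration newer than the current version, in ascending order.
// Each migration and its bookkeeping insert share one transaction.
function runPendingMigrations(
  db: MigrationDb,
  migrations: Migration[],
  now: () => string,
): number {
  const applied = db.appliedVersions();
  let current = applied.length > 0 ? Math.max(...applied) : 0;
  const pending = migrations
    .filter((m) => m.version > current)
    .sort((a, b) => a.version - b.version);
  for (const m of pending) {
    db.transaction(() => {
      db.exec(m.up);
      db.recordApplied(m.version, now());
    });
    current = m.version;
  }
  return current; // the authoritative schema version after the run
}
```

The returned version is what would then be mirrored into project_meta.schema_version.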

| Table | Holds |
| --- | --- |
| project_meta | one row of project identity: root, name, timestamps, last scan, mirrored schema version |
| files | one row per tracked file: path, relative path, language, size, sha256 hash, is_deleted flag, importance score |
| file_chunks | line-range chunks of indexed file content (start/end line, content, per-chunk hash); cascades from files |
| symbols | extracted symbols (name, kind, language, line range, signature); cascades from files |
| memories | persistent project memory: type, title, content, tags, confidence, importance, timestamps, expires_at, archived_at |
| tasks | work items: title, description, status, priority, related files/memories (JSON), completed_at |
| scan_runs | one row per kundun scan: counts (scanned/indexed/skipped), errors, duration, status |
| cleanup_runs | one row per real kundun cleanup: removed counts, whether VACUUM ran, duration, status |

file_chunks and symbols both carry a file_id foreign key with ON DELETE CASCADE, so deleting a files row automatically removes its chunks and symbols. This is what lets the cleanup engine purge the row of a long-deleted file and have its derived rows disappear in the same transaction.

When FTS5 is compiled into the SQLite build, two virtual tables are created:

  • chunks_fts — full-text index over chunk content (with file_id and chunk_id stored unindexed), used by kundun search.
  • memories_fts — full-text index over memory title, content, and tags, used by kundun memory search.

These tables are populated by explicit writes from the engines, never by SQLite triggers. When FTS5 is unavailable, the tables are simply not created and both search paths fall back to LIKE. The active mode (fts5 or like) is shown in command output.
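Since there are no triggers, the engine that writes a chunk is also responsible for keeping the FTS row in step. The sketch below shows one way that explicit write could look. `StatementRunner`, `IndexedChunk`, and `syncChunkToFts` are hypothetical names, and the delete-then-insert approach is an assumption rather than the documented implementation; the `chunks_fts` column names follow the description above.

```typescript
// Hypothetical narrow view of the database: run one parameterized statement.
interface StatementRunner {
  run(sql: string, params: ReadonlyArray<string | number>): void;
}

interface IndexedChunk {
  chunkId: number;
  fileId: number;
  content: string;
}

// Writes the FTS row for a chunk explicitly. Returns false when FTS5 is
// unavailable, in which case chunks_fts does not exist and search uses LIKE.
function syncChunkToFts(
  db: StatementRunner,
  hasFts5: boolean,
  chunk: IndexedChunk,
): boolean {
  if (!hasFts5) return false;
  // Remove any stale row first so re-indexing does not duplicate entries.
  db.run("DELETE FROM chunks_fts WHERE chunk_id = ?", [chunk.chunkId]);
  db.run(
    "INSERT INTO chunks_fts (content, file_id, chunk_id) VALUES (?, ?, ?)",
    [chunk.content, chunk.fileId, chunk.chunkId],
  );
  return true;
}
```

Calling this inside the same transaction that writes the `file_chunks` row would keep the index and the source table consistent.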

Everything Kundun-Agent writes for a project lives under .kundun/ at the project root:

.kundun/
  kundun.sqlite   the single SQLite database (all tables + FTS5)
  config.json     a mirror of kundun.config.json
  cache/          scratch/cache space
  logs/           log files (subject to cleanup retention)
  snapshots/      snapshot space
  runtime/        runtime scratch (no token file in MVP 1)

The kundun.config.json file itself lives at the project root (next to .kundun/), not inside it; .kundun/config.json is only a mirror.

Because all state is a single SQLite file plus a few scratch directories, a project’s Kundun data is trivially portable and trivially disposable: delete .kundun/ (and the config) to remove every trace, or copy it to move the index elsewhere.