This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

System Catalog and SysCatalog

Purpose and Scope

This document describes the system catalog infrastructure that stores and manages table and column metadata for LLKV. The system catalog treats metadata as first-class data, persisting it in table 0 using the same Arrow-based storage mechanisms that handle user data. This ensures crash consistency, enables transactional DDL operations, and simplifies the overall architecture by eliminating separate metadata storage layers.

For information about the higher-level catalog management API that orchestrates table lifecycle operations, see CatalogManager API. For details on custom type definitions and the type registry, see Custom Types and Type Registry.

System Catalog as Table 0

LLKV stores all table and column metadata in a special table with ID 0, known as the system catalog. This design leverages the existing storage infrastructure rather than introducing a separate metadata store.

Key Properties

Property	Description
Table ID	Always `0`, reserved at system initialization
Storage Format	Arrow `RecordBatch` with predefined schema
MVCC Semantics	Full transaction support with snapshot isolation
Persistence	Uses the same `ColumnStore` and `Pager` as user tables
Crash Safety	Metadata mutations are atomic through the append pipeline

The system catalog contains two types of metadata records:

Table Metadata (TableMeta): Defines table schemas, IDs, and names
Column Metadata (ColMeta): Describes individual columns within tables

graph TB
    subgraph "Metadata Storage Model"
        UserTables["User Tables\n(ID ≥ 1)"]
        SysCatalog["System Catalog\n(Table 0)"]
        TableMeta["TableMeta Records\n• table_id\n• table_name\n• schema"]
        ColMeta["ColMeta Records\n• table_id\n• col_name\n• col_id\n• data_type"]
    end

    subgraph "Storage Layer"
        ColumnStore["ColumnStore"]
        Pager["Pager (MemPager/SimdRDrivePager)"]
    end

    UserTables -->|described by| SysCatalog
    SysCatalog --> TableMeta
    SysCatalog --> ColMeta

    SysCatalog -->|persisted via| ColumnStore
    UserTables -->|persisted via| ColumnStore
    ColumnStore --> Pager

    style SysCatalog fill:#f9f9f9

Sources: llkv-table/README.md:28-29 llkv-column-map/README.md:10-16

Metadata Schema

The system catalog stores metadata using a predefined Arrow schema with the following structure:

TableMeta Schema

Field Name	Arrow Type	Description
`table_id`	`UInt32`	Unique identifier for the table
`table_name`	`Utf8`	Human-readable table name
`schema`	`Binary`	Serialized Arrow schema definition
`row_id`	`UInt64`	MVCC row identifier (auto-injected)
`created_by`	`UInt64`	Transaction ID that created this record
`deleted_by`	`UInt64`	Transaction ID that deleted this record (NULL if active)

ColMeta Schema

Field Name	Arrow Type	Description
`table_id`	`UInt32`	References the parent table
`col_id`	`UInt32`	Column identifier within the table
`col_name`	`Utf8`	Column name
`data_type`	`Utf8`	Arrow data type descriptor
`row_id`	`UInt64`	MVCC row identifier (auto-injected)
`created_by`	`UInt64`	Transaction ID that created this record
`deleted_by`	`UInt64`	Transaction ID that deleted this record (NULL if active)

Sources: llkv-table/README.md:13-17 Diagram 4 from high-level architecture
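
To make the two schemas concrete, the fields can be mirrored as plain Rust records. This is a minimal sketch under stated assumptions, not LLKV's actual type definitions: the struct layouts, the use of `Option<u64>` for a NULL `deleted_by`, and the helpers `is_active` and `columns_of` are all hypothetical.

```rust
/// Hypothetical in-memory mirror of one TableMeta row (not LLKV's real type).
#[derive(Debug, Clone, PartialEq)]
pub struct TableMeta {
    pub table_id: u32,
    pub table_name: String,
    /// Serialized Arrow schema bytes; treated as opaque in this sketch.
    pub schema: Vec<u8>,
    pub row_id: u64,
    pub created_by: u64,
    /// `None` stands in for a NULL `deleted_by`, meaning the record is active.
    pub deleted_by: Option<u64>,
}

/// Hypothetical in-memory mirror of one ColMeta row.
#[derive(Debug, Clone, PartialEq)]
pub struct ColMeta {
    pub table_id: u32,
    pub col_id: u32,
    pub col_name: String,
    pub data_type: String,
    pub row_id: u64,
    pub created_by: u64,
    pub deleted_by: Option<u64>,
}

impl TableMeta {
    /// A record is active while no transaction has marked it deleted.
    pub fn is_active(&self) -> bool {
        self.deleted_by.is_none()
    }
}

/// Returns the active ColMeta rows whose `table_id` references `table`.
pub fn columns_of<'a>(table: &TableMeta, cols: &'a [ColMeta]) -> Vec<&'a ColMeta> {
    cols.iter()
        .filter(|c| c.table_id == table.table_id && c.deleted_by.is_none())
        .collect()
}
```

The `table_id` field is the only link between the two record types, which is why `columns_of` needs nothing more than an equality filter.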

SysCatalog Implementation

The SysCatalog struct serves as the programmatic interface to the system catalog, providing methods to read and write metadata while abstracting the underlying Arrow storage details.

graph LR
    subgraph "SysCatalog Interface"
        SysCatalog["SysCatalog"]
        CreateTable["create_table()"]
        GetTable["get_table_meta()"]
        ListTables["list_tables()"]
        DropTable["drop_table()"]
        CreateCol["create_column()"]
        GetCol["get_column_meta()"]
        ListCols["list_columns()"]
    end

    subgraph "Storage Backend"
        Table0["Table (ID=0)"]
        ColumnStore["ColumnStore"]
    end

    SysCatalog --> CreateTable
    SysCatalog --> GetTable
    SysCatalog --> ListTables
    SysCatalog --> DropTable
    SysCatalog --> CreateCol
    SysCatalog --> GetCol
    SysCatalog --> ListCols

    CreateTable --> Table0
    GetTable --> Table0
    ListTables --> Table0
    DropTable --> Table0
    CreateCol --> Table0
    GetCol --> Table0
    ListCols --> Table0

    Table0 --> ColumnStore

Core Components

Sources: llkv-table/README.md:28-29 llkv-runtime/README.md:39

Metadata Query Process

When the runtime queries the catalog (e.g., during SELECT planning), it follows this flow:

Sources: llkv-table/README.md:23-25 llkv-runtime/README.md:36-40

sequenceDiagram
    participant Runtime as RuntimeContext
    participant Catalog as SysCatalog
    participant Table0 as Table (ID=0)
    participant Store as ColumnStore
    
    Runtime->>Catalog: get_table_meta("users")
    
    Catalog->>Table0: scan_stream()\nWHERE table_name = 'users'
    Table0->>Store: ColumnStream with predicate
    Store->>Store: Apply MVCC filtering
    Store-->>Table0: RecordBatch
    
    Table0-->>Catalog: RecordBatch
    
    Note over Catalog: Deserialize TableMeta\nfrom Arrow batch
    
    Catalog-->>Runtime: TableMeta struct
    
    Runtime->>Catalog: list_columns(table_id)
    Catalog->>Table0: scan_stream()\nWHERE table_id = X
    Table0->>Store: ColumnStream with predicate
    Store-->>Table0: RecordBatch
    Table0-->>Catalog: RecordBatch
    
    Note over Catalog: Deserialize ColMeta\nrecords
    
    Catalog-->>Runtime: Vec<ColMeta>
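
The two-step lookup in the sequence above (name to table record, then table ID to column records) can be sketched with in-memory vectors standing in for table 0. Everything here is an assumption made for illustration: `SysCatalogSketch`, its row types, and `column_names` are hypothetical, and the `deleted_by.is_none()` check is a simplified stand-in for the snapshot-based MVCC filtering that the store applies.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct TableRow {
    pub table_id: u32,
    pub table_name: String,
    pub deleted_by: Option<u64>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ColRow {
    pub table_id: u32,
    pub col_id: u32,
    pub col_name: String,
    pub deleted_by: Option<u64>,
}

/// Hypothetical stand-in for the catalog: two vectors play the role of table 0.
pub struct SysCatalogSketch {
    pub table_rows: Vec<TableRow>,
    pub col_rows: Vec<ColRow>,
}

impl SysCatalogSketch {
    /// Step 1: scan for an active row with a matching `table_name`.
    pub fn get_table_meta(&self, name: &str) -> Option<&TableRow> {
        self.table_rows
            .iter()
            .find(|r| r.table_name == name && r.deleted_by.is_none())
    }

    /// Step 2: scan for active column rows with a matching `table_id`.
    pub fn list_columns(&self, table_id: u32) -> Vec<&ColRow> {
        self.col_rows
            .iter()
            .filter(|r| r.table_id == table_id && r.deleted_by.is_none())
            .collect()
    }

    /// Chains both steps, as a planner would when resolving a table by name.
    pub fn column_names(&self, table_name: &str) -> Option<Vec<String>> {
        let table = self.get_table_meta(table_name)?;
        Some(
            self.list_columns(table.table_id)
                .into_iter()
                .map(|c| c.col_name.clone())
                .collect(),
        )
    }
}
```

Both steps are predicate scans over the same storage, so a name lookup costs two scans of table 0 unless some form of caching sits in front of it.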

Metadata Operations

DDL operations (CREATE TABLE, DROP TABLE, ALTER TABLE) modify the system catalog through the same transactional append pipeline used for INSERT statements.

CREATE TABLE Flow

graph TD
    ParseSQL["Parse SQL:\nCREATE TABLE users (...)"]
    CreatePlan["CreateTablePlan"]
    RuntimeExec["Runtime.execute_create_table()"]
    ValidateSchema["Validate Schema"]
    AllocTableID["Allocate table_id"]
    BuildTableMeta["Build TableMeta RecordBatch"]
    BuildColMeta["Build ColMeta RecordBatch"]
    AppendTable["Table(0).append(TableMeta)"]
    AppendCols["Table(0).append(ColMeta)"]
    ColumnStore["ColumnStore.append()"]
    CommitPager["Pager.batch_put()"]

    ParseSQL --> CreatePlan
    CreatePlan --> RuntimeExec
    RuntimeExec --> ValidateSchema
    ValidateSchema --> AllocTableID

    AllocTableID --> BuildTableMeta
    AllocTableID --> BuildColMeta

    BuildTableMeta --> AppendTable
    BuildColMeta --> AppendCols

    AppendTable --> ColumnStore
    AppendCols --> ColumnStore

    ColumnStore --> CommitPager

    style AppendTable fill:#f9f9f9
    style AppendCols fill:#f9f9f9

Key Implementation Details:

Schema Validation: The runtime validates the Arrow schema before allocating resources
Table ID Allocation: Monotonically increasing IDs are assigned via CatalogManager
Atomic Append: Both TableMeta and all ColMeta records are appended in a single transaction
MVCC Tagging: The created_by column is set to the current transaction ID

Sources: llkv-runtime/README.md:36-40 llkv-table/README.md:22-24
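
The four implementation details above can be sketched as one function that validates first, allocates second, and appends last. This is an illustrative model under assumptions, not LLKV's code: the `Catalog` struct, `DdlError`, the `(name, type)` tuple input, and the specific validation rules are hypothetical, and two `Vec` pushes stand in for the transactional append to table 0.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
pub struct TableMeta {
    pub table_id: u32,
    pub table_name: String,
    pub created_by: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ColMeta {
    pub table_id: u32,
    pub col_id: u32,
    pub col_name: String,
    pub data_type: String,
    pub created_by: u64,
}

#[derive(Debug, PartialEq)]
pub enum DdlError {
    EmptySchema,
    DuplicateColumn(String),
    TableExists(String),
}

/// Hypothetical catalog model; user table IDs start at 1 because 0 is reserved.
pub struct Catalog {
    pub next_table_id: u32,
    pub tables: Vec<TableMeta>,
    pub columns: Vec<ColMeta>,
}

impl Catalog {
    pub fn new() -> Self {
        Catalog { next_table_id: 1, tables: Vec::new(), columns: Vec::new() }
    }

    pub fn create_table(
        &mut self,
        name: &str,
        cols: &[(&str, &str)],
        txn_id: u64,
    ) -> Result<u32, DdlError> {
        // 1. Validate before allocating anything, so failures consume no IDs.
        if cols.is_empty() {
            return Err(DdlError::EmptySchema);
        }
        let mut seen = HashSet::new();
        for (col_name, _) in cols {
            if !seen.insert(*col_name) {
                return Err(DdlError::DuplicateColumn(col_name.to_string()));
            }
        }
        if self.tables.iter().any(|t| t.table_name == name) {
            return Err(DdlError::TableExists(name.to_string()));
        }

        // 2. Allocate a monotonically increasing table ID.
        let table_id = self.next_table_id;
        self.next_table_id += 1;

        // 3. Build the metadata records, tagged with the creating transaction.
        let table = TableMeta { table_id, table_name: name.to_string(), created_by: txn_id };
        let col_rows: Vec<ColMeta> = cols
            .iter()
            .enumerate()
            .map(|(i, (col_name, data_type))| ColMeta {
                table_id,
                col_id: i as u32 + 1,
                col_name: col_name.to_string(),
                data_type: data_type.to_string(),
                created_by: txn_id,
            })
            .collect();

        // 4. Append the table record and all column records together.
        self.tables.push(table);
        self.columns.extend(col_rows);
        Ok(table_id)
    }
}
```

Ordering validation before ID allocation means a rejected statement leaves no gap in the ID sequence and no partial metadata behind.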

DROP TABLE Flow

graph TD
    DropPlan["DropTablePlan"]
    RuntimeExec["Runtime.execute_drop_table()"]
    LookupMeta["SysCatalog.get_table_meta()"]
    CheckExists["Verify table exists"]
    BuildDeleteMeta["Build RecordBatch:\n• table_id\n• deleted_by = current_txn"]
    AppendDelete["Table(0).append(delete_batch)"]
    ColumnStore["ColumnStore.append()"]

    DropPlan --> RuntimeExec
    RuntimeExec --> LookupMeta
    LookupMeta --> CheckExists

    CheckExists --> BuildDeleteMeta
    BuildDeleteMeta --> AppendDelete
    AppendDelete --> ColumnStore

    style BuildDeleteMeta fill:#f9f9f9

Dropping a table uses MVCC soft-delete semantics rather than physical deletion:

The deleted_by column is updated to mark the metadata as deleted. MVCC visibility rules ensure that:

Transactions with snapshots before the deletion still see the table
Transactions starting after the deletion do not see the table

Sources: llkv-table/README.md:32-34 Diagram 4 from high-level architecture
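
The visibility rules above can be illustrated with a small predicate. This sketch rests on simplifying assumptions: transaction IDs are treated as if they also gave the commit order, the row is updated in place rather than through an appended batch, and `TableRow`, `is_visible`, and `drop_table` are hypothetical names.

```rust
#[derive(Debug, Clone)]
pub struct TableRow {
    pub table_name: String,
    pub created_by: u64,
    /// `None` means the table has not been dropped.
    pub deleted_by: Option<u64>,
}

/// A row is visible to a snapshot if it was created at or before the snapshot
/// and has not been deleted at or before the snapshot.
pub fn is_visible(row: &TableRow, snapshot: u64) -> bool {
    row.created_by <= snapshot && row.deleted_by.map_or(true, |d| d > snapshot)
}

/// Soft-deletes the active row named `name`; returns false if no such row exists.
/// The row itself is kept so that older snapshots can still resolve it.
pub fn drop_table(rows: &mut [TableRow], name: &str, txn_id: u64) -> bool {
    match rows
        .iter_mut()
        .find(|r| r.table_name == name && r.deleted_by.is_none())
    {
        Some(row) => {
            row.deleted_by = Some(txn_id);
            true
        }
        None => false,
    }
}
```

With a table created by transaction 5 and dropped by transaction 10, a snapshot at 9 still sees it while a snapshot at 10 or later does not.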

Bootstrap Process

When LLKV initializes, the system catalog must bootstrap itself before any user operations can proceed.

Initialization Sequence

sequenceDiagram
    participant Main as main() or SqlEngine::new()
    participant Runtime as RuntimeContext::new()
    participant CatMgr as CatalogManager::new()
    participant Table as Table::open_or_create()
    participant Store as ColumnStore::open()
    participant Pager as Pager (MemPager/SimdRDrivePager)

    Main->>Runtime: new(pager)
    Runtime->>CatMgr: new(pager)

    CatMgr->>Store: open(pager, root_key)
    Store->>Pager: batch_get([root_key])

    alt Catalog Exists
        Pager-->>Store: Catalog data
        Store-->>CatMgr: ColumnStore (loaded)
        Note over CatMgr: Deserialize catalog entries
    else First Run
        Pager-->>Store: NULL
        Store-->>CatMgr: ColumnStore (empty)
        CatMgr->>Table: open_or_create(table_id=0)
        Note over CatMgr: Create system catalog schema
        Table->>Store: Initialize table 0
        Store->>Pager: batch_put(catalog_schema)
    end

    CatMgr-->>Runtime: CatalogManager (initialized)
    Runtime-->>Main: RuntimeContext (ready)

Bootstrap Steps:

Pager Initialization: The storage backend is opened (in-memory or persistent)
Catalog Discovery: The ColumnStore attempts to load the catalog from the pager root key
Schema Creation: If no catalog exists, table 0 is created with the predefined schema
Ready State: The runtime can now service DDL and DML operations

Sources: llkv-runtime/README.md:26-31 llkv-storage/README.md:12-16
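
The discover-or-create decision in the bootstrap steps can be sketched against a toy key-value pager. All names here are assumptions for illustration: `MemPagerSketch` is a `HashMap` wrapper rather than LLKV's `MemPager`, `CATALOG_ROOT_KEY` is a made-up key, and the stored bytes are a placeholder for the real catalog schema.

```rust
use std::collections::HashMap;

/// Hypothetical well-known key under which the catalog root is stored.
pub const CATALOG_ROOT_KEY: u64 = 0;

/// Toy pager: a map from physical keys to blobs.
#[derive(Default)]
pub struct MemPagerSketch {
    blobs: HashMap<u64, Vec<u8>>,
}

impl MemPagerSketch {
    pub fn get(&self, key: u64) -> Option<&Vec<u8>> {
        self.blobs.get(&key)
    }

    pub fn put(&mut self, key: u64, value: Vec<u8>) {
        self.blobs.insert(key, value);
    }
}

#[derive(Debug, PartialEq)]
pub enum Bootstrap {
    /// An existing catalog was found under the root key.
    Loaded,
    /// No catalog existed, so table 0 was initialized.
    Created,
}

/// Loads the catalog if the root key is present, otherwise writes a
/// placeholder schema blob so the next open finds it.
pub fn open_or_create_catalog(pager: &mut MemPagerSketch) -> Bootstrap {
    if pager.get(CATALOG_ROOT_KEY).is_some() {
        Bootstrap::Loaded
    } else {
        pager.put(CATALOG_ROOT_KEY, b"catalog-schema-placeholder".to_vec());
        Bootstrap::Created
    }
}
```

Calling `open_or_create_catalog` twice on the same pager yields `Created` and then `Loaded`, which is the property that makes the bootstrap safe to run on every startup.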

Integration with Runtime

The RuntimeContext uses the system catalog for all schema-dependent operations:

Schema Resolution Flow

graph TB
    subgraph "SQL Query Processing"
        ParsedSQL["Parsed SQL AST"]
        SelectPlan["SelectPlan<String>"]
        ResolvedPlan["SelectPlan<FieldId>"]
    end

    subgraph "RuntimeContext"
        CatalogLookup["Catalog Lookup"]
        FieldResolution["Field Name → FieldId\nResolution"]
        SchemaValidation["Schema Validation"]
    end

    subgraph "System Catalog"
        SysCatalog["SysCatalog"]
        TableMetaCache["In-Memory Metadata Cache"]
    end

    ParsedSQL --> SelectPlan
    SelectPlan --> CatalogLookup

    CatalogLookup --> SysCatalog
    SysCatalog --> TableMetaCache

    TableMetaCache --> FieldResolution
    FieldResolution --> SchemaValidation
    SchemaValidation --> ResolvedPlan

Usage Examples

Operation	Catalog Interaction
SELECT	Resolve table names → table IDs, resolve column names → field IDs
INSERT	Validate schema compatibility, check for required columns
JOIN	Resolve schemas for both tables, validate join key compatibility
CREATE INDEX	(Future) Persist index metadata as new catalog record type
ALTER TABLE	Update existing metadata records with new schema definitions

Sources: llkv-runtime/README.md:36-40 llkv-expr/README.md:50-54
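
For the SELECT case, resolving column names to field IDs amounts to mapping a plan over names into a plan over IDs. The following is a rough sketch, assuming a plan generic over its field type: the struct fields, `ResolveError`, and `resolve_plan` are hypothetical, and a `HashMap` stands in for the catalog's column metadata.

```rust
use std::collections::HashMap;

/// Numeric field identifier; the concrete width is an assumption.
pub type FieldId = u32;

/// Hypothetical plan shape, generic over how fields are referenced.
#[derive(Debug, PartialEq)]
pub struct SelectPlan<F> {
    pub table: String,
    pub projections: Vec<F>,
}

#[derive(Debug, PartialEq)]
pub enum ResolveError {
    UnknownColumn(String),
}

/// Converts a name-based plan into an ID-based plan, failing on the first
/// column name that the catalog does not know.
pub fn resolve_plan(
    plan: SelectPlan<String>,
    columns: &HashMap<String, FieldId>,
) -> Result<SelectPlan<FieldId>, ResolveError> {
    let projections = plan
        .projections
        .into_iter()
        .map(|name| {
            columns
                .get(&name)
                .copied()
                .ok_or(ResolveError::UnknownColumn(name))
        })
        .collect::<Result<Vec<_>, _>>()?;
    Ok(SelectPlan { table: plan.table, projections })
}
```

Making the plan generic lets the type system distinguish unresolved plans from resolved ones, so later stages cannot accidentally receive raw column names.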

Dual-Context Catalog Access

During explicit transactions, the runtime maintains two catalog views:

Catalog Visibility Rules

Persistent Context: Sees only metadata committed before the transaction's snapshot
Staging Context: Sees tables created within the current transaction
On Commit: Staged metadata is replayed into the persistent context
On Rollback: Staged metadata is discarded

This dual-view approach ensures that:

DDL operations remain transactional
Uncommitted schema changes don't leak to other sessions
Catalog queries are snapshot-isolated like DML operations

Sources: llkv-runtime/README.md:26-31 llkv-table/README.md:32-34
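
The commit and rollback rules can be modeled with two maps. This is a sketch under assumptions rather than the runtime's implementation: `CatalogContexts` and its methods are hypothetical, and name-to-ID maps stand in for full metadata records.

```rust
use std::collections::BTreeMap;

/// Hypothetical pair of catalog views held during an explicit transaction.
#[derive(Default, Debug)]
pub struct CatalogContexts {
    /// Committed metadata, visible to every session.
    pub persistent: BTreeMap<String, u32>,
    /// Metadata created inside the current transaction only.
    pub staging: BTreeMap<String, u32>,
}

impl CatalogContexts {
    /// Records a table created inside the transaction.
    pub fn stage_create(&mut self, name: &str, table_id: u32) {
        self.staging.insert(name.to_string(), table_id);
    }

    /// Lookup from inside the transaction: staged tables first, then committed ones.
    pub fn resolve_in_txn(&self, name: &str) -> Option<u32> {
        self.staging
            .get(name)
            .or_else(|| self.persistent.get(name))
            .copied()
    }

    /// Lookup from any other session: committed metadata only.
    pub fn resolve_committed(&self, name: &str) -> Option<u32> {
        self.persistent.get(name).copied()
    }

    /// Replays staged metadata into the persistent context.
    pub fn commit(&mut self) {
        let staged = std::mem::take(&mut self.staging);
        self.persistent.extend(staged);
    }

    /// Discards staged metadata without touching the persistent context.
    pub fn rollback(&mut self) {
        self.staging.clear();
    }
}
```

Because other sessions only ever call the committed lookup, a table staged in one transaction stays invisible to them until `commit` runs.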

Metadata Caching

The CatalogManager maintains an in-memory cache of frequently accessed metadata to avoid repeated scans of table 0:

Cache Structure	Purpose	Invalidation Strategy
Table Name → ID Map	Fast table resolution during planning	Invalidated on `CREATE`/`DROP TABLE`
Table ID → Schema Map	Quick schema validation during `INSERT`	Invalidated on `ALTER TABLE`
Column Name → FieldId Map	Field resolution for expressions	Rebuilt on schema changes

The cache is session-local and does not require cross-session synchronization in the current single-process model.

Sources: Inferred from llkv-runtime/README.md:12-17
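
The first row of the table (a name-to-ID map invalidated on `CREATE`/`DROP TABLE`) can be sketched as a read-through cache. Since the caching behavior itself is described as inferred, this example is doubly hypothetical: `MetadataCache`, its `misses` counter, and the closure standing in for a scan of table 0 are all assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical session-local cache of table name to table ID.
#[derive(Default)]
pub struct MetadataCache {
    name_to_id: HashMap<String, u32>,
    /// Counts how often the fallback scan was needed.
    pub misses: u32,
}

impl MetadataCache {
    /// Returns the cached ID, or falls back to `scan_table0` and caches a hit.
    /// Failed lookups are not cached.
    pub fn resolve<F>(&mut self, name: &str, scan_table0: F) -> Option<u32>
    where
        F: FnOnce(&str) -> Option<u32>,
    {
        if let Some(id) = self.name_to_id.get(name) {
            return Some(*id);
        }
        self.misses += 1;
        let id = scan_table0(name)?;
        self.name_to_id.insert(name.to_string(), id);
        Some(id)
    }

    /// Called after CREATE or DROP TABLE so the next lookup rescans table 0.
    pub fn invalidate(&mut self, name: &str) {
        self.name_to_id.remove(name);
    }
}
```

Invalidating per name keeps the rest of the cache warm after a single DDL statement, at the cost of requiring every DDL path to remember the call.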

Summary

The LLKV system catalog demonstrates the principle of treating metadata as data by storing all table and column definitions in table 0 using the same Arrow-based storage infrastructure that handles user tables. This design:

Simplifies Architecture: Eliminates the need for separate metadata storage systems
Ensures Consistency: Metadata mutations use MVCC transactions like all other data
Enables Crash Recovery: The pager's atomicity guarantees extend to schema changes
Supports Transactional DDL: Schema modifications can be rolled back or committed atomically

The SysCatalog interface abstracts the underlying Arrow storage, providing a type-safe API for the runtime to query and modify metadata. The bootstrap process ensures the system catalog exists before any user operations proceed, and the dual-context model enables proper transaction isolation for DDL operations.

Sources: llkv-table/README.md:28-29 llkv-runtime/README.md:36-40 llkv-column-map/README.md:10-16 Diagram 4 from high-level architecture
