This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Catalog and Metadata Management
Relevant source files
- llkv-aggregate/README.md
- llkv-column-map/README.md
- llkv-csv/README.md
- llkv-expr/README.md
- llkv-join/README.md
- llkv-runtime/README.md
- llkv-sql/src/sql_engine.rs
- llkv-storage/README.md
- llkv-table/README.md
Purpose and Scope
This document describes LLKV's metadata management infrastructure, including how table schemas, column definitions, and type information are persisted and accessed throughout the system. The catalog serves as the authoritative source for all schema information and coordinates with the storage layer to ensure crash consistency for metadata changes.
For details on specific catalog APIs, see CatalogManager API. For information on how metadata is physically stored, see System Catalog and SysCatalog. For type alias management, see Custom Types and Type Registry.
System Catalog Architecture
LLKV implements a self-hosting catalog where metadata is stored as regular data within the system. The system catalog, referred to as SysCatalog, is physically stored as table 0 and uses the same Arrow-based columnar storage infrastructure as user tables. This design provides several advantages:
- Crash consistency : Metadata changes use the same transactional append path as data, ensuring atomic schema modifications.
- MVCC for metadata : Schema changes are versioned alongside data using the same created_by and deleted_by columns.
- Unified storage : No special-case persistence logic is required for metadata versus data.
- Bootstrap simplicity : The catalog table itself can be opened using minimal hardcoded schema information.
Sources : llkv-table/README.md:27-29 llkv-runtime/README.md:37-40
graph TB
subgraph "SQL Layer"
SQLENG["SqlEngine"]
end
subgraph "Runtime Layer"
RUNTIME["RuntimeEngine"]
RTCONTEXT["RuntimeContext"]
end
subgraph "Catalog Layer"
CATMGR["CatalogManager"]
SYSCAT["SysCatalog\n(Table 0)"]
TYPEREG["TypeRegistry"]
RESOLVER["IdentifierResolver"]
end
subgraph "Table Layer"
TABLE["Table"]
TABLEMETA["TableMeta"]
COLMETA["ColMeta"]
end
subgraph "Storage Layer"
COLSTORE["ColumnStore"]
PAGER["Pager"]
end
SQLENG --> RUNTIME
RUNTIME --> RTCONTEXT
RTCONTEXT --> CATMGR
CATMGR --> SYSCAT
CATMGR --> TYPEREG
CATMGR --> RESOLVER
SYSCAT --> TABLE
TABLE --> COLSTORE
COLSTORE --> PAGER
TABLEMETA -.stored in.-> SYSCAT
COLMETA -.stored in.-> SYSCAT
RESOLVER -.queries.-> CATMGR
RUNTIME -.queries.-> RESOLVER
Metadata Storage Model
The catalog stores two primary metadata types as Arrow RecordBatches within table 0:
TableMeta Structure
TableMeta records describe each table's schema and properties:
- table_id : Unique identifier (u32)
- namespace_id : Namespace the table belongs to (u32)
- table_name : User-visible name (String)
- schema : Serialized Arrow Schema describing columns and types
- row_count : Approximate row count for query planning
- created_at : Timestamp of table creation
ColMeta Structure
ColMeta records describe individual columns within tables:
- table_id : Parent table reference (u32)
- field_id : Column identifier within the table (u32)
- field_name : Column name (String)
- data_type : Arrow DataType serialization
- nullable : Whether NULL values are permitted (bool)
- metadata : Key-value pairs for extended properties
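Taken together, the two record shapes can be sketched as plain Rust structs. These are illustrative shapes derived from the field lists above, not the actual llkv-table definitions:

```rust
// Illustrative only: field names follow the lists above; real llkv types may differ.
#[derive(Debug, Clone)]
pub struct TableMeta {
    pub table_id: u32,
    pub namespace_id: u32,
    pub table_name: String,
    pub schema: Vec<u8>, // serialized Arrow Schema
    pub row_count: u64,  // approximate, for query planning
    pub created_at: i64, // creation timestamp
}

#[derive(Debug, Clone)]
pub struct ColMeta {
    pub table_id: u32,   // parent table reference
    pub field_id: u32,   // column id within the table
    pub field_name: String,
    pub data_type: String, // serialized Arrow DataType
    pub nullable: bool,
    pub metadata: Vec<(String, String)>, // extended key-value properties
}

fn main() {
    // One ColMeta row as it would describe the 'name' column of table 5.
    let col = ColMeta {
        table_id: 5,
        field_id: 1,
        field_name: "name".into(),
        data_type: "Utf8".into(),
        nullable: true,
        metadata: vec![],
    };
    println!("{col:?}");
}
```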
graph LR
subgraph "Logical Metadata Model"
USERTABLE["User Table\nemployees"]
TABLEMETA["TableMeta\ntable_id=5\nname='employees'\nschema=..."]
COLMETA1["ColMeta\ntable_id=5\nfield_id=0\nname='id'\ntype=Int32"]
COLMETA2["ColMeta\ntable_id=5\nfield_id=1\nname='name'\ntype=Utf8"]
end
subgraph "Physical Storage"
SYSCATTABLE["SysCatalog Table 0"]
RECORDBATCH["RecordBatch\nwith MVCC columns"]
COLUMNCHUNKS["Column Chunks\nin ColumnStore"]
end
USERTABLE --> TABLEMETA
USERTABLE --> COLMETA1
USERTABLE --> COLMETA2
TABLEMETA --> RECORDBATCH
COLMETA1 --> RECORDBATCH
COLMETA2 --> RECORDBATCH
RECORDBATCH --> SYSCATTABLE
SYSCATTABLE --> COLUMNCHUNKS
Both metadata types include MVCC columns (row_id, created_by, deleted_by) to support transactional schema changes and time-travel queries over metadata history.
Sources : llkv-table/README.md:27-29 llkv-column-map/README.md:13-16
Catalog Manager
The CatalogManager provides the high-level API for catalog operations and coordinates between the SQL layer, runtime, and storage. Key responsibilities include:
- Table lifecycle : Create, drop, rename, and truncate operations
- Schema queries : Resolve table names to table IDs and field names to field IDs
- Type management : Register and resolve custom type aliases
- Namespace isolation : Maintain separate table namespaces for user data and temporary objects
- Identifier resolution : Translate qualified names (schema.table.column) into physical identifiers
graph TB
subgraph "Catalog Manager Responsibilities"
LIFECYCLE["Table Lifecycle\ncreate/drop/rename"]
SCHEMAQUERY["Schema Queries\nname→id resolution"]
TYPEMGMT["Type Management\ncustom types/aliases"]
NAMESPACES["Namespace Isolation\nuser vs temporary"]
end
subgraph "Core Components"
CATMGR["CatalogManager"]
CACHE["In-Memory Cache\nTableMeta/ColMeta"]
TYPEREG["TypeRegistry"]
end
subgraph "Persistence"
SYSCAT["SysCatalog"]
APPENDPATH["Arrow Append Path"]
end
LIFECYCLE --> CATMGR
SCHEMAQUERY --> CATMGR
TYPEMGMT --> CATMGR
NAMESPACES --> CATMGR
CATMGR --> CACHE
CATMGR --> TYPEREG
CATMGR --> SYSCAT
SYSCAT --> APPENDPATH
The manager maintains an in-memory cache of metadata loaded from table 0 on startup and synchronizes changes back through the standard table append path.
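A minimal sketch of such a cache, with hypothetical names (the real CatalogManager internals are not shown in the sources):

```rust
use std::collections::HashMap;

/// Illustrative in-memory catalog cache: name → table id, plus id → schema.
#[derive(Default)]
struct CatalogCache {
    table_ids: HashMap<String, u32>,
    schemas: HashMap<u32, String>, // serialized schema per table id
}

impl CatalogCache {
    /// Populate from (table_id, name, schema) rows scanned out of table 0 on startup.
    fn load(&mut self, rows: &[(u32, &str, &str)]) {
        for &(id, name, schema) in rows {
            self.table_ids.insert(name.to_string(), id);
            self.schemas.insert(id, schema.to_string());
        }
    }

    /// Name → id lookup without touching storage.
    fn resolve(&self, name: &str) -> Option<u32> {
        self.table_ids.get(name).copied()
    }
}

fn main() {
    let mut cache = CatalogCache::default();
    cache.load(&[(5, "employees", "{id: Int32, name: Utf8}")]);
    assert_eq!(cache.resolve("employees"), Some(5));
}
```

Writes would go through the table-0 append path first, then refresh this cache, so the cache never diverges from committed metadata.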
Sources : llkv-runtime/README.md:37-40
Identifier Resolution
LLKV uses a multi-stage identifier resolution process to translate SQL names into physical storage keys:
Resolution Pipeline
- String names (Expr<String>) : SQL parser produces expressions with bare column names
- Qualified resolution (IdentifierResolver) : Resolve names to specific tables considering scope and aliases
- Field IDs (Expr<FieldId>) : Convert to numeric field identifiers for execution
- Logical field IDs (LogicalFieldId) : Add namespace and table context for storage lookup
- Physical keys (PhysicalKey) : Map to actual pager keys for column chunks
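The middle stages of this pipeline can be sketched with simplified stand-in types (the actual llkv types carry more context than shown here):

```rust
// Illustrative stand-ins for the pipeline stages; the real llkv types are richer.
type FieldId = u32;

/// A field id with namespace and table context attached for storage lookup.
#[derive(Debug, PartialEq)]
struct LogicalFieldId {
    namespace: u16,
    table: u32,
    field: FieldId,
}

/// Resolve a bare column name to a numeric field id within an
/// already-resolved table (the name → FieldId step).
fn resolve_field(name: &str, columns: &[(&str, FieldId)]) -> Option<FieldId> {
    columns.iter().find(|(n, _)| *n == name).map(|&(_, id)| id)
}

/// Attach namespace and table context (the FieldId → LogicalFieldId step).
fn to_logical(namespace: u16, table: u32, field: FieldId) -> LogicalFieldId {
    LogicalFieldId { namespace, table, field }
}

fn main() {
    let cols = [("id", 0), ("name", 1)];
    let fid = resolve_field("name", &cols).expect("unknown column");
    assert_eq!(
        to_logical(0, 5, fid),
        LogicalFieldId { namespace: 0, table: 5, field: 1 }
    );
}
```

The final LogicalFieldId → PhysicalKey step is a lookup in the ColumnStore's mapping, covered under Integration with Storage Layer below.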
Sources : llkv-table/README.md:36-40 llkv-sql/src/sql_engine.rs:36
graph LR
SQL["SQL String\n'SELECT name\nFROM users'"]
EXPRSTR["Expr<String>\nfield='name'"]
RESOLUTION["IdentifierResolver\ncontext + scope"]
EXPRFID["Expr<FieldId>\ntable_id=5\nfield_id=1"]
LOGICALFID["LogicalFieldId\nnamespace=0\ntable=5\nfield=1"]
PHYSKEY["PhysicalKey\nkey=0x1234"]
SQL --> EXPRSTR
EXPRSTR --> RESOLUTION
RESOLUTION --> EXPRFID
EXPRFID --> LOGICALFID
LOGICALFID --> PHYSKEY
Identifier Context
The IdentifierContext structure tracks available tables and columns within a query scope:
- Tracks visible tables and their aliases
- Maintains column availability for each table
- Handles nested contexts for subqueries
- Supports correlated column references across scope boundaries
The IdentifierResolver consults the catalog manager to build these contexts during query planning.
Sources : llkv-sql/src/sql_engine.rs:36
Catalog Operations
CREATE TABLE Flow
When a CREATE TABLE statement executes, the runtime allocates a fresh table ID, validates that the name is unused in the target namespace, builds TableMeta and ColMeta rows for the new schema, and appends them to table 0 through the standard MVCC append path before updating the in-memory cache.
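A minimal sketch of the create flow, using hypothetical names rather than the actual llkv-runtime code:

```rust
// Hypothetical sketch of the CREATE TABLE catalog steps; names are illustrative.
use std::collections::HashMap;

#[derive(Debug)]
struct TableMeta {
    table_id: u32,
    table_name: String,
}

#[derive(Default)]
struct Catalog {
    next_table_id: u32,
    by_name: HashMap<String, u32>,
    staged: Vec<TableMeta>, // rows destined for the table-0 append at commit
}

impl Catalog {
    fn create_table(&mut self, name: &str) -> Result<u32, String> {
        // 1. Reject duplicate names in the namespace.
        if self.by_name.contains_key(name) {
            return Err(format!("table '{name}' already exists"));
        }
        // 2. Allocate a fresh table id.
        let id = self.next_table_id;
        self.next_table_id += 1;
        // 3. Stage the TableMeta row (ColMeta rows omitted) for the table-0 append.
        self.staged.push(TableMeta { table_id: id, table_name: name.into() });
        // 4. Update the in-memory cache.
        self.by_name.insert(name.into(), id);
        Ok(id)
    }
}

fn main() {
    let mut cat = Catalog { next_table_id: 1, ..Default::default() };
    let id = cat.create_table("employees").unwrap();
    assert_eq!(id, 1);
    assert!(cat.create_table("employees").is_err());
}
```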
Sources : llkv-runtime/README.md:13-18 llkv-table/README.md:27-29
DROP TABLE Flow
Table deletion is implemented as a soft delete using MVCC:
- Mark the TableMeta row as deleted by setting deleted_by to the current transaction ID
- Mark all associated ColMeta rows as deleted
- The table's data remains physically present but invisible to queries observing later snapshots
- Background garbage collection can eventually reclaim space from dropped tables
This approach ensures that in-flight transactions using earlier snapshots can still access the table definition.
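The soft-delete and snapshot-visibility rules above can be sketched as follows (illustrative types, not the actual MVCC implementation):

```rust
// Illustrative MVCC soft delete: dropping marks rows, never rewrites them.
#[derive(Debug)]
struct MetaRow {
    table_id: u32,
    created_by: u64,         // txn that created the row
    deleted_by: Option<u64>, // txn that dropped it, if any
}

/// Mark every live catalog row for `table_id` as deleted by `txn_id`.
fn drop_table(rows: &mut [MetaRow], table_id: u32, txn_id: u64) {
    for row in rows
        .iter_mut()
        .filter(|r| r.table_id == table_id && r.deleted_by.is_none())
    {
        row.deleted_by = Some(txn_id);
    }
}

/// A snapshot sees rows created at or before it and not yet deleted from its view.
fn visible(row: &MetaRow, snapshot: u64) -> bool {
    row.created_by <= snapshot && row.deleted_by.map_or(true, |d| d > snapshot)
}

fn main() {
    let mut rows = vec![MetaRow { table_id: 5, created_by: 10, deleted_by: None }];
    drop_table(&mut rows, 5, 42);
    assert!(visible(&rows[0], 20));  // an earlier snapshot still sees the table
    assert!(!visible(&rows[0], 50)); // a later snapshot does not
}
```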
Sources : llkv-table/README.md:32-34
Type Registry
The TypeRegistry manages custom type aliases created with CREATE DOMAIN (or CREATE TYPE in DuckDB dialect):
Type Alias Storage
- Type definitions are stored alongside other metadata in the catalog
- Aliases map user-defined names to base Arrow DataType instances
- Type resolution occurs during expression planning and column definition
- Nested type references are recursively resolved
Type Resolution Process
When a column is defined with a custom type:
- Parser produces type name as string
- TypeRegistry resolves name to base DataType
- Column is stored with resolved base type
- Type alias is preserved in ColMeta metadata for introspection
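Since nested type references are resolved recursively, a sketch of alias resolution with cycle protection looks like this (assuming a simple name-to-name alias map; the real TypeRegistry resolves to Arrow DataType values):

```rust
// Sketch of recursive type-alias resolution; names are hypothetical.
use std::collections::HashMap;

struct TypeRegistry {
    /// alias → base type name, or → another alias (nested reference).
    aliases: HashMap<String, String>,
}

impl TypeRegistry {
    /// Follow alias chains down to a base type, guarding against cycles.
    /// Any name with no alias entry is treated as a base type here.
    fn resolve(&self, name: &str) -> Option<String> {
        let mut current = name.to_string();
        // A chain can be at most `len` hops long without repeating.
        for _ in 0..self.aliases.len() + 1 {
            match self.aliases.get(&current) {
                Some(next) => current = next.clone(),
                None => return Some(current), // no further alias: base type
            }
        }
        None // cycle detected
    }
}

fn main() {
    let mut aliases = HashMap::new();
    aliases.insert("ssn".to_string(), "short_text".to_string());
    aliases.insert("short_text".to_string(), "Utf8".to_string());
    let reg = TypeRegistry { aliases };
    assert_eq!(reg.resolve("ssn"), Some("Utf8".to_string()));
}
```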
Sources : llkv-sql/src/sql_engine.rs:639-657
Namespace Management
LLKV supports multiple namespaces to isolate different categories of tables:
| Namespace ID | Purpose | Lifetime | Storage |
|---|---|---|---|
| 0 (default) | User tables | Persistent | Main pager |
| 1 (temporary) | Temporary tables, staging | Transaction scope | MemPager |
| 2+ (custom) | Reserved for future use | Varies | Configurable |
graph TB
subgraph "Persistent Namespace (0)"
USERTBL1["users table"]
USERTBL2["orders table"]
SYSCAT["SysCatalog\n(table 0)"]
end
subgraph "Temporary Namespace (1)"
TEMPTBL1["#temp_results"]
TEMPTBL2["#staging_data"]
end
subgraph "Storage Backends"
MAINPAGER["BoxedPager\n(persistent)"]
MEMPAGER["MemPager\n(in-memory)"]
end
USERTBL1 --> MAINPAGER
USERTBL2 --> MAINPAGER
SYSCAT --> MAINPAGER
TEMPTBL1 --> MEMPAGER
TEMPTBL2 --> MEMPAGER
The TEMPORARY_NAMESPACE_ID constant identifies ephemeral tables created within transactions that should not persist beyond commit or rollback.
Sources : llkv-runtime/README.md:26-32 llkv-sql/src/sql_engine.rs:26
Catalog Bootstrap
The system catalog faces a bootstrapping challenge: table 0 stores metadata for all tables, including itself. LLKV solves this with a two-phase initialization:
Phase 1: Hardcoded Schema
On first startup, the ColumnStore initializes with an empty catalog. When the runtime creates table 0, it uses a hardcoded schema definition for SysCatalog that includes the minimal fields needed to store TableMeta and ColMeta:
- table_id (UInt32)
- table_name (Utf8)
- field_id (UInt32)
- field_name (Utf8)
- data_type (Utf8, serialized)
- Standard MVCC columns
Phase 2: Self-Description
Once table 0 exists, the runtime appends metadata describing table 0 itself into table 0. Subsequent startups load the catalog by scanning table 0 using the hardcoded schema, then validate that the self-description matches.
This bootstrap approach ensures that:
- No external metadata files are required
- Catalog schema can evolve through standard migration paths
- The system remains self-contained within a single pager instance
Sources : llkv-column-map/README.md:36-40
Integration with Storage Layer
The catalog leverages the same storage infrastructure as user data:
Column Store Interaction
- LogicalFieldId encodes (namespace_id, table_id, field_id) to uniquely identify columns across all tables
- The ColumnStore maintains a mapping from LogicalFieldId to PhysicalKey
- Catalog queries fetch metadata by scanning table 0 using standard ColumnStream APIs
- Metadata mutations append RecordBatches through ColumnStore::append, ensuring ACID properties
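The logical-to-physical mapping can be sketched as a keyed lookup (illustrative types; PhysicalKey's real representation may differ):

```rust
// Illustrative mapping from logical column identity to a physical pager key.
use std::collections::HashMap;

/// (namespace_id, table_id, field_id) as a single lookup key, per the text above.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct LogicalFieldId {
    namespace_id: u32,
    table_id: u32,
    field_id: u32,
}

type PhysicalKey = u64;

/// Sketch of the ColumnStore's logical → physical map.
#[derive(Default)]
struct ColumnIndex {
    map: HashMap<LogicalFieldId, PhysicalKey>,
}

impl ColumnIndex {
    fn lookup(&self, id: LogicalFieldId) -> Option<PhysicalKey> {
        self.map.get(&id).copied()
    }
}

fn main() {
    let mut idx = ColumnIndex::default();
    let id = LogicalFieldId { namespace_id: 0, table_id: 5, field_id: 1 };
    idx.map.insert(id, 0x1234);
    assert_eq!(idx.lookup(id), Some(0x1234));
}
```

Because the catalog's own columns (table 0) live in this same map, metadata and data share one lookup path.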
MVCC for Metadata
Schema changes are transactional:
- CREATE TABLE within a transaction remains invisible to other transactions until commit
- DROP TABLE marks metadata as deleted without immediate physical removal
- Concurrent transactions see consistent snapshots of the schema based on their transaction IDs
- Schema conflicts (e.g., duplicate table names) are detected during commit watermark advancement
Sources : llkv-column-map/README.md:19-29 llkv-table/README.md:32-34
Catalog Consistency
Several mechanisms ensure catalog consistency across failures and concurrent access:
Atomic Metadata Updates
All catalog changes (create, drop, alter) execute as atomic append operations. The ColumnStore::append method ensures either all metadata rows are written or none are, preventing partial schema states.
Conflict Detection
On transaction commit, the runtime validates that:
- No conflicting table names exist in the target namespace
- Referenced tables for foreign keys still exist
- Column types remain compatible with constraints
If conflicts are detected, the commit fails and the transaction rolls back, discarding staged metadata.
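The duplicate-name check can be sketched as a set intersection over the target namespace (hypothetical helper, not the actual commit path):

```rust
// Sketch of commit-time name-conflict validation; names are illustrative.
use std::collections::HashSet;

/// Return the staged table names that already exist in the committed
/// catalog for the same namespace. A non-empty result fails the commit.
fn name_conflicts<'a>(committed: &HashSet<&str>, staged: &'a [&'a str]) -> Vec<&'a str> {
    staged
        .iter()
        .copied()
        .filter(|n| committed.contains(n))
        .collect()
}

fn main() {
    let committed: HashSet<&str> = ["users", "orders"].into_iter().collect();
    let staged = ["users", "reports"];
    let conflicts = name_conflicts(&committed, &staged);
    assert_eq!(conflicts, vec!["users"]); // commit would fail and roll back
}
```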
Recovery After Crash
Since metadata uses the same MVCC append path as data:
- Uncommitted metadata changes (transactions that never committed) remain invisible
- The catalog reflects the last successfully committed snapshot
- No separate recovery log or checkpoint is required for metadata
Sources : llkv-runtime/README.md:20-24
Performance Considerations
Metadata Caching
The CatalogManager caches frequently accessed metadata in memory:
- Table name → table ID mappings
- Table ID → schema mappings
- Field name → field ID mappings per table
- Custom type definitions
Cache invalidation occurs on:
- Explicit DDL operations (CREATE, DROP, ALTER)
- Transaction commit with staged schema changes
- Cross-session schema modifications (future: requires catalog versioning)
Scan Optimization
Metadata scans leverage the same optimizations as user data:
- Predicate pushdown to filter by table_id or field_id
- Projection to fetch only required columns
- MVCC filtering to skip deleted entries
For common operations like "lookup table by name", the catalog manager maintains auxiliary indexes in memory to avoid full scans.
Sources : llkv-table/README.md:23-24