

Architecture

This page describes the overall architectural design of LLKV, including the layered system structure, key design decisions, and how major components interact. For detailed information about individual crates and their responsibilities, see Workspace and Crates. For the end-to-end query execution flow, see SQL Query Processing Pipeline. For details on Arrow integration and data representation, see Data Formats and Arrow Integration.

Architectural Overview

LLKV is a columnar SQL database that stores Apache Arrow data structures directly in a key-value persistence layer. The architecture consists of six distinct layers, each implemented as one or more Rust crates. The system translates SQL statements into query plans, executes those plans against columnar table storage, and persists data using memory-mapped key-value stores.

The core architectural innovation is the llkv-column-map layer, which bridges Apache Arrow’s in-memory columnar format with the simd-r-drive key-value storage backend. This design enables zero-copy operations on columnar data while maintaining ACID properties through the underlying storage engine.

Sources: Cargo.toml:1-109, high-level overview diagrams

System Layers

The following diagram shows the six architectural layers with their implementing crates and key data structures:

graph TB
    subgraph "Layer 1: User Interface"
        SQL["llkv-sql\nSqlEngine"]
        DEMO["llkv-sql-pong-demo"]
        TPCH["llkv-tpch"]
        CSV["llkv-csv"]
    end

    subgraph "Layer 2: Query Processing"
        PARSER["sqlparser-rs\nParser, Statement"]
        PLANNER["llkv-plan\nSelectPlan, InsertPlan"]
        EXPR["llkv-expr\nExpr, ScalarExpr"]
    end

    subgraph "Layer 3: Execution"
        EXECUTOR["llkv-executor\nTableExecutor"]
        RUNTIME["llkv-runtime\nDatabaseRuntime"]
        AGGREGATE["llkv-aggregate\nAccumulator"]
        JOIN["llkv-join\nhash_join"]
        COMPUTE["llkv-compute\nNumericKernels"]
    end

    subgraph "Layer 4: Data Management"
        TABLE["llkv-table\nTable, SysCatalog"]
        TRANSACTION["llkv-transaction\nTransaction"]
        SCAN["llkv-scan\nScanOp"]
    end

    subgraph "Layer 5: Storage - Arrow Native"
        COLMAP["llkv-column-map\nColumnStore, ColumnDescriptor"]
        STORAGE["llkv-storage\nPager trait"]
        ARROW["arrow-array\nRecordBatch, ArrayRef"]
    end

    subgraph "Layer 6: Persistence - Key-Value"
        PAGER["Pager implementations\nbatch_get, batch_put"]
        SIMD["simd-r-drive\nRDrive, EntryHandle"]
    end

    SQL --> PARSER
    PARSER --> PLANNER
    PLANNER --> EXPR
    EXPR --> EXECUTOR

    EXECUTOR --> AGGREGATE
    EXECUTOR --> JOIN
    EXECUTOR --> COMPUTE
    EXECUTOR --> RUNTIME

    RUNTIME --> TABLE
    RUNTIME --> TRANSACTION
    TABLE --> SCAN

    SCAN --> COLMAP
    TABLE --> COLMAP
    COLMAP --> ARROW
    COLMAP --> STORAGE

    STORAGE --> PAGER
    PAGER --> SIMD

Each layer has well-defined responsibilities. Layer 1 provides user-facing interfaces. Layer 2 translates SQL into executable plans. Layer 3 executes those plans using specialized operators. Layer 4 manages logical tables and transactions. Layer 5 implements columnar storage using Arrow data structures. Layer 6 provides persistent key-value storage with memory-mapping.

Sources: Cargo.toml:2-26 llkv-sql/Cargo.toml:1-45 llkv-executor/Cargo.toml:1-48 llkv-table/Cargo.toml:1-72

Key Architectural Decisions

Arrow-Native Columnar Storage

The system uses Apache Arrow as its native in-memory data format. All data is represented as RecordBatch instances containing ArrayRef columns. This design decision enables:

  • Zero-copy interoperability with Arrow-based analytics tools
  • Vectorized computation using Arrow kernels
  • Efficient memory layouts for SIMD operations
  • Type-safe column operations through Arrow’s schema system

The arrow dependency (version 57.1.0) provides the foundation for all data operations.
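As a minimal, self-contained illustration of this representation (plain arrow-array usage, not an LLKV-specific API), a table fragment is a RecordBatch of typed, reference-counted columns:

use std::sync::Arc;

use arrow_array::{ArrayRef, Int64Array, RecordBatch, StringArray};
use arrow_schema::{DataType, Field, Schema};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Every column is strongly typed through the schema.
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int64, false),
        Field::new("name", DataType::Utf8, true),
    ]));

    // Columns are Arc'd ArrayRef values, enabling zero-copy sharing.
    let ids: ArrayRef = Arc::new(Int64Array::from(vec![1, 2, 3]));
    let names: ArrayRef = Arc::new(StringArray::from(vec![Some("a"), None, Some("c")]));

    let batch = RecordBatch::try_new(schema, vec![ids, names])?;
    assert_eq!(batch.num_rows(), 3);
    Ok(())
}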

Key-Value Persistence Backend

Rather than implementing a custom storage engine, LLKV persists data through the Pager trait abstraction defined in llkv-storage. The primary implementation uses simd-r-drive (version 0.15.5-alpha), a memory-mapped key-value store with SIMD-optimized operations.

This design separates logical data management from physical storage concerns. The Pager trait defines the following operations (a simplified sketch of the trait follows the list):

  • alloc() - allocate new storage keys
  • batch_get() - retrieve multiple values
  • batch_put() - atomically write multiple key-value pairs
  • free() - release storage keys
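The actual trait definition lives in llkv-storage; the following is a simplified sketch of that shape, with PhysicalKey and the associated error type as illustrative stand-ins:

// Simplified sketch of the Pager abstraction described above; the real
// trait in llkv-storage differs in its key, value, and error types.
pub type PhysicalKey = u64;

pub trait Pager {
    type Error;

    // Allocate fresh storage keys for new chunks.
    fn alloc(&self, count: usize) -> Result<Vec<PhysicalKey>, Self::Error>;

    // Retrieve multiple values in a single round trip.
    fn batch_get(&self, keys: &[PhysicalKey]) -> Result<Vec<Option<Vec<u8>>>, Self::Error>;

    // Atomically write multiple key-value pairs.
    fn batch_put(&self, puts: &[(PhysicalKey, Vec<u8>)]) -> Result<(), Self::Error>;

    // Release keys whose chunks are no longer referenced.
    fn free(&self, keys: &[PhysicalKey]) -> Result<(), Self::Error>;
}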

Modular Crate Organization

The workspace contains 15+ specialized crates, each focused on a specific concern. This modularity enables:

  • Independent testing and benchmarking per crate
  • Clear dependency boundaries
  • Parallel development across subsystems
  • Selective feature compilation

Core crates follow a naming convention: llkv-{subsystem}. Foundational crates like llkv-types and llkv-result provide shared types and error handling used throughout the system.

MVCC Transaction Isolation

The system implements Multi-Version Concurrency Control through llkv-transaction. Each table row includes system columns created_by and deleted_by that track transaction visibility. The Transaction struct manages transaction state and visibility rules.
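A minimal sketch of the visibility rule this implies, with illustrative names (the actual llkv-transaction types and commit tracking differ):

// Hedged sketch of MVCC visibility via the created_by / deleted_by
// system columns; names and rules here are illustrative only.
type TxnId = u64;

/// A row is visible if its creating transaction is visible to the viewer
/// and no visible transaction has deleted it.
fn row_visible(
    created_by: TxnId,
    deleted_by: Option<TxnId>,
    viewer: TxnId,
    committed: &dyn Fn(TxnId) -> bool,
) -> bool {
    let creation_visible = created_by == viewer || committed(created_by);
    let deletion_visible = match deleted_by {
        Some(d) => d == viewer || committed(d),
        None => false,
    };
    creation_visible && !deletion_visible
}

fn main() {
    let committed = |txn: TxnId| txn <= 5;
    // Created by committed txn 3, deleted by uncommitted txn 9:
    // still visible to txn 7.
    assert!(row_visible(3, Some(9), 7, &committed));
}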

Sources: Cargo.toml:37-96 llkv-storage/Cargo.toml:1-48 llkv-table/Cargo.toml:20-41

Component Organization

The workspace crates map onto the architectural roles described above. Dependencies flow upward through the layers: lower-level crates like llkv-types and llkv-storage have no dependencies on higher layers, while the llkv-sql crate sits at the top, orchestrating all subsystems.

Sources: Cargo.toml:2-26 Cargo.toml:55-74

Data Flow Architecture

Data flows through the system in two primary patterns: write operations (INSERT, UPDATE, DELETE) and read operations (SELECT).

Write Path

Write operations follow a path from SQL parsing through plan creation, runtime execution, table operations, column store append, chunking and deduplication, serialization, and finally persistence via the Pager trait.

flowchart LR
    SQL["SQL Statement"] --> PARSE["sqlparser\nparse()"]
    PARSE --> PLAN["llkv-plan\nInsertPlan"]
    PLAN --> RUNTIME["DatabaseRuntime\nexecute_insert_plan()"]
    RUNTIME --> TABLE["Table\nappend()"]
    TABLE --> COLMAP["ColumnStore\nappend()"]
    COLMAP --> CHUNK["Chunking\nLWW deduplication"]
    CHUNK --> SERIAL["Serialize\nArrow arrays"]
    SERIAL --> PAGER["Pager\nbatch_put()"]
    PAGER --> KV["simd-r-drive\nEntryHandle"]
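The serialization step turns in-memory Arrow arrays into byte blobs that batch_put() can persist. A minimal sketch of that step, using Arrow's standard IPC stream format (an assumption for illustration; LLKV's actual chunk encoding may differ):

use std::sync::Arc;

use arrow_array::{ArrayRef, Int64Array, RecordBatch};
use arrow_ipc::writer::StreamWriter;
use arrow_schema::{DataType, Field, Schema};

// Serialize a RecordBatch into a byte blob suitable for a key-value put.
// Arrow IPC is one standard encoding; LLKV's on-disk format may differ.
fn batch_to_bytes(batch: &RecordBatch) -> Result<Vec<u8>, Box<dyn std::error::Error>> {
    let mut buf = Vec::new();
    {
        let mut writer = StreamWriter::try_new(&mut buf, &batch.schema())?;
        writer.write(batch)?;
        writer.finish()?;
    }
    Ok(buf)
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let schema = Arc::new(Schema::new(vec![Field::new("v", DataType::Int64, false)]));
    let col: ArrayRef = Arc::new(Int64Array::from(vec![10, 20, 30]));
    let batch = RecordBatch::try_new(schema, vec![col])?;
    let bytes = batch_to_bytes(&batch)?;
    assert!(!bytes.is_empty()); // this blob is what batch_put() would receive
    Ok(())
}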


The ColumnStore::append() method implements Last-Write-Wins semantics for upserts by detecting duplicate row IDs and replacing older values.
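A standalone sketch of that merge over plain (row_id, value) pairs, keeping row IDs sorted as the storage layer does for its RowId arrays (vectors stand in for Arrow chunks):

use std::collections::HashMap;

// Illustrative last-write-wins merge: when an incoming batch repeats a
// row id, the newer value replaces the older one.
fn lww_merge(existing: Vec<(u64, i64)>, incoming: Vec<(u64, i64)>) -> Vec<(u64, i64)> {
    let mut merged: HashMap<u64, i64> = existing.into_iter().collect();
    for (row_id, value) in incoming {
        merged.insert(row_id, value); // newer write wins
    }
    let mut rows: Vec<_> = merged.into_iter().collect();
    rows.sort_by_key(|(row_id, _)| *row_id); // chunks keep row ids sorted
    rows
}

fn main() {
    let merged = lww_merge(vec![(1, 10), (2, 20)], vec![(2, 99), (3, 30)]);
    assert_eq!(merged, vec![(1, 10), (2, 99), (3, 30)]);
}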

Read Path

Read operations use a two-phase approach: first collecting matching row IDs via predicate evaluation, then gathering column data for those rows. This minimizes data movement by filtering before gathering.

flowchart LR
    SQL["SQL Statement"] --> PARSE["sqlparser\nparse()"]
    PARSE --> PLAN["llkv-plan\nSelectPlan"]
    PLAN --> EXECUTOR["TableExecutor\nexecute_select()"]
    EXECUTOR --> FILTER["Phase 1:\nfilter_row_ids()"]
    FILTER --> GATHER["Phase 2:\ngather_rows()"]
    GATHER --> COLMAP["ColumnStore\ngather()"]
    COLMAP --> PAGER["Pager\nbatch_get()"]
    PAGER --> DESER["Deserialize\nArrow arrays"]
    DESER --> BATCH["RecordBatch\nassembly"]
    BATCH --> RESULT["Query Results"]
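A minimal sketch of the two phases over plain vectors, with filter_row_ids and gather_rows standing in for the executor's actual operators:

// Phase 1 touches only the predicate column and yields row ids;
// phase 2 moves data only for the matching rows.
fn filter_row_ids(col: &[i64], pred: impl Fn(i64) -> bool) -> Vec<usize> {
    col.iter()
        .enumerate()
        .filter(|(_, v)| pred(**v))
        .map(|(i, _)| i)
        .collect()
}

fn gather_rows<'a>(col: &'a [&'a str], row_ids: &[usize]) -> Vec<&'a str> {
    row_ids.iter().map(|&i| col[i]).collect()
}

fn main() {
    let prices = [5_i64, 42, 7, 100];
    let names = ["a", "b", "c", "d"];

    let hits = filter_row_ids(&prices, |p| p > 10); // Phase 1: row ids only
    let result = gather_rows(&names, &hits);        // Phase 2: gather columns
    assert_eq!(result, vec!["b", "d"]);
}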

Sources: llkv-sql/Cargo.toml:20-38 llkv-executor/Cargo.toml:20-42 llkv-table/Cargo.toml:20-41

Storage Architecture: Arrow to Key-Value Bridge

The llkv-column-map crate implements the critical bridge between Arrow’s columnar format and key-value storage:

graph TB
    subgraph "Logical Layer"
        FIELD["LogicalFieldId\n(table_id, field_name)"]
        ROWID["RowId\n(u64)"]
    end

    subgraph "Column Organization"
        CATALOG["ColumnCatalog\nfield_id → ColumnDescriptor"]
        DESC["ColumnDescriptor\nlinked list of chunks"]
        META["ChunkMetadata\n(min, max, size, null_count)"]
    end

    subgraph "Physical Storage"
        CHUNK["Data Chunks\nserialized Arrow arrays"]
        RIDARRAY["RowId Arrays\nsorted u64 arrays"]
        PKEY["Physical Keys\n(chunk_pk, rid_pk)"]
    end

    subgraph "Key-Value Layer"
        ENTRY["EntryHandle\nbyte blobs"]
        MMAP["Memory-Mapped Files"]
    end

    FIELD --> CATALOG
    CATALOG --> DESC
    DESC --> META
    META --> CHUNK
    CHUNK --> PKEY
    DESC --> RIDARRAY
    RIDARRAY --> PKEY
    PKEY --> ENTRY
    ENTRY --> MMAP

    ROWID -.used for.-> RIDARRAY

The ColumnCatalog maps logical field identifiers to physical storage. Each column is represented by a ColumnDescriptor that maintains a linked list of data chunks. Each chunk contains:

  • A serialized Arrow array (the actual column data)
  • A corresponding sorted array of row IDs
  • Metadata including min/max values for predicate pushdown
  • Physical keys (chunk_pk, rid_pk) pointing to storage

The ChunkMetadata enables chunk pruning during scans: chunks whose min/max ranges don’t overlap with query predicates can be skipped entirely.
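A minimal sketch of that pruning check, with field names assumed for illustration rather than taken from ChunkMetadata's actual definition:

// Illustrative min/max statistics for one chunk of a column.
struct ChunkStats {
    min: i64,
    max: i64,
    row_count: usize,
}

/// For the predicate `value BETWEEN lo AND hi`, a chunk can be skipped
/// when its [min, max] range does not intersect [lo, hi].
fn can_prune(stats: &ChunkStats, lo: i64, hi: i64) -> bool {
    stats.max < lo || stats.min > hi
}

fn main() {
    let chunk = ChunkStats { min: 100, max: 200, row_count: 4096 };
    assert!(can_prune(&chunk, 500, 900));  // no overlap: skip all 4096 rows
    assert!(!can_prune(&chunk, 150, 160)); // overlap: must scan the chunk
    assert!(chunk.row_count > 0);
}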

Data chunks are stored as serialized byte blobs accessed through EntryHandle instances from simd-r-drive. The storage layer uses memory-mapped files for efficient I/O.

Sources: llkv-column-map/Cargo.toml:1-65 llkv-storage/Cargo.toml:1-48 Cargo.toml:85-86

System Catalog

The system catalog (table ID 0) stores metadata about all tables, columns, indexes, and constraints. It is itself stored in the same ColumnStore as user data, creating a self-describing bootstrapped system.

The SysCatalog struct provides typed access to catalog tables. The CatalogManager coordinates table lifecycle operations (CREATE, ALTER, DROP) by manipulating catalog entries.

Table ID ranges partition the namespace:

  • ID 0: System catalog
  • IDs 1-999: User tables
  • IDs 1000+: Information schema views
  • IDs 10000+: Temporary tables

This design allows the catalog to leverage the same storage, transaction, and query infrastructure as user data.
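A sketch of the partitioning as a classification function; the enum is illustrative and the boundaries transcribe the list above (with information schema assumed to end where temporary tables begin):

#[derive(Debug, PartialEq)]
enum TableKind {
    SystemCatalog,
    User,
    InformationSchema,
    Temporary,
}

fn classify(table_id: u64) -> TableKind {
    match table_id {
        0 => TableKind::SystemCatalog,
        1..=999 => TableKind::User,
        1000..=9999 => TableKind::InformationSchema, // assumed upper bound
        _ => TableKind::Temporary,                   // 10000 and above
    }
}

fn main() {
    assert_eq!(classify(0), TableKind::SystemCatalog);
    assert_eq!(classify(42), TableKind::User);
    assert_eq!(classify(1_000), TableKind::InformationSchema);
    assert_eq!(classify(10_000), TableKind::Temporary);
}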

Sources: llkv-table/Cargo.toml:20-41
