This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Workspace and Crates

Purpose and Scope

This document details the Cargo workspace structure and the 18 crates that make up the LLKV database system. Each crate has a single responsibility and well-defined interfaces, so components can be tested and evolved independently. This page catalogs the role of each crate, its internal dependencies, and how the crates map to the system's layered architecture described in Architecture.

For information about how SQL queries flow through these crates, see SQL Query Processing Pipeline. For details on specific subsystems like storage or transactions, refer to sections 7 and following.


Workspace Overview

The LLKV workspace is defined in Cargo.toml:67-88 and contains 18 member crates organized into core system components, specialized operations, testing infrastructure, and demonstration applications.

Workspace Structure:

graph TB
    subgraph "Core System Crates"
        LLKV["llkv\n(main entry)"]
        SQL["llkv-sql\n(SQL interface)"]
        PLAN["llkv-plan\n(query plans)"]
        EXPR["llkv-expr\n(expression AST)"]
        RUNTIME["llkv-runtime\n(orchestration)"]
        EXECUTOR["llkv-executor\n(execution)"]
        TABLE["llkv-table\n(table layer)"]
        COLMAP["llkv-column-map\n(column store)"]
        STORAGE["llkv-storage\n(storage abstraction)"]
        TXN["llkv-transaction\n(MVCC manager)"]
        RESULT["llkv-result\n(error types)"]
    end

    subgraph "Specialized Operations"
        AGG["llkv-aggregate\n(aggregation)"]
        JOIN["llkv-join\n(joins)"]
        CSV["llkv-csv\n(CSV import)"]
    end

    subgraph "Testing Infrastructure"
        SLT["llkv-slt-tester\n(SQL logic tests)"]
        TESTUTIL["llkv-test-utils\n(test utilities)"]
        TPCH["llkv-tpch\n(TPC-H benchmarks)"]
    end

    subgraph "Demonstrations"
        DEMO["llkv-sql-pong-demo\n(interactive demo)"]
    end

Sources: Cargo.toml:67-88


Core System Crates

llkv

Purpose: Main library crate that re-exports the primary user-facing APIs from llkv-sql and llkv-runtime.

Key Dependencies: llkv-sql, llkv-runtime

Responsibilities:

  • Provides the consolidated API surface for embedding LLKV
  • Re-exports SqlEngine for SQL query execution
  • Re-exports runtime components for programmatic database access

Sources: Cargo.toml:9-10


llkv-sql

Path: llkv-sql/

Purpose: SQL interface layer that parses SQL statements, preprocesses dialect-specific syntax, and translates them into typed query plans.

Key Dependencies:

  • llkv-plan - Query plan structures
  • llkv-expr - Expression AST
  • llkv-runtime - Execution orchestration
  • sqlparser - SQL parsing (version 0.59.0)

Responsibilities:

  • SQL statement preprocessing for dialect compatibility
  • AST-to-plan translation
  • INSERT statement buffering optimization
  • SQL query result formatting

Primary Types:

  • SqlEngine - Main query interface

Sources: Cargo.toml:21 llkv-plan/src/lib.rs:1-38


llkv-plan

Path: llkv-plan/

Purpose: Query planner that defines typed plan structures representing SQL operations.

Key Dependencies:

  • llkv-expr - Expression types
  • llkv-result - Error handling
  • sqlparser - SQL AST types

Responsibilities:

  • Plan structure definitions (SelectPlan, InsertPlan, UpdatePlan, DeletePlan)
  • SQL-to-plan conversion utilities
  • Subquery correlation tracking
  • Plan graph serialization for debugging

Primary Types:

  • SelectPlan, InsertPlan, UpdatePlan, DeletePlan, CreateTablePlan
  • SubqueryCorrelatedTracker
  • RangeSelectRows - Range-based row selection

Sources: llkv-plan/Cargo.toml:1-28 llkv-plan/src/lib.rs:1-38


llkv-expr

Path: llkv-expr/

Purpose: Expression AST definitions and literal value handling, independent of concrete Arrow scalar types.

Key Dependencies:

  • arrow - Arrow data types

Responsibilities:

  • Expression AST (Expr<T>, ScalarExpr<T>)
  • Literal value representation (Literal enum)
  • Type-aware predicate compilation (typed_predicate)
  • Decimal value handling

Primary Types:

  • Expr<T> - Generic expression with field identifier type parameter
  • ScalarExpr<T> - Scalar expressions
  • Literal - Untyped literal values
  • DecimalValue - Fixed-precision decimal
  • IntervalValue - Calendar interval

Sources: llkv-expr/Cargo.toml:1-19 llkv-expr/src/lib.rs:1-21 llkv-expr/src/literal.rs:1-446
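
To illustrate why an expression AST benefits from a field-identifier type parameter, here is a minimal sketch. It is not the crate's actual definition: the variants, `CompareOp`, and the `map_field` helper are hypothetical, and the real `Expr<T>` and `ScalarExpr<T>` are likely richer. The idea shown is that the same tree can first reference columns by name and later be rebound to numeric identifiers without changing its shape.

```rust
/// Untyped literal, loosely modelled on the `Literal` enum described above.
#[derive(Debug, Clone, PartialEq)]
pub enum Literal {
    Integer(i128),
    String(String),
    Null,
}

/// Scalar expression over a field identifier type `F` (hypothetical variants).
#[derive(Debug, Clone, PartialEq)]
pub enum ScalarExpr<F> {
    Column(F),
    Literal(Literal),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CompareOp {
    Eq,
    Lt,
    Gt,
}

/// Boolean expression over a field identifier type `F` (hypothetical variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Expr<F> {
    Compare { left: ScalarExpr<F>, op: CompareOp, right: ScalarExpr<F> },
    And(Vec<Expr<F>>),
    Not(Box<Expr<F>>),
}

impl<F> ScalarExpr<F> {
    /// Rebind every column reference through `f`, keeping literals unchanged.
    pub fn map_field<G>(self, f: &mut impl FnMut(F) -> G) -> ScalarExpr<G> {
        match self {
            ScalarExpr::Column(field) => ScalarExpr::Column(f(field)),
            ScalarExpr::Literal(lit) => ScalarExpr::Literal(lit),
        }
    }
}

impl<F> Expr<F> {
    /// Rebind every column reference in the tree, preserving its structure.
    pub fn map_field<G>(self, f: &mut impl FnMut(F) -> G) -> Expr<G> {
        match self {
            Expr::Compare { left, op, right } => Expr::Compare {
                left: left.map_field(&mut *f),
                op,
                right: right.map_field(&mut *f),
            },
            Expr::And(children) => {
                Expr::And(children.into_iter().map(|c| c.map_field(&mut *f)).collect())
            }
            Expr::Not(inner) => Expr::Not(Box::new(inner.map_field(&mut *f))),
        }
    }
}
```

A planner could build an `Expr<&str>` straight from parsed SQL and call `map_field` once the catalog has resolved names to identifiers, which would keep name resolution out of the layers that only need to evaluate expressions.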


llkv-runtime

Path: llkv-runtime/

Purpose: Runtime orchestration layer providing MVCC transaction management, session handling, and system catalog coordination.

Key Dependencies:

  • llkv-executor - Query execution
  • llkv-table - Table operations
  • llkv-transaction - MVCC snapshots

Responsibilities:

  • Transaction lifecycle management
  • Session state tracking
  • System catalog access
  • Query plan execution coordination
  • MVCC snapshot creation and cleanup

Primary Types:

  • RuntimeContext - Main runtime state
  • Session - Per-connection state

Sources: Cargo.toml:19


llkv-executor

Path: llkv-executor/

Purpose: Query execution engine that evaluates plans and produces streaming results.

Key Dependencies:

  • llkv-plan - Plan structures
  • llkv-expr - Expression evaluation
  • llkv-table - Table scans
  • llkv-aggregate - Aggregation
  • llkv-join - Join algorithms

Responsibilities:

  • SELECT plan execution
  • Projection and filtering
  • Aggregation coordination
  • Join execution
  • Streaming RecordBatch production

Sources: Cargo.toml:14


llkv-table

Path: llkv-table/

Purpose: Schema-aware table abstraction providing high-level data operations over columnar storage.

Key Dependencies:

  • llkv-column-map - Column storage
  • llkv-expr - Predicate compilation
  • llkv-storage - Storage backend
  • arrow - RecordBatch representation

Responsibilities:

  • Schema validation and enforcement
  • MVCC metadata injection (row_id, created_by, deleted_by)
  • Predicate compilation and optimization
  • RecordBatch append/scan operations
  • Column data type management

Primary Types:

  • Table - Main table interface
  • TablePlanner - Query optimization
  • TableExecutor - Execution strategies

Sources: llkv-table/Cargo.toml:1-60 llkv-column-map/src/store/projection.rs:1-728


llkv-column-map

Path: llkv-column-map/

Purpose: Columnar storage layer that chunks Arrow arrays and manages the mapping from logical fields to physical storage keys.

Key Dependencies:

  • llkv-storage - Pager abstraction
  • llkv-expr - Field identifiers
  • arrow - Array serialization

Responsibilities:

  • Column chunk management (serialization/deserialization)
  • LogicalFieldId → PhysicalKey mapping
  • Multi-column gather operations with caching
  • Row visibility filtering
  • Chunk metadata tracking (min/max values)

Primary Types:

  • ColumnStore<P> - Main storage interface
  • LogicalFieldId - Namespaced field identifier
  • MultiGatherContext - Reusable context for multi-column reads
  • GatherNullPolicy - Null handling strategies

Sources: Cargo.toml:12 llkv-column-map/src/store/projection.rs:38-227
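
Chunk metadata with min/max values is typically used to skip chunks that cannot satisfy a range predicate. The sketch below shows that idea in isolation. `ChunkMeta` and `chunks_overlapping` are hypothetical names, values are restricted to `i64` for brevity, and nothing here reflects the actual `ColumnStore<P>` API.

```rust
/// Hypothetical per-chunk metadata: where the chunk lives and its value range.
#[derive(Debug, Clone, PartialEq)]
pub struct ChunkMeta {
    /// Key under which the serialized chunk is stored in the pager.
    pub physical_key: u64,
    /// Smallest value in the chunk.
    pub min: i64,
    /// Largest value in the chunk.
    pub max: i64,
}

/// Return the physical keys of chunks whose [min, max] range intersects
/// the inclusive query range [lo, hi]. All other chunks can be skipped
/// without being read or deserialized.
pub fn chunks_overlapping(chunks: &[ChunkMeta], lo: i64, hi: i64) -> Vec<u64> {
    chunks
        .iter()
        .filter(|chunk| chunk.max >= lo && chunk.min <= hi)
        .map(|chunk| chunk.physical_key)
        .collect()
}
```

For a predicate such as `value BETWEEN 150 AND 250`, only the chunks whose ranges touch that interval need to be fetched; the metadata alone rules out the rest.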


llkv-storage

Path: llkv-storage/

Purpose: Storage abstraction layer defining the Pager trait and providing implementations for in-memory and persistent backends.

Key Dependencies:

  • simd-r-drive - SIMD-optimized persistent storage (optional)
  • arrow - Buffer types

Responsibilities:

  • Pager trait definition (batch_get/batch_put)
  • Zero-copy array serialization format
  • MemPager - In-memory HashMap backend
  • SimdRDrivePager - Memory-mapped persistent backend
  • Physical key allocation

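Primary Types:

  • Pager - Storage backend trait
  • MemPager - In-memory backend
  • SimdRDrivePager - Memory-mapped persistent backend
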
Sources: Cargo.toml:22 llkv-storage/src/serialization.rs:1-130
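
The following sketch illustrates the shape of a batched pager abstraction with an in-memory backend. The method names `batch_get` and `batch_put` come from the description above, but their signatures, the `PhysicalKey` alias, and the `MemPagerSketch` type are assumptions made for illustration; the real trait may differ substantially.

```rust
use std::collections::HashMap;
use std::sync::RwLock;

/// Hypothetical physical key type; the real crate may use another representation.
pub type PhysicalKey = u64;

/// Illustrative stand-in for the `Pager` trait: batched reads and writes of opaque blobs.
pub trait Pager {
    /// Store every `(key, bytes)` pair, overwriting existing entries.
    fn batch_put(&self, entries: Vec<(PhysicalKey, Vec<u8>)>);
    /// Fetch blobs in the order of `keys`; missing keys yield `None`.
    fn batch_get(&self, keys: &[PhysicalKey]) -> Vec<Option<Vec<u8>>>;
}

/// In-memory backend in the spirit of `MemPager`: a `HashMap` behind a lock.
#[derive(Default)]
pub struct MemPagerSketch {
    blobs: RwLock<HashMap<PhysicalKey, Vec<u8>>>,
}

impl Pager for MemPagerSketch {
    fn batch_put(&self, entries: Vec<(PhysicalKey, Vec<u8>)>) {
        // One lock acquisition covers the whole batch.
        let mut blobs = self.blobs.write().expect("lock poisoned");
        for (key, bytes) in entries {
            blobs.insert(key, bytes);
        }
    }

    fn batch_get(&self, keys: &[PhysicalKey]) -> Vec<Option<Vec<u8>>> {
        let blobs = self.blobs.read().expect("lock poisoned");
        keys.iter().map(|key| blobs.get(key).cloned()).collect()
    }
}
```

Batching matters more for a persistent backend, where each call may involve I/O, than for a `HashMap`. Note that this sketch clones blobs on read, so it does not demonstrate the zero-copy behavior attributed to the real serialization format.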


llkv-transaction

Path: llkv-transaction/

Purpose: MVCC transaction manager providing snapshot isolation and row visibility determination.

Key Dependencies:

  • llkv-result - Error types

Responsibilities:

  • Transaction ID allocation
  • MVCC snapshot creation
  • Commit watermark tracking
  • Row visibility rules enforcement

Primary Types:

  • TransactionManager
  • Snapshot - Transaction isolation view
  • TxnId - Transaction identifier

Sources: Cargo.toml:25
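
A minimal sketch of snapshot-based row visibility follows, using the `created_by` and `deleted_by` metadata mentioned for llkv-table. Everything except those two column names and the type names `Snapshot` and `TxnId` is an assumption: the `TXN_ID_NONE` sentinel, the single-watermark model, and `TxnManagerSketch` are illustrative. The sketch also assumes transactions commit in id order, which a real manager cannot rely on.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

pub type TxnId = u64;

/// Sentinel meaning "never deleted"; an assumption of this sketch.
pub const TXN_ID_NONE: TxnId = 0;

/// The view of the database that one transaction works against.
#[derive(Debug, Clone, Copy)]
pub struct Snapshot {
    /// Transaction that owns this snapshot.
    pub txn_id: TxnId,
    /// Highest transaction id known to be committed when the snapshot was taken.
    pub committed_watermark: TxnId,
}

impl Snapshot {
    /// A transaction's writes are visible if they are our own or were
    /// committed at or below the watermark.
    fn sees(&self, txn: TxnId) -> bool {
        txn == self.txn_id || txn <= self.committed_watermark
    }

    /// A row is visible when its creation is visible and its deletion is not.
    pub fn is_visible(&self, created_by: TxnId, deleted_by: TxnId) -> bool {
        let deleted = deleted_by != TXN_ID_NONE && self.sees(deleted_by);
        self.sees(created_by) && !deleted
    }
}

/// Allocates transaction ids and tracks the commit watermark.
pub struct TxnManagerSketch {
    next_txn_id: AtomicU64,
    committed_watermark: AtomicU64,
}

impl TxnManagerSketch {
    pub fn new() -> Self {
        Self {
            next_txn_id: AtomicU64::new(1), // 0 is reserved for TXN_ID_NONE
            committed_watermark: AtomicU64::new(0),
        }
    }

    /// Allocate a transaction id and capture the current watermark.
    pub fn begin(&self) -> Snapshot {
        let txn_id = self.next_txn_id.fetch_add(1, Ordering::SeqCst);
        let committed_watermark = self.committed_watermark.load(Ordering::SeqCst);
        Snapshot { txn_id, committed_watermark }
    }

    /// Simplification: assumes commits arrive in id order.
    pub fn commit(&self, txn_id: TxnId) {
        self.committed_watermark.fetch_max(txn_id, Ordering::SeqCst);
    }
}
```

Under these assumptions, a row inserted by transaction 1 is invisible to a concurrent transaction 2 and becomes visible only to snapshots taken after transaction 1 commits.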


llkv-result

Path: llkv-result/

Purpose: Common error and result types used throughout the system.

Key Dependencies: None (foundational crate)

Responsibilities:

  • Error enum with all error variants
  • Result<T> type alias
  • Error conversion traits

Sources: Cargo.toml:18
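
The pattern described above (one error enum, a `Result<T>` alias, and conversion traits) can be sketched as follows. The variant names and the `parse_row_id` helper are hypothetical. The sketch hand-writes the trait implementations so that it needs only the standard library, whereas the workspace lists `thiserror` for error derivation. The module wrapper exists only to keep the `Result` alias from shadowing the standard one elsewhere.

```rust
pub mod error_sketch {
    use std::fmt;
    use std::num::ParseIntError;

    /// Hypothetical subset of error variants; the real enum covers many more cases.
    #[derive(Debug)]
    pub enum Error {
        Parse(ParseIntError),
        InvalidArgument(String),
    }

    impl fmt::Display for Error {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                Error::Parse(err) => write!(f, "parse error: {err}"),
                Error::InvalidArgument(msg) => write!(f, "invalid argument: {msg}"),
            }
        }
    }

    impl std::error::Error for Error {}

    /// Conversion trait that lets `?` lift a `ParseIntError` into `Error`.
    impl From<ParseIntError> for Error {
        fn from(err: ParseIntError) -> Self {
            Error::Parse(err)
        }
    }

    /// Crate-wide alias so signatures only name the success type.
    pub type Result<T> = std::result::Result<T, Error>;

    /// Hypothetical helper showing both an automatic conversion and an explicit error.
    pub fn parse_row_id(text: &str) -> Result<u64> {
        let id: u64 = text.trim().parse()?; // ParseIntError -> Error via From
        if id == 0 {
            return Err(Error::InvalidArgument("row id must be non-zero".into()));
        }
        Ok(id)
    }
}
```

Because every crate returns the same `Result<T>`, errors can cross crate boundaries with `?` alone, which is one reason a foundational error crate sits at the bottom of the dependency graph.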


Specialized Operations Crates

llkv-aggregate

Path: llkv-aggregate/

Purpose: Aggregate function evaluation including accumulators and distinct value tracking.

Key Dependencies:

  • arrow - Array operations

Responsibilities:

  • Aggregate function implementations (SUM, AVG, COUNT, MIN, MAX)
  • Accumulator state management
  • DISTINCT value tracking
  • Group-by hash table operations

Sources: Cargo.toml:11
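
As a rough illustration of accumulator state and DISTINCT tracking, the sketch below folds nullable integers into running COUNT, SUM, MIN, and MAX values. `IntAccumulator` is a hypothetical type operating on plain `Option<i64>` values; the actual crate works on Arrow arrays and its API is not shown here.

```rust
use std::collections::HashSet;

/// Hypothetical accumulator for COUNT, SUM, AVG, MIN, and MAX over nullable integers.
pub struct IntAccumulator {
    count: u64,
    sum: i64,
    min: Option<i64>,
    max: Option<i64>,
    /// Present only for DISTINCT aggregates: values already folded in.
    seen: Option<HashSet<i64>>,
}

impl IntAccumulator {
    pub fn new(distinct: bool) -> Self {
        Self { count: 0, sum: 0, min: None, max: None, seen: distinct.then(HashSet::new) }
    }

    /// Fold one input value into the state. NULLs (`None`) are ignored,
    /// as SQL aggregates do; repeated values are ignored when DISTINCT is set.
    pub fn update(&mut self, value: Option<i64>) {
        let Some(v) = value else { return };
        if let Some(seen) = &mut self.seen {
            if !seen.insert(v) {
                return;
            }
        }
        self.count += 1;
        self.sum += v;
        self.min = Some(self.min.map_or(v, |m| m.min(v)));
        self.max = Some(self.max.map_or(v, |m| m.max(v)));
    }

    pub fn count(&self) -> u64 {
        self.count
    }

    /// SUM over no non-NULL inputs is NULL, modelled here as `None`.
    pub fn sum(&self) -> Option<i64> {
        (self.count > 0).then_some(self.sum)
    }

    pub fn avg(&self) -> Option<f64> {
        (self.count > 0).then(|| self.sum as f64 / self.count as f64)
    }

    pub fn min(&self) -> Option<i64> {
        self.min
    }

    pub fn max(&self) -> Option<i64> {
        self.max
    }
}
```

A group-by operator would keep one such accumulator per group key in a hash table and call `update` for each row routed to that group.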


llkv-join

Path: llkv-join/

Purpose: Join algorithm implementations.

Key Dependencies:

  • arrow - RecordBatch operations
  • llkv-expr - Join predicates

Responsibilities:

  • Hash join implementation
  • Nested loop join
  • Join key extraction
  • Result materialization

Sources: Cargo.toml:16
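
The build and probe phases of a hash join can be sketched over plain slices as shown below. This is a generic inner equi-join written for illustration: the `hash_join` function and its row representation are assumptions, and the real crate operates on Arrow `RecordBatch` data rather than tuples.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Inner equi-join of two keyed row sets.
/// Output order follows the right (probe) side, then left insertion order.
pub fn hash_join<K, L, R>(left: &[(K, L)], right: &[(K, R)]) -> Vec<(L, R)>
where
    K: Eq + Hash,
    L: Clone,
    R: Clone,
{
    // Build phase: index every left row by its join key.
    let mut table: HashMap<&K, Vec<&L>> = HashMap::new();
    for (key, row) in left {
        table.entry(key).or_default().push(row);
    }

    // Probe phase: look up each right row and emit one output row per match.
    let mut out = Vec::new();
    for (key, right_row) in right {
        if let Some(matches) = table.get(key) {
            for left_row in matches {
                out.push((L::clone(left_row), right_row.clone()));
            }
        }
    }
    out
}
```

Building on the smaller input keeps the hash table compact. A nested loop join would replace the lookup with a scan of the left side for every right row, which is slower for equality predicates but also handles predicates that cannot be hashed.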


llkv-csv

Path: llkv-csv/

Purpose: CSV file ingestion and export utilities.

Key Dependencies:

  • llkv-table - Table operations
  • arrow - CSV reader integration

Responsibilities:

  • CSV to RecordBatch conversion
  • Bulk insert optimization
  • Schema inference from CSV headers

Sources: Cargo.toml:13
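
One plausible way to infer a schema is to take column names from the header row and widen each column's type as data rows are scanned. The sketch below does this with naive comma splitting (no quoting or escaping) and three hypothetical types. It only illustrates the idea; the real crate relies on Arrow's CSV reader, and its inference rules may differ.

```rust
/// Hypothetical column types, ordered from narrowest to widest.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnType {
    Int64,
    Float64,
    Utf8,
}

/// Pick the narrowest type that can represent values of both inputs.
fn widen(a: ColumnType, b: ColumnType) -> ColumnType {
    use ColumnType::*;
    match (a, b) {
        (Utf8, _) | (_, Utf8) => Utf8,
        (Float64, _) | (_, Float64) => Float64,
        _ => Int64,
    }
}

/// Infer `(name, type)` pairs from CSV text whose first line is the header.
/// Empty cells are treated as NULL and do not affect the inferred type.
/// A column with no non-empty cells defaults to `Int64` in this sketch.
pub fn infer_schema(csv: &str) -> Vec<(String, ColumnType)> {
    let mut lines = csv.lines();
    let Some(header) = lines.next() else { return Vec::new() };
    let names: Vec<&str> = header.split(',').map(str::trim).collect();
    let mut types = vec![ColumnType::Int64; names.len()];

    for line in lines {
        // Extra cells beyond the header width are ignored.
        for (i, cell) in line.split(',').map(str::trim).enumerate().take(names.len()) {
            if cell.is_empty() {
                continue;
            }
            let cell_type = if cell.parse::<i64>().is_ok() {
                ColumnType::Int64
            } else if cell.parse::<f64>().is_ok() {
                ColumnType::Float64
            } else {
                ColumnType::Utf8
            };
            types[i] = widen(types[i], cell_type);
        }
    }

    names.into_iter().map(String::from).zip(types).collect()
}
```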


Testing Infrastructure Crates

llkv-test-utils

Path: llkv-test-utils/

Purpose: Shared test utilities including tracing setup and common test fixtures.

Key Dependencies:

  • tracing-subscriber - Logging configuration

Responsibilities:

  • Consistent tracing initialization across tests
  • Common test helpers
  • Auto-initialization feature for convenience

Sources: Cargo.toml:24


llkv-slt-tester

Path: llkv-slt-tester/

Purpose: SQL Logic Test runner providing standardized correctness testing.

Key Dependencies:

  • llkv-sql - SQL execution
  • sqllogictest - Test framework (version 0.28.4)

Responsibilities:

  • .slt file discovery and execution
  • Remote test suite fetching (.slturl files)
  • Test result comparison
  • AsyncDB adapter for LLKV

Primary Types:

  • LlkvSltRunner - Test runner
  • EngineHarness - Adapter interface

Sources: Cargo.toml:20


llkv-tpch

Path: llkv-tpch/

Purpose: TPC-H benchmark suite for performance testing.

Key Dependencies:

  • llkv - Database interface
  • llkv-sql - SQL execution
  • tpchgen - Data generation (version 2.0.1)

Responsibilities:

  • TPC-H data generation at various scale factors
  • Query execution (Q1-Q22)
  • Performance measurement
  • Benchmark result reporting

Sources: Cargo.toml:62


Demonstration Applications

llkv-sql-pong-demo

Path: demos/llkv-sql-pong-demo/

Purpose: Interactive demonstration showing LLKV's SQL capabilities through a Pong game implemented in SQL.

Key Dependencies:

  • llkv-sql - SQL execution
  • crossterm - Terminal UI (version 0.29.0)

Responsibilities:

  • Terminal-based interactive interface
  • Real-time SQL query execution
  • Game state management via SQL tables
  • User input handling

Sources: Cargo.toml:86


Crate Dependency Graph

The following diagram shows the direct dependencies between workspace crates. Arrows point from dependent crates to their dependencies.

Crate Dependencies:

graph LR
    LLKV["llkv"]
    SQL["llkv-sql"]
    PLAN["llkv-plan"]
    EXPR["llkv-expr"]
    RUNTIME["llkv-runtime"]
    EXECUTOR["llkv-executor"]
    TABLE["llkv-table"]
    COLMAP["llkv-column-map"]
    STORAGE["llkv-storage"]
    TXN["llkv-transaction"]
    RESULT["llkv-result"]
    AGG["llkv-aggregate"]
    JOIN["llkv-join"]
    CSV["llkv-csv"]
    SLT["llkv-slt-tester"]
    TESTUTIL["llkv-test-utils"]
    TPCH["llkv-tpch"]
    DEMO["llkv-sql-pong-demo"]

    LLKV --> SQL
    LLKV --> RUNTIME

    SQL --> PLAN
    SQL --> EXPR
    SQL --> RUNTIME
    SQL --> EXECUTOR
    SQL --> TABLE
    SQL --> TXN

    RUNTIME --> EXECUTOR
    RUNTIME --> TABLE
    RUNTIME --> TXN

    EXECUTOR --> PLAN
    EXECUTOR --> EXPR
    EXECUTOR --> TABLE
    EXECUTOR --> AGG
    EXECUTOR --> JOIN

    TABLE --> COLMAP
    TABLE --> EXPR
    TABLE --> PLAN
    TABLE --> STORAGE

    COLMAP --> STORAGE
    COLMAP --> EXPR

    PLAN --> EXPR
    PLAN --> RESULT

    CSV --> TABLE

    TXN --> RESULT
    STORAGE --> RESULT
    EXPR --> RESULT
    COLMAP --> RESULT
    TABLE --> RESULT

    SLT --> SQL
    SLT --> RUNTIME
    SLT --> TESTUTIL

    TPCH --> LLKV
    TPCH --> SQL

    DEMO --> SQL

Sources: Cargo.toml:9-25 llkv-table/Cargo.toml:14-31 llkv-plan/Cargo.toml:14-24

Key Observations:

  1. llkv-result is a foundational crate with no internal dependencies, consumed by nearly all other crates for error handling.

  2. llkv-expr depends only on llkv-result, making it a stable base for expression handling across the system.

  3. llkv-plan builds on llkv-expr and adds plan-specific structures.

  4. llkv-storage and llkv-transaction are independent of each other, allowing flexibility in storage backend selection.

  5. llkv-table integrates storage, expressions, and planning to provide a cohesive data layer.

  6. llkv-executor coordinates specialized operations (aggregate, join) and table access.

  7. llkv-runtime sits at the top of the execution stack, orchestrating transactions and query execution.

  8. llkv-sql ties together all layers to provide the SQL interface.
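
The observations above can be checked mechanically. The sketch below transcribes the edges of the dependency diagram in this section into a constant and runs Kahn's algorithm to produce an order in which every crate appears after its dependencies. The `EDGES` table and `build_order` function are illustrative helpers, not part of the workspace, and the edge list is only as accurate as the diagram it was copied from.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// `(dependent, dependency)` pairs transcribed from the dependency diagram.
pub const EDGES: &[(&str, &str)] = &[
    ("llkv", "llkv-sql"), ("llkv", "llkv-runtime"),
    ("llkv-sql", "llkv-plan"), ("llkv-sql", "llkv-expr"), ("llkv-sql", "llkv-runtime"),
    ("llkv-sql", "llkv-executor"), ("llkv-sql", "llkv-table"), ("llkv-sql", "llkv-transaction"),
    ("llkv-runtime", "llkv-executor"), ("llkv-runtime", "llkv-table"),
    ("llkv-runtime", "llkv-transaction"),
    ("llkv-executor", "llkv-plan"), ("llkv-executor", "llkv-expr"), ("llkv-executor", "llkv-table"),
    ("llkv-executor", "llkv-aggregate"), ("llkv-executor", "llkv-join"),
    ("llkv-table", "llkv-column-map"), ("llkv-table", "llkv-expr"),
    ("llkv-table", "llkv-plan"), ("llkv-table", "llkv-storage"),
    ("llkv-column-map", "llkv-storage"), ("llkv-column-map", "llkv-expr"),
    ("llkv-plan", "llkv-expr"), ("llkv-plan", "llkv-result"),
    ("llkv-csv", "llkv-table"),
    ("llkv-transaction", "llkv-result"), ("llkv-storage", "llkv-result"),
    ("llkv-expr", "llkv-result"), ("llkv-column-map", "llkv-result"),
    ("llkv-table", "llkv-result"),
    ("llkv-slt-tester", "llkv-sql"), ("llkv-slt-tester", "llkv-runtime"),
    ("llkv-slt-tester", "llkv-test-utils"),
    ("llkv-tpch", "llkv"), ("llkv-tpch", "llkv-sql"),
    ("llkv-sql-pong-demo", "llkv-sql"),
];

/// Return crates in an order where each crate follows all of its dependencies,
/// or `None` if the edges contain a cycle (Kahn's algorithm).
pub fn build_order<'a>(edges: &[(&'a str, &'a str)]) -> Option<Vec<&'a str>> {
    // Number of not-yet-emitted dependencies per crate.
    let mut deps_left: BTreeMap<&'a str, usize> = BTreeMap::new();
    // Reverse adjacency: dependency -> crates that depend on it.
    let mut dependents: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for &(from, to) in edges {
        *deps_left.entry(from).or_insert(0) += 1;
        deps_left.entry(to).or_insert(0);
        dependents.entry(to).or_default().push(from);
    }

    let mut ready: BTreeSet<&'a str> = deps_left
        .iter()
        .filter(|(_, left)| **left == 0)
        .map(|(name, _)| *name)
        .collect();
    let mut order = Vec::with_capacity(deps_left.len());
    while let Some(name) = ready.pop_first() {
        order.push(name);
        for &dependent in dependents.get(name).into_iter().flatten() {
            let left = deps_left.get_mut(dependent).expect("registered above");
            *left -= 1;
            if *left == 0 {
                ready.insert(dependent);
            }
        }
    }
    // Any crate left unemitted sits on a cycle.
    (order.len() == deps_left.len()).then_some(order)
}
```

If the diagram is accurate, `build_order(EDGES)` succeeds, which confirms that the internal graph is acyclic, and llkv-result never appears as a dependent, which matches its role as the foundational crate.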


Mapping Crates to System Layers

This diagram shows how workspace crates map to the architectural layers described in Architecture.

Layered Architecture Mapping:

Sources: Cargo.toml:67-88


External Dependencies

The workspace declares several critical external dependencies that enable core functionality.

Apache Arrow Ecosystem

Version: 57.0.0

Crates:

  • arrow - Core Arrow functionality with prettyprint and IPC features
  • arrow-array - Array implementations
  • arrow-schema - Schema types
  • arrow-buffer - Buffer management
  • arrow-ord - Ordering operations

Usage: Arrow provides the universal columnar data format throughout LLKV. RecordBatch is used as the data interchange format at every layer, enabling zero-copy operations and SIMD-friendly processing.

Sources: Cargo.toml:32-36


SQL Parsing

Crate: sqlparser
Version: 0.59.0

Usage: Parses SQL text into AST nodes. Used by llkv-sql and llkv-plan to convert SQL queries into typed plan structures.

Sources: Cargo.toml:52


SIMD-Optimized Storage

Crate: simd-r-drive
Version: 0.15.5-alpha

Usage: Provides memory-mapped, SIMD-accelerated persistent storage backend. The SimdRDrivePager implementation in llkv-storage uses this for zero-copy array access.

Related: simd-r-drive-entry-handle for Arrow buffer integration

Sources: Cargo.toml:26-27


Testing and Benchmarking

Key Dependencies:

| Crate | Version | Purpose |
|---|---|---|
| criterion | 0.7.0 | Performance benchmarking |
| sqllogictest | 0.28.4 | SQL correctness testing |
| tpchgen | 2.0.1 | TPC-H data generation |
| libtest-mimic | 0.8 | Custom test harness |

Sources: Cargo.toml:40-62


Utilities

Key Dependencies:

| Crate | Version | Purpose |
|---|---|---|
| rayon | 1.10.0 | Data parallelism |
| rustc-hash | 2.1.1 | Fast hash maps |
| bitcode | 0.6.7 | Binary serialization |
| thiserror | 2.0.17 | Error trait derivation |
| serde | 1.0.228 | Serialization framework |

Sources: Cargo.toml:37-64


Workspace Configuration

The workspace is configured with shared package metadata and dependency versions to ensure consistency across all crates.

Shared Package Metadata:

Build Settings:

  • Edition: 2024 (Rust 2024 edition)
  • Resolver: Version 2 (new dependency resolver)
  • Version: 0.8.2-alpha (all crates share this version)

Sources: Cargo.toml:1-8 Cargo.toml:88


Summary Table

| Crate | Layer | Primary Responsibility | Key Dependencies |
|---|---|---|---|
| llkv | Entry Point | Main library API | llkv-sql, llkv-runtime |
| llkv-sql | SQL Processing | SQL parsing and execution | llkv-plan, llkv-runtime, sqlparser |
| llkv-plan | SQL Processing | Query plan structures | llkv-expr, sqlparser |
| llkv-expr | SQL Processing | Expression AST | arrow |
| llkv-runtime | Execution | Transaction orchestration | llkv-executor, llkv-table |
| llkv-executor | Execution | Query execution | llkv-table, llkv-aggregate |
| llkv-table | Data Management | Schema-aware tables | llkv-column-map, llkv-storage |
| llkv-column-map | Data Management | Columnar storage | llkv-storage, arrow |
| llkv-storage | Storage | Storage abstraction | simd-r-drive (optional) |
| llkv-transaction | Data Management | MVCC manager | - |
| llkv-aggregate | Specialized Ops | Aggregation functions | arrow |
| llkv-join | Specialized Ops | Join algorithms | arrow |
| llkv-csv | Specialized Ops | CSV import/export | llkv-table |
| llkv-result | Foundation | Error types | - |
| llkv-test-utils | Testing | Test utilities | tracing-subscriber |
| llkv-slt-tester | Testing | SQL logic tests | llkv-sql, sqllogictest |
| llkv-tpch | Testing | TPC-H benchmarks | llkv-sql, tpchgen |
| llkv-sql-pong-demo | Demo | Interactive demo | llkv-sql, crossterm |

Sources: Cargo.toml:1-89