This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
Query Execution
Relevant source files
- llkv-executor/Cargo.toml
- llkv-executor/src/lib.rs
- llkv-plan/src/plans.rs
- llkv-table/src/planner/mod.rs
- llkv-table/src/scalar_eval.rs
Purpose and Scope
Query execution is the process of converting logical query plans into physical result sets by coordinating table scans, expression evaluation, aggregation, joins, and result streaming. This page documents the execution engine's architecture, core components, and high-level execution flow.
For details on table-level planning and execution, see TablePlanner and TableExecutor. For scan optimization strategies, see Scan Execution and Optimization. For predicate evaluation mechanics, see Filter Evaluation.
System Architecture
Query execution spans two primary crates:
| Crate | Responsibility | Key Types |
|---|---|---|
llkv-executor | Orchestrates multi-table queries, aggregates, and result formatting | QueryExecutor, SelectExecution |
llkv-table | Executes table scans with predicates and projections | TablePlanner, TableExecutor |
The executor operates on logical plans produced by llkv-plan and delegates to llkv-table for single-table operations, llkv-join for join algorithms, and llkv-aggregate for aggregate computations.
Execution Architecture
graph TB
subgraph "Plan Layer"
PLAN["SelectPlan\n(llkv-plan)"]
end
subgraph "Execution Orchestration (llkv-executor)"
QE["QueryExecutor<P>"]
EXEC["SelectExecution<P>"]
STRAT["Strategy Selection:\nprojection, aggregate,\njoin, compound"]
end
subgraph "Table Execution (llkv-table)"
TP["TablePlanner"]
TE["TableExecutor"]
PS["PlannedScan"]
end
subgraph "Specialized Operations"
AGG["llkv-aggregate\nAccumulator"]
JOIN["llkv-join\nhash_join, cross_join"]
EVAL["NumericKernels\nscalar evaluation"]
end
subgraph "Storage"
TABLE["Table<P>"]
STORE["ColumnStore"]
end
PLAN --> QE
QE --> STRAT
STRAT -->|single table| TP
STRAT -->|aggregates| AGG
STRAT -->|joins| JOIN
TP --> PS
PS --> TE
TE --> TABLE
TABLE --> STORE
STRAT --> EVAL
EVAL --> TABLE
QE --> EXEC
TE --> EXEC
AGG --> EXEC
JOIN --> EXEC
Sources: llkv-executor/src/lib.rs:507-521 llkv-table/src/planner/mod.rs:580-726
Core Components
QueryExecutor
QueryExecutor<P> is the top-level execution coordinator in llkv-executor. It consumes SelectPlan structures and produces SelectExecution result containers.
Key Responsibilities:
- Strategy selection based on plan characteristics (single table, joins, aggregates, compound operations)
- Multi-table query orchestration (cross products, hash joins)
- Aggregate computation coordination
- Subquery evaluation (correlated EXISTS, scalar subqueries)
- Result streaming and batching
Entry points:
execute_select(plan: SelectPlan)- Execute a SELECT plan llkv-executor/src/lib.rs:523-525execute_select_with_filter(plan, row_filter)- Execute with optional MVCC filter llkv-executor/src/lib.rs:527-569
Sources: llkv-executor/src/lib.rs:507-521
SelectExecution
SelectExecution<P> encapsulates query results and provides streaming access via batched iteration. Results may be materialized upfront or generated lazily depending on the execution strategy.
Streaming Interface:
stream<F>(on_batch: F)- Process results batch-by-batchinto_rows()- Materialize all rows into memory (for sorting, deduplication)schema()- Access result schema
Sources: llkv-executor/src/lib.rs:2500-2700 (approximate location based on file structure)
TablePlanner and TableExecutor
The table-level execution layer handles single-table scans with predicates and projections. TablePlanner analyzes the request and produces a PlannedScan, which TableExecutor then executes.
Planning Process:
- Validate projections against schema
- Normalize filter predicates (apply De Morgan's laws, flatten boolean operators)
- Compile predicates into
EvalProgramandDomainProgrambytecode - Build
PlanGraphmetadata for tracing
Execution Process:
- Try optimized fast paths (single column scans, full table scans)
- Fall back to general execution with expression evaluation
- Stream results in batches
These components are detailed in TablePlanner and TableExecutor.
Sources: llkv-table/src/planner/mod.rs:580-637 llkv-table/src/planner/mod.rs:728-1007
Execution Flow
Top-Level SELECT Execution Sequence
Sources: llkv-executor/src/lib.rs:523-569 llkv-table/src/planner/mod.rs:595-607 llkv-table/src/planner/mod.rs:1009-1400
graph TD
START["SelectPlan"] --> COMPOUND{compound?}
COMPOUND -->|yes| EXEC_COMPOUND["execute_compound_select\nUNION/EXCEPT/INTERSECT"]
COMPOUND -->|no| FROM{tables.is_empty?}
FROM -->|yes| EXEC_CONST["execute_select_without_table\nEvaluate constant expressions"]
FROM -->|no| MULTI{tables.len > 1?}
MULTI -->|yes| EXEC_CROSS["execute_cross_product\nor hash_join optimization"]
MULTI -->|no| GROUPBY{group_by.is_empty?}
GROUPBY -->|no| EXEC_GROUP["execute_group_by_single_table\nGroup rows + compute aggregates"]
GROUPBY -->|yes| AGG{aggregates.is_empty?}
AGG -->|no| EXEC_AGG["execute_aggregates\nCollect all rows + compute"]
AGG -->|yes| COMPUTED{has_computed_aggregates?}
COMPUTED -->|yes| EXEC_COMP_AGG["execute_computed_aggregates\nEmbedded agg in expressions"]
COMPUTED -->|no| EXEC_PROJ["execute_projection\nStream scan with projections"]
EXEC_COMPOUND --> RESULT["SelectExecution"]
EXEC_CONST --> RESULT
EXEC_CROSS --> RESULT
EXEC_GROUP --> RESULT
EXEC_AGG --> RESULT
EXEC_COMP_AGG --> RESULT
EXEC_PROJ --> RESULT
Execution Strategies
QueryExecutor selects an execution strategy based on plan characteristics:
Strategy Decision Tree
Sources: llkv-executor/src/lib.rs:527-569
Strategy Implementations
| Strategy | Method | When Applied | Key Operations |
|---|---|---|---|
| Constant Evaluation | execute_select_without_table | No FROM clause | Evaluate literals, struct constructors |
| Simple Projection | execute_projection | Single table, no aggregates | Stream scan with filter + projections |
| Aggregation | execute_aggregates | Has aggregates, no GROUP BY | Collect all rows, compute aggregates, emit single row |
| Grouped Aggregation | execute_group_by_single_table | Has GROUP BY | Hash rows by key, compute per-group aggregates |
| Computed Aggregates | execute_computed_aggregates | Aggregates embedded in computed projections | Extract aggregate expressions, evaluate separately |
| Cross Product | execute_cross_product | Multiple tables | Cartesian product or hash join optimization |
| Compound | execute_compound_select | UNION/EXCEPT/INTERSECT | Execute components, apply set operations |
Sources: llkv-executor/src/lib.rs:926-975 llkv-executor/src/lib.rs:1700-2100 llkv-executor/src/lib.rs:2200-2400 llkv-executor/src/lib.rs:1057-1400 llkv-executor/src/lib.rs:590-701
Streaming Execution Model
LLKV executes queries in a streaming fashion to avoid materializing large intermediate results. Results flow through the system as RecordBatch chunks (typically 4096 rows).
Streaming Characteristics:
| Execution Type | Streaming Behavior | Memory Characteristics |
|---|---|---|
| Projection | Full streaming | O(batch_size) memory |
| Filter | Full streaming | O(batch_size) memory |
| Aggregates | Requires full materialization | O(input_rows) memory |
| GROUP BY | Requires full materialization | O(group_count) memory |
| ORDER BY | Requires full materialization | O(input_rows) memory |
| DISTINCT | Requires full materialization | O(distinct_rows) memory |
| LIMIT | Early termination | O(limit × batch_size) memory |
Streaming Projection Example Flow:
Sources: llkv-table/src/planner/mod.rs:1009-1400 llkv-table/src/constants.rs:1-10 (defines STREAM_BATCH_ROWS = 4096)
Materialization Points
Certain operations require collecting all rows before producing output:
- Sorting - Must see all rows to determine order llkv-executor/src/lib.rs:2800-2900
- Deduplication (DISTINCT) - Must track all seen rows llkv-executor/src/lib.rs:2950-3050
- Aggregation - Must accumulate state across all rows llkv-executor/src/lib.rs:1700-1900
- Set Operations - Must materialize both sides for comparison llkv-executor/src/lib.rs:590-701
These operations call into_rows() on SelectExecution to materialize results as Vec<CanonicalRow>.
Sources: llkv-executor/src/lib.rs:2600-2700
Expression Evaluation
Query execution evaluates two types of expressions:
Predicate Evaluation (Filtering)
Predicates are compiled to bytecode and evaluated during table scans:
- Normalization - Apply De Morgan's laws, flatten AND/OR llkv-table/src/planner/program.rs:50-150
- Compilation - Convert to
EvalProgram(stack-based) andDomainProgram(row tracking) llkv-table/src/planner/program.rs:200-400 - Vectorized Evaluation - Process chunks of rows efficiently llkv-table/src/planner/mod.rs:1100-1300
See Filter Evaluation for detailed mechanics.
Scalar Expression Evaluation (Projections)
Computed projections are evaluated row-by-row or vectorized when possible:
- Translation - Convert
ScalarExpr<String>toScalarExpr<FieldId>llkv-executor/src/translation/scalar.rs:1-200 - Type Inference - Determine output data type llkv-executor/src/translation/schema.rs:50-150
- Evaluation - Use
NumericKernelsfor numeric operations llkv-table/src/scalar_eval.rs:450-685
Vectorized vs Row-by-Row:
Sources: llkv-table/src/scalar_eval.rs:675-712 llkv-table/src/scalar_eval.rs:549-673
Integration with Runtime
The execution layer coordinates with llkv-runtime for transaction and catalog management:
Runtime Integration Points:
| Operation | Runtime Responsibility | Executor Responsibility |
|---|---|---|
| Table Lookup | CatalogManager::table() | ExecutorTableProvider::get_table() |
| MVCC Filtering | Provide RowIdFilter with snapshot | Apply filter during scan |
| Transaction State | Track transaction ID, commit watermark | Include created_by/deleted_by in scans |
| Schema Resolution | Maintain system catalog | Translate column names to FieldId |
The ExecutorTableProvider trait abstracts runtime integration, allowing executor to remain runtime-agnostic.
Sources: llkv-executor/src/types.rs:100-200 llkv-runtime/src/catalog/mod.rs:50-150
Performance Characteristics
Execution performance depends on query characteristics and chosen strategy:
| Query Pattern | Typical Performance | Optimization Opportunities |
|---|---|---|
| SELECT * FROM t | ~1M rows/sec | Fast path: shadow column scan llkv-table/src/planner/mod.rs:765-821 |
| SELECT col FROM t WHERE pred | ~500K rows/sec | Predicate fusion llkv-table/src/planner/mod.rs:518-570 |
| Single-table aggregates | Full table scan | Column-only projections for aggregate inputs |
| Hash join (2 tables) | O(n + m) with O(n) memory | Smaller table as build side llkv-executor/src/lib.rs:1500-1700 |
| Cross product (n tables) | O(∏ row_counts) | Avoid if possible; rewrite to joins |
Sources: llkv-table/src/planner/mod.rs:738-856 llkv-executor/src/lib.rs:1082-1400