This documentation is part of the "Projects with Books" initiative at zenOSmosis.
The source code for this project is available on GitHub.
TablePlanner and TableExecutor
Relevant source files
This document describes the table-level query planning and execution system in LLKV. The TablePlanner translates scan operations into optimized execution plans, while the TableExecutor implements multiple execution strategies to materialize query results efficiently. For information about the broader query execution pipeline, see Query Execution. For details on expression compilation and evaluation, see Program Compilation and Scalar Evaluation and NumericKernels.
Purpose and Scope
The TablePlanner and TableExecutor form the core of LLKV's table-level query execution. They bridge the gap between logical query plans (from llkv-plan) and physical data access (through llkv-column-map). This document covers:
- How scan operations are planned and optimized
- The structure of compiled execution plans (
PlannedScan) - Multiple execution strategies and their trade-offs
- Predicate fusion optimization
- Integration with MVCC row filtering
- Streaming result materialization
Architecture Overview
Sources: llkv-table/src/planner/mod.rs:580-726
TablePlanner
The TablePlanner is responsible for analyzing scan requests and constructing optimized execution plans. It does not execute queries itself but prepares all necessary metadata for the TableExecutor.
Structure
The planner holds a reference to the Table being scanned and provides a single public method: scan_stream_with_exprs.
Sources: llkv-table/src/planner/mod.rs:580-593
Planning Flow
The planning process consists of several stages:
- Validation : Ensures at least one projection is specified
- Normalization : Applies De Morgan's laws and flattens logical operators via
normalize_predicate - Graph Construction : Builds a
PlanGraphfor visualization and introspection - Program Compilation : Compiles filter expressions into
EvalProgramandDomainProgrambytecode
Sources: llkv-table/src/planner/mod.rs:595-637
PlanGraph Construction
The build_plan_graph method creates a directed acyclic graph (DAG) representing the logical query plan:
| Node Type | Purpose | Metadata |
|---|---|---|
TableScan | Entry point for data access | table_id, projection_count |
Filter | Predicate application | predicates (formatted expressions) |
Project | Column selection and computation | projections, fields with types |
Output | Result materialization | include_nulls flag |
Sources: llkv-table/src/planner/mod.rs:639-725
PlannedScan Structure
The PlannedScan is an intermediate representation that bridges planning and execution. It contains all information needed to execute a scan without holding runtime state.
| Field | Type | Purpose |
|---|---|---|
projections | Vec<ScanProjection> | Column and computed projections to materialize |
filter_expr | Arc<Expr<FieldId>> | Normalized filter predicate |
options | ScanStreamOptions<P> | MVCC filters, ordering, null handling |
plan_graph | PlanGraph | Logical plan for introspection |
programs | ProgramSet | Compiled bytecode for predicate evaluation |
Sources: llkv-table/src/planner/mod.rs:500-509
TableExecutor
The TableExecutor implements multiple execution strategies, selecting the most efficient based on query characteristics.
Structure
The executor maintains a cache of row IDs to avoid redundant scans when multiple operations target the same table.
Sources: llkv-table/src/planner/mod.rs:572-578
Execution Strategy Selection
Sources: llkv-table/src/planner/mod.rs:1009-1367
Single Column Direct Scan Fast Path
The try_single_column_direct_scan optimization applies when:
- Exactly one projection is requested
include_nullsisfalse- Filter is either trivial or a full-range predicate on the projected column
- Column type is not
Utf8orLargeUtf8(to avoid string complexity)
graph TD
CHECK1{projections.len == 1?} -->|No| FALLBACK1[Fallback]
CHECK1 -->|Yes| CHECK2{include_nulls == false?}
CHECK2 -->|No| FALLBACK2[Fallback]
CHECK2 -->|Yes| PROJ_TYPE{Projection type?}
PROJ_TYPE -->|Column| CHECK_FILTER["is_full_range_filter?"]
PROJ_TYPE -->|Computed| CHECK_COMPUTED[Single field?]
CHECK_FILTER -->|No| FALLBACK3[Fallback]
CHECK_FILTER -->|Yes| CHECK_DTYPE{dtype?}
CHECK_DTYPE -->|Utf8/LargeUtf8| FALLBACK4[Fallback]
CHECK_DTYPE -->|Other| DIRECT_SCAN[SingleColumnStreamVisitor]
CHECK_COMPUTED -->|No| FALLBACK5[Fallback]
CHECK_COMPUTED -->|Yes| COMPUTE_TYPE{Passthrough or Affine?}
COMPUTE_TYPE -->|Passthrough| DIRECT_SCAN2[SingleColumnStreamVisitor]
COMPUTE_TYPE -->|Affine| AFFINE_SCAN[AffineSingleColumnVisitor]
COMPUTE_TYPE -->|Other| COMPUTE_SCAN[ComputedSingleColumnVisitor]
DIRECT_SCAN --> HANDLED[StreamOutcome::Handled]
DIRECT_SCAN2 --> HANDLED
AFFINE_SCAN --> HANDLED
COMPUTE_SCAN --> HANDLED
This path streams data directly from storage using ScanBuilder without building intermediate row ID lists or using RowStreamBuilder.
Sources: llkv-table/src/planner/mod.rs:1369-1530
Full Table Scan Streaming Fast Path
The try_stream_full_table_scan optimization applies when:
- Filter is trivial (no predicates)
- No ordering is required (
options.order.is_none())
graph TD
CHECK_ORDER{order.is_some?} -->|Yes| FALLBACK[Fallback]
CHECK_ORDER -->|No| STREAM_START[stream_table_row_ids]
STREAM_START --> SHADOW{Shadow column exists?}
SHADOW -->|Yes| CHUNK[Emit row_id chunks]
SHADOW -->|No| FALLBACK2[Multi-column scan fallback]
CHUNK --> MVCC_FILTER{row_id_filter?}
MVCC_FILTER -->|Yes| APPLY[filter.filter]
MVCC_FILTER -->|No| BUILD
APPLY --> BUILD
BUILD[RowStreamBuilder] --> GATHER[Gather columns]
GATHER --> EMIT[Emit RecordBatch]
EMIT --> MORE{More chunks?}
MORE -->|Yes| CHUNK
MORE -->|No| CHECK_EMPTY
CHECK_EMPTY{any_emitted?} -->|No| SYNTHETIC[emit_synthetic_null_batch]
CHECK_EMPTY -->|Yes| HANDLED[StreamOutcome::Handled]
SYNTHETIC --> HANDLED
This path uses stream_table_row_ids to enumerate row IDs in chunks directly from the row_id shadow column, avoiding full predicate evaluation.
This optimization is particularly effective for queries like SELECT col1, col2 FROM table with no WHERE clause.
Sources: llkv-table/src/planner/mod.rs:905-999
General Execution Path
When fast paths don't apply, the executor follows a multi-stage process:
Stage 1: Projection Metadata Construction
The executor builds several data structures:
| Structure | Purpose |
|---|---|
projection_evals | Vec<ProjectionEval> mapping projections to evaluation strategies |
unique_lfids | Vec<LogicalFieldId> of columns to fetch from storage |
unique_index | FxHashMap<LogicalFieldId, usize> for column position lookup |
numeric_fields | FxHashSet<FieldId> of columns needing numeric coercion |
passthrough_fields | Vec<Option<FieldId>> for identity computed projections |
Sources: llkv-table/src/planner/mod.rs:1033-1117
Stage 2: Row ID Collection
For trivial filters, the executor scans the MVCC created_by column to enumerate all rows (including those with NULL values in user columns). For non-trivial filters, it evaluates the compiled ProgramSet.
Sources: llkv-table/src/planner/mod.rs:1269-1327
Stage 3: Streaming Execution
The RowStreamBuilder materializes results in batches of STREAM_BATCH_ROWS (default: 1024). For each batch:
- Gather : Fetch column data for row IDs via
MultiGatherContext - Evaluate : Compute any
ProjectionEval::Computedexpressions - Materialize : Construct
RecordBatchwith final schema - Emit : Call user-provided callback
Sources: llkv-table/src/planner/mod.rs:1337-1365
Program Compilation
The ProgramCompiler translates filter expressions into stack-based bytecode for efficient evaluation. It produces two programs:
- EvalProgram : Evaluates predicates and returns matching row IDs
- DomainProgram : Computes the domain (all potentially relevant row IDs) for NOT operations
ProgramSet Structure
Bytecode Operations
| Opcode | Stack Effect | Purpose |
|---|---|---|
PushPredicate(filter) | [] → [rows] | Evaluate single predicate |
PushCompare{left, op, right} | [] → [rows] | Evaluate comparison expression |
PushInList{expr, list, negated} | [] → [rows] | Evaluate IN list |
PushIsNull{expr, negated} | [] → [rows] | Evaluate IS NULL |
PushLiteral(bool) | [] → [rows] | Push all rows (true) or empty (false) |
FusedAnd{field_id, filters} | [] → [rows] | Apply multiple predicates on same field |
And{child_count} | [r1, r2, ...] → [r] | Intersect N row ID sets |
Or{child_count} | [r1, r2, ...] → [r] | Union N row ID sets |
Not{domain} | [matched] → [domain - matched] | Set difference using domain program |
Sources: llkv-table/src/planner/program.rs:1-200 (referenced but not in provided files)
Execution Example
Consider the filter: WHERE (age > 18 AND age < 65) OR name = 'Alice'
After normalization and compilation:
Stack Operations:
1. PushCompare(age > 18) → [rows1]
2. PushCompare(age < 65) → [rows1, rows2]
3. And{2} → [rows1 ∩ rows2]
4. PushPredicate(name = 'Alice') → [rows1 ∩ rows2, rows3]
5. Or{2} → [(rows1 ∩ rows2) ∪ rows3]
The collect_row_ids_for_program method executes this bytecode by maintaining a stack of row ID vectors and applying set operations.
Sources: llkv-table/src/planner/mod.rs:2376-2502
graph TD
ANALYZE[Analyze filter expression] --> BUILD["Build per_field stats"]
BUILD --> STATS["FieldPredicateStats:\ntotal, contains"]
STATS --> CHECK{should_fuse?}
CHECK --> DTYPE{Data type?}
DTYPE -->|Utf8/LargeUtf8| STRING_RULE["contains ≥ 1 AND total ≥ 2"]
DTYPE -->|Other| NUMERIC_RULE["total ≥ 2"]
STRING_RULE -->|Yes| FUSE[Generate FusedAnd opcode]
NUMERIC_RULE -->|Yes| FUSE
STRING_RULE -->|No| SEPARATE[Evaluate predicates separately]
NUMERIC_RULE -->|No| SEPARATE
Predicate Fusion
The PredicateFusionCache analyzes filter expressions to identify opportunities for fused predicate evaluation.
Fusion Strategy
Predicate fusion is particularly beneficial for string columns with CONTAINS operations, where multiple predicates on the same field can be evaluated in a single storage scan.
Example: WHERE name LIKE '%Smith%' AND name LIKE '%John%' can be fused into a single scan with two pattern matchers.
Sources: llkv-table/src/planner/mod.rs:517-570 llkv-table/src/planner/mod.rs:2504-2580
Projection Evaluation
The ProjectionEval enum handles two types of projections:
Column Projections
Direct column references that can be gathered from storage without computation.
Computed Projections
Expressions evaluated per-row using NumericKernels. The executor optimizes several patterns:
| Pattern | Optimization |
|---|---|
column | Passthrough (no computation) |
a * column + b | Affine transformation (vectorized) |
| General expression | Full expression evaluation |
Sources: llkv-table/src/planner/mod.rs:482-498 llkv-table/src/planner/mod.rs:1110-1117
Row ID Collection Strategies
The executor uses different strategies for collecting matching row IDs based on predicate characteristics:
Simple Predicates
For predicates on a single field (e.g., age > 18):
Sources: llkv-table/src/planner/mod.rs:1532-1612
Comparison Expressions
For comparisons involving multiple fields (e.g., col1 + col2 > 10):
This approach minimizes wasted computation by first identifying a "domain" of potentially matching rows (intersection of rows where all referenced columns have values) before evaluating the full expression.
Sources: llkv-table/src/planner/mod.rs:1699-1775 llkv-table/src/planner/mod.rs:2214-2374
IN List Expressions
For IN list predicates (e.g., status IN ('active', 'pending')):
The IN list evaluator properly handles SQL's three-valued logic:
- If value matches any list element:
TRUE(orFALSEif negated) - If no match but list contains NULL:
NULL(indeterminate) - If no match and no NULLs:
FALSE(orTRUEif negated)
Sources: llkv-table/src/planner/mod.rs:2001-2044 llkv-table/src/planner/mod.rs:1841-1999
graph TD
COLLECT[Collect row IDs from predicates] --> FILTER{row_id_filter.is_some?}
FILTER -->|No| CONTINUE[Continue to streaming]
FILTER -->|Yes| APPLY["filter.filter(table, row_ids)"]
APPLY --> CHECK_VISIBILITY[Check MVCC columns]
CHECK_VISIBILITY --> CREATED["created_by ≤ txn_id"]
CHECK_VISIBILITY --> DELETED["deleted_by > txn_id OR NULL"]
CREATED --> VISIBLE{Visible?}
DELETED --> VISIBLE
VISIBLE -->|Yes| KEEP[Keep row ID]
VISIBLE -->|No| DROP[Drop row ID]
KEEP --> FILTERED[Filtered row IDs]
DROP --> FILTERED
FILTERED --> CONTINUE
MVCC Integration
The executor integrates with LLKV's MVCC system through the row_id_filter option in ScanStreamOptions. After collecting row IDs through predicate evaluation, the filter determines which rows are visible to the current transaction:
For trivial filters, the executor explicitly scans the created_by MVCC column to enumerate all rows, ensuring that rows with NULL values in user columns are included when appropriate.
Sources: llkv-table/src/planner/mod.rs:1269-1323
Performance Characteristics
The table below summarizes the time complexity of different execution paths:
| Execution Path | Conditions | Row ID Collection | Data Gathering | Total |
|---|---|---|---|---|
| Single Column Direct | 1 projection, trivial/full-range filter | O(1) | O(n) streaming | O(n) |
| Full Table Stream | Trivial filter, no order | O(n) via shadow column | O(n) streaming | O(n) |
| General (indexed predicate) | Single-field predicate with index | O(log n + m) | O(m × c) | O(log n + m × c) |
| General (complex predicate) | Multi-field or computed predicate | O(n × p) | O(m × c) | O(n × p + m × c) |
Where:
- n = total rows in table
- m = matching rows after filtering
- c = number of columns in projection
- p = complexity of predicate (number of fields involved)
The executor automatically selects the most efficient path based on query characteristics, with no manual tuning required.
Sources: llkv-table/src/planner/mod.rs:1009-1530 llkv-table/src/planner/mod.rs:905-999
graph TB
subgraph "External Input"
PLAN[llkv-plan SelectPlan]
EXPR[llkv-expr Expr]
end
subgraph "Table Layer"
TP[TablePlanner]
TE[TableExecutor]
PLANNED[PlannedScan]
TP --> PLANNED
PLANNED --> TE
end
subgraph "Storage Layer"
STORE[ColumnStore]
SCAN[ScanBuilder]
GATHER[MultiGatherContext]
end
subgraph "Expression Evaluation"
NORM[normalize_predicate]
COMPILER[ProgramCompiler]
KERNELS[NumericKernels]
end
subgraph "Output"
STREAM[RowStreamBuilder]
BATCH[RecordBatch]
end
PLAN --> TP
EXPR --> TP
TP --> NORM
NORM --> COMPILER
TE --> SCAN
TE --> GATHER
TE --> KERNELS
TE --> STREAM
SCAN --> STORE
GATHER --> STORE
STREAM --> BATCH
Integration Points
The TablePlanner and TableExecutor integrate with several other LLKV subsystems:
Sources: llkv-table/src/planner/mod.rs:1-76
This architecture enables LLKV to efficiently execute table scans with complex predicates and projections while maintaining clean separation between logical planning and physical execution. The multiple optimization paths ensure that simple queries execute quickly while complex queries remain correct.