Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

GitHub

This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

TablePlanner and TableExecutor

Relevant source files

This document describes the table-level query planning and execution system in LLKV. The TablePlanner translates scan operations into optimized execution plans, while the TableExecutor implements multiple execution strategies to materialize query results efficiently. For information about the broader query execution pipeline, see Query Execution. For details on expression compilation and evaluation, see Program Compilation and Scalar Evaluation and NumericKernels.

Purpose and Scope

The TablePlanner and TableExecutor form the core of LLKV's table-level query execution. They bridge the gap between logical query plans (from llkv-plan) and physical data access (through llkv-column-map). This document covers:

  • How scan operations are planned and optimized
  • The structure of compiled execution plans (PlannedScan)
  • Multiple execution strategies and their trade-offs
  • Predicate fusion optimization
  • Integration with MVCC row filtering
  • Streaming result materialization

Architecture Overview

Sources: llkv-table/src/planner/mod.rs:580-726

TablePlanner

The TablePlanner is responsible for analyzing scan requests and constructing optimized execution plans. It does not execute queries itself but prepares all necessary metadata for the TableExecutor.

Structure

The planner holds a reference to the Table being scanned and provides a single public method: scan_stream_with_exprs.

Sources: llkv-table/src/planner/mod.rs:580-593

Planning Flow

The planning process consists of several stages:

  1. Validation : Ensures at least one projection is specified
  2. Normalization : Applies De Morgan's laws and flattens logical operators via normalize_predicate
  3. Graph Construction : Builds a PlanGraph for visualization and introspection
  4. Program Compilation : Compiles filter expressions into EvalProgram and DomainProgram bytecode

Sources: llkv-table/src/planner/mod.rs:595-637

PlanGraph Construction

The build_plan_graph method creates a directed acyclic graph (DAG) representing the logical query plan:

Node TypePurposeMetadata
TableScanEntry point for data accesstable_id, projection_count
FilterPredicate applicationpredicates (formatted expressions)
ProjectColumn selection and computationprojections, fields with types
OutputResult materializationinclude_nulls flag

Sources: llkv-table/src/planner/mod.rs:639-725

PlannedScan Structure

The PlannedScan is an intermediate representation that bridges planning and execution. It contains all information needed to execute a scan without holding runtime state.

FieldTypePurpose
projectionsVec<ScanProjection>Column and computed projections to materialize
filter_exprArc<Expr<FieldId>>Normalized filter predicate
optionsScanStreamOptions<P>MVCC filters, ordering, null handling
plan_graphPlanGraphLogical plan for introspection
programsProgramSetCompiled bytecode for predicate evaluation

Sources: llkv-table/src/planner/mod.rs:500-509

TableExecutor

The TableExecutor implements multiple execution strategies, selecting the most efficient based on query characteristics.

Structure

The executor maintains a cache of row IDs to avoid redundant scans when multiple operations target the same table.

Sources: llkv-table/src/planner/mod.rs:572-578

Execution Strategy Selection

Sources: llkv-table/src/planner/mod.rs:1009-1367

Single Column Direct Scan Fast Path

The try_single_column_direct_scan optimization applies when:

  • Exactly one projection is requested
  • include_nulls is false
  • Filter is either trivial or a full-range predicate on the projected column
  • Column type is not Utf8 or LargeUtf8 (to avoid string complexity)
graph TD
    CHECK1{projections.len == 1?} -->|No| FALLBACK1[Fallback]
 
   CHECK1 -->|Yes| CHECK2{include_nulls == false?}
CHECK2 -->|No| FALLBACK2[Fallback]
 
   CHECK2 -->|Yes| PROJ_TYPE{Projection type?}
PROJ_TYPE -->|Column| CHECK_FILTER["is_full_range_filter?"]
PROJ_TYPE -->|Computed| CHECK_COMPUTED[Single field?]
    
 
   CHECK_FILTER -->|No| FALLBACK3[Fallback]
 
   CHECK_FILTER -->|Yes| CHECK_DTYPE{dtype?}
CHECK_DTYPE -->|Utf8/LargeUtf8| FALLBACK4[Fallback]
 
   CHECK_DTYPE -->|Other| DIRECT_SCAN[SingleColumnStreamVisitor]
    
 
   CHECK_COMPUTED -->|No| FALLBACK5[Fallback]
 
   CHECK_COMPUTED -->|Yes| COMPUTE_TYPE{Passthrough or Affine?}
COMPUTE_TYPE -->|Passthrough| DIRECT_SCAN2[SingleColumnStreamVisitor]
 
   COMPUTE_TYPE -->|Affine| AFFINE_SCAN[AffineSingleColumnVisitor]
 
   COMPUTE_TYPE -->|Other| COMPUTE_SCAN[ComputedSingleColumnVisitor]
    
 
   DIRECT_SCAN --> HANDLED[StreamOutcome::Handled]
 
   DIRECT_SCAN2 --> HANDLED
 
   AFFINE_SCAN --> HANDLED
 
   COMPUTE_SCAN --> HANDLED

This path streams data directly from storage using ScanBuilder without building intermediate row ID lists or using RowStreamBuilder.

Sources: llkv-table/src/planner/mod.rs:1369-1530

Full Table Scan Streaming Fast Path

The try_stream_full_table_scan optimization applies when:

  • Filter is trivial (no predicates)
  • No ordering is required (options.order.is_none())
graph TD
    CHECK_ORDER{order.is_some?} -->|Yes| FALLBACK[Fallback]
 
   CHECK_ORDER -->|No| STREAM_START[stream_table_row_ids]
    
 
   STREAM_START --> SHADOW{Shadow column exists?}
SHADOW -->|Yes| CHUNK[Emit row_id chunks]
 
   SHADOW -->|No| FALLBACK2[Multi-column scan fallback]
    
 
   CHUNK --> MVCC_FILTER{row_id_filter?}
MVCC_FILTER -->|Yes| APPLY[filter.filter]
 
   MVCC_FILTER -->|No| BUILD
 
   APPLY --> BUILD
    
 
   BUILD[RowStreamBuilder] --> GATHER[Gather columns]
 
   GATHER --> EMIT[Emit RecordBatch]
 
   EMIT --> MORE{More chunks?}
MORE -->|Yes| CHUNK
 
   MORE -->|No| CHECK_EMPTY
    
    CHECK_EMPTY{any_emitted?} -->|No| SYNTHETIC[emit_synthetic_null_batch]
 
   CHECK_EMPTY -->|Yes| HANDLED[StreamOutcome::Handled]
 
   SYNTHETIC --> HANDLED

This path uses stream_table_row_ids to enumerate row IDs in chunks directly from the row_id shadow column, avoiding full predicate evaluation.

This optimization is particularly effective for queries like SELECT col1, col2 FROM table with no WHERE clause.

Sources: llkv-table/src/planner/mod.rs:905-999

General Execution Path

When fast paths don't apply, the executor follows a multi-stage process:

Stage 1: Projection Metadata Construction

The executor builds several data structures:

StructurePurpose
projection_evalsVec<ProjectionEval> mapping projections to evaluation strategies
unique_lfidsVec<LogicalFieldId> of columns to fetch from storage
unique_indexFxHashMap<LogicalFieldId, usize> for column position lookup
numeric_fieldsFxHashSet<FieldId> of columns needing numeric coercion
passthrough_fieldsVec<Option<FieldId>> for identity computed projections

Sources: llkv-table/src/planner/mod.rs:1033-1117

Stage 2: Row ID Collection

For trivial filters, the executor scans the MVCC created_by column to enumerate all rows (including those with NULL values in user columns). For non-trivial filters, it evaluates the compiled ProgramSet.

Sources: llkv-table/src/planner/mod.rs:1269-1327

Stage 3: Streaming Execution

The RowStreamBuilder materializes results in batches of STREAM_BATCH_ROWS (default: 1024). For each batch:

  1. Gather : Fetch column data for row IDs via MultiGatherContext
  2. Evaluate : Compute any ProjectionEval::Computed expressions
  3. Materialize : Construct RecordBatch with final schema
  4. Emit : Call user-provided callback

Sources: llkv-table/src/planner/mod.rs:1337-1365

Program Compilation

The ProgramCompiler translates filter expressions into stack-based bytecode for efficient evaluation. It produces two programs:

  • EvalProgram : Evaluates predicates and returns matching row IDs
  • DomainProgram : Computes the domain (all potentially relevant row IDs) for NOT operations

ProgramSet Structure

Bytecode Operations

OpcodeStack EffectPurpose
PushPredicate(filter)[] → [rows]Evaluate single predicate
PushCompare{left, op, right}[] → [rows]Evaluate comparison expression
PushInList{expr, list, negated}[] → [rows]Evaluate IN list
PushIsNull{expr, negated}[] → [rows]Evaluate IS NULL
PushLiteral(bool)[] → [rows]Push all rows (true) or empty (false)
FusedAnd{field_id, filters}[] → [rows]Apply multiple predicates on same field
And{child_count}[r1, r2, ...] → [r]Intersect N row ID sets
Or{child_count}[r1, r2, ...] → [r]Union N row ID sets
Not{domain}[matched] → [domain - matched]Set difference using domain program

Sources: llkv-table/src/planner/program.rs:1-200 (referenced but not in provided files)

Execution Example

Consider the filter: WHERE (age > 18 AND age < 65) OR name = 'Alice'

After normalization and compilation:

Stack Operations:
1. PushCompare(age > 18)        → [rows1]
2. PushCompare(age < 65)        → [rows1, rows2]
3. And{2}                       → [rows1 ∩ rows2]
4. PushPredicate(name = 'Alice') → [rows1 ∩ rows2, rows3]
5. Or{2}                        → [(rows1 ∩ rows2) ∪ rows3]

The collect_row_ids_for_program method executes this bytecode by maintaining a stack of row ID vectors and applying set operations.

Sources: llkv-table/src/planner/mod.rs:2376-2502

graph TD
 
   ANALYZE[Analyze filter expression] --> BUILD["Build per_field stats"]
BUILD --> STATS["FieldPredicateStats:\ntotal, contains"]
STATS --> CHECK{should_fuse?}
CHECK --> DTYPE{Data type?}
DTYPE -->|Utf8/LargeUtf8| STRING_RULE["contains ≥ 1 AND total ≥ 2"]
DTYPE -->|Other| NUMERIC_RULE["total ≥ 2"]
STRING_RULE -->|Yes| FUSE[Generate FusedAnd opcode]
 
   NUMERIC_RULE -->|Yes| FUSE
    
 
   STRING_RULE -->|No| SEPARATE[Evaluate predicates separately]
 
   NUMERIC_RULE -->|No| SEPARATE

Predicate Fusion

The PredicateFusionCache analyzes filter expressions to identify opportunities for fused predicate evaluation.

Fusion Strategy

Predicate fusion is particularly beneficial for string columns with CONTAINS operations, where multiple predicates on the same field can be evaluated in a single storage scan.

Example: WHERE name LIKE '%Smith%' AND name LIKE '%John%' can be fused into a single scan with two pattern matchers.

Sources: llkv-table/src/planner/mod.rs:517-570 llkv-table/src/planner/mod.rs:2504-2580

Projection Evaluation

The ProjectionEval enum handles two types of projections:

Column Projections

Direct column references that can be gathered from storage without computation.

Computed Projections

Expressions evaluated per-row using NumericKernels. The executor optimizes several patterns:

PatternOptimization
columnPassthrough (no computation)
a * column + bAffine transformation (vectorized)
General expressionFull expression evaluation

Sources: llkv-table/src/planner/mod.rs:482-498 llkv-table/src/planner/mod.rs:1110-1117

Row ID Collection Strategies

The executor uses different strategies for collecting matching row IDs based on predicate characteristics:

Simple Predicates

For predicates on a single field (e.g., age > 18):

Sources: llkv-table/src/planner/mod.rs:1532-1612

Comparison Expressions

For comparisons involving multiple fields (e.g., col1 + col2 > 10):

This approach minimizes wasted computation by first identifying a "domain" of potentially matching rows (intersection of rows where all referenced columns have values) before evaluating the full expression.

Sources: llkv-table/src/planner/mod.rs:1699-1775 llkv-table/src/planner/mod.rs:2214-2374

IN List Expressions

For IN list predicates (e.g., status IN ('active', 'pending')):

The IN list evaluator properly handles SQL's three-valued logic:

  • If value matches any list element: TRUE (or FALSE if negated)
  • If no match but list contains NULL: NULL (indeterminate)
  • If no match and no NULLs: FALSE (or TRUE if negated)

Sources: llkv-table/src/planner/mod.rs:2001-2044 llkv-table/src/planner/mod.rs:1841-1999

graph TD
 
   COLLECT[Collect row IDs from predicates] --> FILTER{row_id_filter.is_some?}
FILTER -->|No| CONTINUE[Continue to streaming]
 
   FILTER -->|Yes| APPLY["filter.filter(table, row_ids)"]
APPLY --> CHECK_VISIBILITY[Check MVCC columns]
 
   CHECK_VISIBILITY --> CREATED["created_by ≤ txn_id"]
CHECK_VISIBILITY --> DELETED["deleted_by > txn_id OR NULL"]
CREATED --> VISIBLE{Visible?}
DELETED --> VISIBLE
 
   VISIBLE -->|Yes| KEEP[Keep row ID]
 
   VISIBLE -->|No| DROP[Drop row ID]
    
 
   KEEP --> FILTERED[Filtered row IDs]
 
   DROP --> FILTERED
 
   FILTERED --> CONTINUE

MVCC Integration

The executor integrates with LLKV's MVCC system through the row_id_filter option in ScanStreamOptions. After collecting row IDs through predicate evaluation, the filter determines which rows are visible to the current transaction:

For trivial filters, the executor explicitly scans the created_by MVCC column to enumerate all rows, ensuring that rows with NULL values in user columns are included when appropriate.

Sources: llkv-table/src/planner/mod.rs:1269-1323

Performance Characteristics

The table below summarizes the time complexity of different execution paths:

Execution PathConditionsRow ID CollectionData GatheringTotal
Single Column Direct1 projection, trivial/full-range filterO(1)O(n) streamingO(n)
Full Table StreamTrivial filter, no orderO(n) via shadow columnO(n) streamingO(n)
General (indexed predicate)Single-field predicate with indexO(log n + m)O(m × c)O(log n + m × c)
General (complex predicate)Multi-field or computed predicateO(n × p)O(m × c)O(n × p + m × c)

Where:

  • n = total rows in table
  • m = matching rows after filtering
  • c = number of columns in projection
  • p = complexity of predicate (number of fields involved)

The executor automatically selects the most efficient path based on query characteristics, with no manual tuning required.

Sources: llkv-table/src/planner/mod.rs:1009-1530 llkv-table/src/planner/mod.rs:905-999

graph TB
    subgraph "External Input"
        PLAN[llkv-plan SelectPlan]
        EXPR[llkv-expr Expr]
    end
    
    subgraph "Table Layer"
        TP[TablePlanner]
        TE[TableExecutor]
        PLANNED[PlannedScan]
 
       TP --> PLANNED
 
       PLANNED --> TE
    end
    
    subgraph "Storage Layer"
        STORE[ColumnStore]
        SCAN[ScanBuilder]
        GATHER[MultiGatherContext]
    end
    
    subgraph "Expression Evaluation"
        NORM[normalize_predicate]
        COMPILER[ProgramCompiler]
        KERNELS[NumericKernels]
    end
    
    subgraph "Output"
        STREAM[RowStreamBuilder]
        BATCH[RecordBatch]
    end
    
 
   PLAN --> TP
 
   EXPR --> TP
    
 
   TP --> NORM
 
   NORM --> COMPILER
    
 
   TE --> SCAN
 
   TE --> GATHER
 
   TE --> KERNELS
 
   TE --> STREAM
    
 
   SCAN --> STORE
 
   GATHER --> STORE
 
   STREAM --> BATCH

Integration Points

The TablePlanner and TableExecutor integrate with several other LLKV subsystems:

Sources: llkv-table/src/planner/mod.rs:1-76


This architecture enables LLKV to efficiently execute table scans with complex predicates and projections while maintaining clean separation between logical planning and physical execution. The multiple optimization paths ensure that simple queries execute quickly while complex queries remain correct.