This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Query Planning

Purpose and Scope

Query planning is the layer that translates parsed SQL statements into strongly-typed plan structures that can be executed by the runtime engine. The llkv-plan crate defines these plan types and provides utilities for representing queries, expressions, and subquery relationships in a form that execution layers can consume without re-parsing SQL.

This page covers the plan structures themselves and how they are constructed from SQL input. For information about how expressions within plans are evaluated, see Expression System. For details on subquery correlation tracking and placeholder generation, see Subquery and Correlation Handling. For execution of these plans, see Query Execution.

Plan Structures Overview

The planning layer defines distinct plan types for each category of SQL statement. All plan types are defined in llkv-plan/src/plans.rs and flow through the PlanStatement enum for execution dispatch.

Core Plan Types

| Plan Type | Purpose | Key Fields |
| --- | --- | --- |
| SelectPlan | Query execution | tables, projections, filter, joins, aggregates, order_by |
| InsertPlan | Row insertion | table, columns, source, on_conflict |
| UpdatePlan | Row updates | table, assignments, filter |
| DeletePlan | Row deletion | table, filter |
| CreateTablePlan | Table creation | name, columns, source, foreign_keys |
| CreateIndexPlan | Index creation | table, columns, unique |
| CreateViewPlan | View creation | name, view_definition, select_plan |

Sources: llkv-plan/src/plans.rs:177-256 llkv-plan/src/plans.rs:640-655 llkv-plan/src/plans.rs:662-667 llkv-plan/src/plans.rs:687-692

SelectPlan Structure

Diagram: SelectPlan Component Structure

The SelectPlan struct at llkv-plan/src/plans.rs:800-825 contains all information needed to execute a SELECT query. It separates table references, join specifications, projections, filters, aggregations, and ordering to allow execution layers to optimize each phase independently.

Sources: llkv-plan/src/plans.rs:27-67 llkv-plan/src/plans.rs:794-825
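
To make the separation of phases concrete, here is a minimal sketch of a SELECT plan that keeps each concern in its own field. It is not the real SelectPlan definition: the field types are plain strings standing in for the typed structures in llkv-plan, and the builder methods and the phases() helper (including the phase order it reports) are hypothetical names introduced only for illustration.

```rust
// Illustrative stand-in only: the real SelectPlan in llkv-plan carries typed
// structures (table refs, join metadata, expressions) rather than strings.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct SelectPlan {
    pub tables: Vec<String>,
    pub joins: Vec<String>,
    pub projections: Vec<String>,
    pub filter: Option<String>,
    pub aggregates: Vec<String>,
    pub order_by: Vec<String>,
}

impl SelectPlan {
    /// Hypothetical constructor: a plan that scans a single table.
    pub fn new(table: &str) -> Self {
        SelectPlan {
            tables: vec![table.to_string()],
            ..Default::default()
        }
    }

    /// Hypothetical builder step: set the projected columns.
    pub fn with_projections(mut self, columns: &[&str]) -> Self {
        self.projections = columns.iter().map(|c| c.to_string()).collect();
        self
    }

    /// Hypothetical builder step: attach a WHERE predicate.
    pub fn with_filter(mut self, predicate: &str) -> Self {
        self.filter = Some(predicate.to_string());
        self
    }

    /// Lists the phases an executor would need for this plan. Because each
    /// concern lives in its own field, empty phases can be skipped outright.
    /// The ordering here is an assumption, not the executor's actual order.
    pub fn phases(&self) -> Vec<&'static str> {
        let mut phases = vec!["scan"];
        if !self.joins.is_empty() {
            phases.push("join");
        }
        if self.filter.is_some() {
            phases.push("filter");
        }
        if !self.aggregates.is_empty() {
            phases.push("aggregate");
        }
        phases.push("project");
        if !self.order_by.is_empty() {
            phases.push("sort");
        }
        phases
    }
}
```

A plan built as SelectPlan::new("users").with_projections(&["name"]).with_filter("age > 18") reports only the scan, filter, and project phases, since it has no joins, aggregates, or ordering.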

SQL-to-Plan Translation Pipeline

The translation from SQL text to plan structures occurs in SqlEngine within the llkv-sql crate. The process involves multiple stages to handle dialect differences and build strongly-typed plans.

Diagram: SQL-to-Plan Translation Flow

sequenceDiagram
    participant User
    participant SqlEngine as "SqlEngine"
    participant Preprocessor as "SQL Preprocessing"
    participant Parser as "sqlparser::Parser"
    participant Translator as "Plan Translator"
    participant Runtime as "RuntimeEngine"
    
    User->>SqlEngine: execute(sql_text)
    
    SqlEngine->>Preprocessor: preprocess_sql_input()
    Note over Preprocessor: Strip CONNECT TO\nNormalize CREATE TYPE\nFix EXCLUDE syntax\nExpand IN clauses
    Preprocessor-->>SqlEngine: processed_sql
    
    SqlEngine->>Parser: Parser::parse_sql()
    Parser-->>SqlEngine: Vec<Statement> (AST)
    
    loop "For each Statement"
        SqlEngine->>Translator: translate_statement()
        
        alt "INSERT statement"
            Translator->>Translator: translate_insert()
            Note over Translator: Parse VALUES/SELECT\nNormalize conflict action\nBuild PreparedInsert
            Translator-->>SqlEngine: PreparedInsert
            SqlEngine->>SqlEngine: buffer_insert()\nor flush immediately
        else "SELECT statement"
            Translator->>Translator: translate_select()
            Note over Translator: Build SelectPlan\nTranslate expressions\nTrack subqueries
            Translator-->>SqlEngine: SelectPlan
        else "UPDATE/DELETE"
            Translator->>Translator: translate_update()/delete()
            Translator-->>SqlEngine: UpdatePlan/DeletePlan
        else "DDL statement"
            Translator->>Translator: translate_create_table()\ncreate_index(), etc.
            Translator-->>SqlEngine: CreateTablePlan/etc.
        end
        
        SqlEngine->>Runtime: execute_statement(plan)
        Runtime-->>SqlEngine: RuntimeStatementResult
    end
    
    SqlEngine-->>User: Vec<RuntimeStatementResult>

Sources: llkv-sql/src/sql_engine.rs:933-958 llkv-sql/src/sql_engine.rs:628-757

Statement Translation Functions

The SqlEngine contains dedicated translation methods for each statement type:

| sqlparser AST | Translation Method | Output Plan | Location |
| --- | --- | --- | --- |
| Statement::Query | translate_select() | SelectPlan | llkv-sql/src/sql_engine.rs:2162-2578 |
| Statement::Insert | translate_insert() | InsertPlan | llkv-sql/src/sql_engine.rs:3194-3423 |
| Statement::Update | translate_update() | UpdatePlan | llkv-sql/src/sql_engine.rs:3560-3704 |
| Statement::Delete | translate_delete() | DeletePlan | llkv-sql/src/sql_engine.rs:3706-3783 |
| Statement::CreateTable | translate_create_table() | CreateTablePlan | llkv-sql/src/sql_engine.rs:4081-4465 |
| Statement::CreateIndex | translate_create_index() | CreateIndexPlan | llkv-sql/src/sql_engine.rs:4575-4766 |

Sources: llkv-sql/src/sql_engine.rs:974-1067

SELECT Translation Details

The translate_select() method at llkv-sql/src/sql_engine.rs:2162 performs the following operations:

  1. Extract table references from FROM clause into Vec<TableRef>
  2. Parse join specifications into Vec<JoinMetadata> structures
  3. Translate WHERE clause to Expr<String> and discover correlated subqueries
  4. Process projections into Vec<SelectProjection> with computed expressions
  5. Handle aggregates by extracting AggregateExpr from projections and HAVING
  6. Translate GROUP BY clause to canonical column names
  7. Process ORDER BY into Vec<OrderByPlan> with sort specifications
  8. Handle compound queries (UNION/INTERSECT/EXCEPT) via CompoundSelectPlan

Sources: llkv-sql/src/sql_engine.rs:2162-2578

Expression Representation in Plans

Plans use two forms of expressions from the llkv-expr crate:

  • Expr<String>: Boolean predicates using unresolved column names (as strings)
  • ScalarExpr<String>: Scalar expressions (also with string column references)

These string-based expressions are later resolved to Expr<FieldId> and ScalarExpr<FieldId> during execution when the catalog provides field mappings. This two-stage approach separates planning from schema resolution.

Diagram: Expression Evolution Through Planning and Execution

graph LR
    SQL["SQL: WHERE age &gt; 18"]
    AST["sqlparser AST\nBinaryExpr"]
    ExprString["Expr&lt;String&gt;\nCompare(Column('age'), Gt, Literal(18))"]
    ExprFieldId["Expr&lt;FieldId&gt;\nCompare(Column(field_7), Gt, Literal(18))"]
    Bytecode["EvalProgram\nStack-based bytecode"]

    SQL --> AST
    AST --> ExprString
    ExprString --> ExprFieldId
    ExprFieldId --> Bytecode

    ExprString -.stored in plan.-> SelectPlan
    ExprFieldId -.resolved at execution.-> Executor
    Bytecode -.compiled for evaluation.-> Table

The translation from SQL expressions to Expr<String> happens in llkv-sql/src/sql_engine.rs:1647-1947. The resolution to Expr<FieldId> occurs in the executor's translate_predicate() function at llkv-executor/src/translation/predicate.rs.

Sources: llkv-expr/src/expr.rs llkv-sql/src/sql_engine.rs:1647-1947 llkv-plan/src/plans.rs:28-34
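
The two-stage approach can be sketched with an expression type that is generic over its column reference. The Expr enum below is a deliberately tiny stand-in, not the llkv-expr definition; the resolve() function, the CompareOp variants, and the use of u32 for FieldId are assumptions made for illustration.

```rust
use std::collections::HashMap;

// Assumption: field identifiers are small integers. The real FieldId type
// is defined elsewhere in the llkv crates.
pub type FieldId = u32;

#[derive(Debug, Clone, PartialEq)]
pub enum CompareOp {
    Gt,
    Lt,
    Eq,
}

// Toy predicate type, generic over how columns are referenced:
// Expr<String> at planning time, Expr<FieldId> at execution time.
#[derive(Debug, Clone, PartialEq)]
pub enum Expr<F> {
    Compare { column: F, op: CompareOp, literal: i64 },
    And(Vec<Expr<F>>),
}

/// Resolves column names to field ids using a catalog mapping.
/// Returns an error for any column the catalog does not know.
pub fn resolve(
    expr: &Expr<String>,
    catalog: &HashMap<String, FieldId>,
) -> Result<Expr<FieldId>, String> {
    match expr {
        Expr::Compare { column, op, literal } => {
            let id = catalog
                .get(column)
                .copied()
                .ok_or_else(|| format!("unknown column: {column}"))?;
            Ok(Expr::Compare {
                column: id,
                op: op.clone(),
                literal: *literal,
            })
        }
        Expr::And(children) => {
            // Resolve every child; the first failure aborts the whole walk.
            let resolved = children
                .iter()
                .map(|child| resolve(child, catalog))
                .collect::<Result<Vec<_>, _>>()?;
            Ok(Expr::And(resolved))
        }
    }
}
```

Because the plan stores only names, the same planned predicate can be resolved against whatever catalog state exists when execution begins, and an unknown column surfaces as a resolution error rather than a planning error.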

Join Planning

Join specifications are represented in two components:

JoinMetadata Structure

The JoinMetadata struct at llkv-plan/src/plans.rs:781-792 captures a single join between consecutive tables:

  • left_table_index : Index into SelectPlan.tables vector for the left table
  • join_type : One of Inner, Left, Right, or Full
  • on_condition : Optional ON clause filter expression

JoinPlan Types

The JoinPlan enum at llkv-plan/src/plans.rs:763-773 defines supported join semantics:

Diagram: JoinPlan Variants

The executor converts JoinPlan to llkv_join::JoinType during execution. When SelectPlan.joins is empty but multiple tables exist, the executor performs a Cartesian product (cross join).

Sources: llkv-plan/src/plans.rs:758-792 llkv-executor/src/lib.rs:542-554
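
The cross-join fallback described above can be illustrated with a small sketch. The types are simplified stand-ins (the ON condition is a plain string here), and join_steps() and JoinStep are hypothetical names that do not appear in the executor. The sketch assumes each join pairs the table at left_table_index with the next table in the list, following the "consecutive tables" description.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JoinPlan {
    Inner,
    Left,
    Right,
    Full,
}

// Simplified stand-in for the plan's join metadata.
#[derive(Debug, Clone, PartialEq)]
pub struct JoinMetadata {
    pub left_table_index: usize,
    pub join_type: JoinPlan,
    pub on_condition: Option<String>, // stand-in for an Expr<String>
}

// Hypothetical description of what an executor would do for each table pair.
#[derive(Debug, PartialEq)]
pub enum JoinStep {
    Cross { left: usize, right: usize },
    Join { left: usize, right: usize, kind: JoinPlan },
}

/// Derives join steps from the plan. With no join metadata and several
/// tables, every consecutive pair becomes a Cartesian product.
pub fn join_steps(table_count: usize, joins: &[JoinMetadata]) -> Vec<JoinStep> {
    if joins.is_empty() {
        return (1..table_count)
            .map(|right| JoinStep::Cross { left: right - 1, right })
            .collect();
    }
    joins
        .iter()
        .map(|j| JoinStep::Join {
            left: j.left_table_index,
            right: j.left_table_index + 1,
            kind: j.join_type,
        })
        .collect()
}
```

A query over three tables with an empty joins list yields two Cross steps, while a single-table query yields none.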

Aggregation Planning

Aggregates are represented through the AggregateExpr structure defined at llkv-plan/src/plans.rs:1025-1102.

Aggregate Function Types

Diagram: AggregateFunction Variants

GROUP BY Handling

When a SELECT contains a GROUP BY clause:

  1. Column names from GROUP BY are stored in SelectPlan.group_by as canonical strings
  2. Aggregate expressions are collected in SelectPlan.aggregates
  3. Non-aggregate projections must reference GROUP BY columns
  4. HAVING clause (if present) is stored in SelectPlan.having as Expr<String>

The executor groups rows based on group_by columns, evaluates aggregates within each group, and applies the HAVING filter to group results.

Sources: llkv-plan/src/plans.rs:1025-1102 llkv-executor/src/lib.rs:1185-1597
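
The grouping flow can be sketched on plain tuples. This is a toy model rather than the executor's implementation: rows are (group key, value) pairs, the only aggregate is SUM, and the HAVING filter is a simple minimum on the total. Both function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Checks rule 3 above: every non-aggregate projection must name a
/// GROUP BY column.
pub fn validate_projections(projections: &[&str], group_by: &[&str]) -> Result<(), String> {
    for column in projections {
        if !group_by.contains(column) {
            return Err(format!("column {column} must appear in GROUP BY"));
        }
    }
    Ok(())
}

/// Groups rows by key, sums the values in each group, then keeps only the
/// groups whose total passes the HAVING-style threshold.
pub fn group_sum_having(rows: &[(&str, i64)], min_total: i64) -> Vec<(String, i64)> {
    // BTreeMap keeps the output ordered by group key, which makes the
    // result deterministic for the example.
    let mut groups: BTreeMap<String, i64> = BTreeMap::new();
    for (key, value) in rows {
        *groups.entry(key.to_string()).or_insert(0) += value;
    }
    groups
        .into_iter()
        .filter(|(_, total)| *total >= min_total)
        .collect()
}
```

For rows ("eng", 100), ("ops", 40), ("eng", 50) and a threshold of 100, only the "eng" group survives, with a total of 150.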

Subquery Representation

Subqueries appear in two contexts within plans:

Filter Subqueries

FilterSubquery at llkv-plan/src/plans.rs:36-45 represents correlated subqueries used in WHERE/HAVING predicates via Expr::Exists:

  • id : Unique identifier matching Expr::Exists(SubqueryId)
  • plan : Nested SelectPlan for the subquery
  • correlated_columns : Mappings from placeholder names to outer query columns

Scalar Subqueries

ScalarSubquery at llkv-plan/src/plans.rs:48-56 represents subqueries that produce single values in projections via ScalarExpr::ScalarSubquery.

Correlated Column Tracking

The CorrelatedColumn struct at llkv-plan/src/plans.rs:59-67 describes how outer columns are bound into inner subqueries.

During execution, the executor substitutes placeholder references with actual values from the outer query's current row.

Sources: llkv-plan/src/plans.rs:36-67 llkv-sql/src/sql_engine.rs:1980-2124
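
Placeholder substitution can be sketched as follows. Everything here is an assumption for illustration: the CorrelatedColumn field names, the "__corr_0" placeholder format, and the representation of the inner predicate as a token list are hypothetical and not taken from llkv-plan.

```rust
use std::collections::HashMap;

// Hypothetical shape: maps a placeholder used inside the subquery to the
// outer query column whose value it stands for.
pub struct CorrelatedColumn {
    pub placeholder: String,
    pub outer_column: String,
}

/// Replaces each placeholder token in the inner predicate with the value of
/// the matching column from the outer query's current row.
pub fn bind_outer_row(
    tokens: &[&str],
    correlated: &[CorrelatedColumn],
    outer_row: &HashMap<String, i64>,
) -> Result<Vec<String>, String> {
    tokens
        .iter()
        .map(|token| {
            match correlated.iter().find(|c| c.placeholder == *token) {
                // Placeholder: look up the outer value for this row.
                Some(c) => outer_row
                    .get(&c.outer_column)
                    .map(|value| value.to_string())
                    .ok_or_else(|| format!("outer row has no column {}", c.outer_column)),
                // Ordinary token: pass through unchanged.
                None => Ok(token.to_string()),
            }
        })
        .collect()
}
```

The subquery is bound once per outer row, so a correlated EXISTS over N outer rows implies N bindings of the same nested plan in this model.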

Plan Value Types

The PlanValue enum at llkv-plan/src/plans.rs:73-83 represents literal values within plans.

These values appear in:

  • INSERT literal rows (InsertPlan with InsertSource::Rows)
  • UPDATE assignments (AssignmentValue::Literal)
  • Computed constant expressions

The executor converts PlanValue instances to Arrow arrays via plan_values_to_arrow_array() at llkv-executor/src/lib.rs:302-410.

Sources: llkv-plan/src/plans.rs:73-161 llkv-executor/src/lib.rs:302-410
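
As a rough illustration of turning literal plan values into a typed column, the sketch below converts a slice of values into a nullable integer vector. The variant set shown for PlanValue is an assumption (the real enum may differ), Vec<Option<i64>> stands in for an Arrow Int64 array, and to_int64_column() is a hypothetical helper rather than the executor's plan_values_to_arrow_array().

```rust
// Assumed variant set, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum PlanValue {
    Null,
    Integer(i64),
    Float(f64),
    String(String),
}

/// Builds a nullable integer column from literal values. Any non-integer,
/// non-null value is rejected instead of being silently coerced.
pub fn to_int64_column(values: &[PlanValue]) -> Result<Vec<Option<i64>>, String> {
    values
        .iter()
        .map(|value| match value {
            PlanValue::Null => Ok(None),
            PlanValue::Integer(i) => Ok(Some(*i)),
            other => Err(format!("expected integer, found {other:?}")),
        })
        .collect()
}
```

Nulls map to None so the column keeps one slot per input row, which mirrors how a columnar array tracks validity separately from values.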

Plan Execution Interface

Plans flow to the runtime through the PlanStatement enum:

Diagram: Plan Execution Dispatch Flow

The RuntimeEngine::execute_statement() method dispatches each plan variant to the appropriate handler:

  • SELECT : Passed to QueryExecutor for streaming execution
  • INSERT/UPDATE/DELETE : Applied via Table with MVCC tracking
  • DDL : Processed by CatalogManager to modify schema metadata

Sources: llkv-runtime/src/statements.rs llkv-sql/src/sql_engine.rs:587-609 llkv-executor/src/lib.rs:523-569
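
The dispatch described above amounts to a match over the plan variant. The sketch below models only the routing decision: the PlanStatement variants carry a table or object name instead of a full plan, and the Handler enum and route() function are hypothetical names used to label the three destinations.

```rust
// Simplified stand-in: each variant would carry its plan struct in practice.
#[derive(Debug)]
pub enum PlanStatement {
    Select(String),
    Insert(String),
    Update(String),
    Delete(String),
    CreateTable(String),
    CreateIndex(String),
    CreateView(String),
}

// Hypothetical labels for the three destinations named in the text.
#[derive(Debug, PartialEq)]
pub enum Handler {
    QueryExecutor,
    Table,
    CatalogManager,
}

/// Picks the component responsible for a plan. The match is exhaustive, so
/// adding a new plan variant forces a routing decision at compile time.
pub fn route(plan: &PlanStatement) -> Handler {
    match plan {
        PlanStatement::Select(_) => Handler::QueryExecutor,
        PlanStatement::Insert(_) | PlanStatement::Update(_) | PlanStatement::Delete(_) => {
            Handler::Table
        }
        PlanStatement::CreateTable(_)
        | PlanStatement::CreateIndex(_)
        | PlanStatement::CreateView(_) => Handler::CatalogManager,
    }
}
```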

Compound Query Planning

Set operations (UNION, INTERSECT, EXCEPT) are represented through CompoundSelectPlan at llkv-plan/src/plans.rs:969-996:

  • CompoundOperator : Union, Intersect, or Except
  • CompoundQuantifier : Distinct (deduplicate) or All (keep duplicates)

The executor evaluates the initial plan, then applies each operation sequentially, combining results according to set semantics. Deduplication for DISTINCT quantifiers uses hash-based row encoding.

Sources: llkv-plan/src/plans.rs:946-996 llkv-executor/src/lib.rs:590-686
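
Sequential evaluation of set operations can be sketched over in-memory rows. This is a simplified model, not the executor's code: rows are Vec<i64>, deduplication hashes the row itself rather than an encoded byte form, and apply() and evaluate() are hypothetical names. The ALL variants of INTERSECT and EXCEPT follow standard SQL multiset semantics, which is an assumption about behavior that the text above does not spell out.

```rust
use std::collections::{HashMap, HashSet};

pub type Row = Vec<i64>;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CompoundOperator {
    Union,
    Intersect,
    Except,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CompoundQuantifier {
    Distinct,
    All,
}

/// Hash-based deduplication that keeps the first occurrence of each row.
fn dedup(rows: Vec<Row>) -> Vec<Row> {
    let mut seen = HashSet::new();
    rows.into_iter().filter(|row| seen.insert(row.clone())).collect()
}

/// Counts how many times each row occurs (used for the ALL variants).
fn counts(rows: Vec<Row>) -> HashMap<Row, usize> {
    let mut map = HashMap::new();
    for row in rows {
        *map.entry(row).or_insert(0) += 1;
    }
    map
}

/// Applies one set operation to an accumulated result and a right-hand input.
pub fn apply(left: Vec<Row>, op: CompoundOperator, q: CompoundQuantifier, right: Vec<Row>) -> Vec<Row> {
    use CompoundOperator::*;
    use CompoundQuantifier::*;
    match (op, q) {
        (Union, All) => left.into_iter().chain(right).collect(),
        (Union, Distinct) => dedup(left.into_iter().chain(right).collect()),
        (Intersect, Distinct) => {
            let set: HashSet<Row> = right.into_iter().collect();
            dedup(left.into_iter().filter(|r| set.contains(r)).collect())
        }
        (Except, Distinct) => {
            let set: HashSet<Row> = right.into_iter().collect();
            dedup(left.into_iter().filter(|r| !set.contains(r)).collect())
        }
        (Intersect, All) | (Except, All) => {
            // Each right-hand occurrence matches at most one left-hand row.
            let mut remaining = counts(right);
            let keep_matched = op == Intersect;
            left.into_iter()
                .filter(|row| {
                    let n = remaining.entry(row.clone()).or_insert(0);
                    let matched = *n > 0;
                    if matched {
                        *n -= 1;
                    }
                    matched == keep_matched
                })
                .collect()
        }
    }
}

/// Evaluates the initial result, then folds each operation in order.
pub fn evaluate(initial: Vec<Row>, ops: Vec<(CompoundOperator, CompoundQuantifier, Vec<Row>)>) -> Vec<Row> {
    ops.into_iter()
        .fold(initial, |acc, (op, q, right)| apply(acc, op, q, right))
}
```

Folding left to right matches the "initial plan, then each operation sequentially" description; it does not model operator precedence between INTERSECT and UNION, which a full implementation would need to settle before building the operation list.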