
This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Filter Evaluation

Purpose and Scope

This page documents how filter expressions are evaluated against table rows to produce sets of matching row IDs. Filter evaluation is a critical component of query execution that bridges the query planning layer (see Query Planning) and the physical data retrieval operations (see Scan Execution and Optimization).

The system evaluates filters in two primary contexts:

  • Row ID collection : Applying predicates to columns to determine which rows satisfy conditions
  • MVCC filtering : Applying transaction visibility rules to determine which row versions are visible

For information about the expression AST and predicate structures used during filtering, see Expression System.


Filter Evaluation Pipeline

Filter evaluation flows from the table abstraction down through the column store to type-specific visitor implementations:

```mermaid
flowchart TB
    TableFilter["Table::filter_row_ids()"]
    CollectRowIds["collect_row_ids_for_table()"]
    CompileExpr["Expression Compilation"]
    EvalPred["Predicate Evaluation"]
    StoreFilter["ColumnStore::filter_row_ids()"]
    Dispatch["FilterDispatch::run_filter()"]
    PrimitiveFilter["FilterPrimitive::run_filter()"]
    Visitor["FilterVisitor<T, F>"]
    DescLoad["Load ColumnDescriptor"]
    ChunkLoop["For each ChunkMetadata"]
    MetaPrune["Check min/max pruning"]
    LoadChunk["Load chunk arrays"]
    EvalChunk["Evaluate predicate"]
    CollectMatches["Collect matching row IDs"]
    ReturnBitmap["Return Treemap bitmap"]

    TableFilter --> CollectRowIds
    CollectRowIds --> CompileExpr
    CompileExpr --> EvalPred
    EvalPred --> StoreFilter
    StoreFilter --> Dispatch
    Dispatch --> PrimitiveFilter
    PrimitiveFilter --> Visitor
    Visitor --> DescLoad
    DescLoad --> ChunkLoop
    ChunkLoop --> MetaPrune
    MetaPrune -->|Skip chunk| ChunkLoop
    MetaPrune -->|May match| LoadChunk
    LoadChunk --> EvalChunk
    EvalChunk --> CollectMatches
    CollectMatches --> ChunkLoop
    ChunkLoop -->|Done| ReturnBitmap
```

Sources: llkv-table/src/table.rs:490-496 llkv-column-map/src/store/core.rs:356-372 llkv-column-map/src/store/scan/filter.rs:209-298

Table-Level Filter Entry Points

The Table struct provides the primary interface for filter evaluation at the table abstraction layer:

```mermaid
graph TB
    subgraph "Table Layer"
        FilterRowIds["filter_row_ids(&Expr<FieldId>)"]
        CollectRowIds["collect_row_ids_for_table()"]
        CompileFilter["FilterCompiler::compile()"]
    end

    subgraph "Compilation"
        TranslateExpr["ExprTranslator::translate()"]
        BuildPredicate["build_*_predicate()"]
        Predicate["Predicate<T::Value>"]
    end

    subgraph "ColumnStore Layer"
        StoreFilterRowIds["filter_row_ids<T>()"]
        FilterMatchesResult["filter_matches<T, F>()"]
    end

    FilterRowIds --> CollectRowIds
    CollectRowIds --> CompileFilter
    CompileFilter --> TranslateExpr
    TranslateExpr --> BuildPredicate
    BuildPredicate --> Predicate
    Predicate --> StoreFilterRowIds
    Predicate --> FilterMatchesResult
    StoreFilterRowIds -->|Vec<u64>| FilterRowIds
    FilterMatchesResult -->|FilterResult| CollectRowIds
```

The filter_row_ids() method at llkv-table/src/table.rs:490-496 converts an expression tree into a bitmap of matching row IDs. It delegates to collect_row_ids_for_table() which compiles the expression and evaluates predicates against the column store.

Key responsibilities:

  • Expression translation : Convert Expr<FieldId> to Expr<LogicalFieldId>
  • Predicate compilation : Transform operators into typed Predicate structures
  • MVCC integration : Filter results by transaction visibility
  • Bitmap aggregation : Combine multiple predicate results using set operations
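
The responsibilities above can be sketched in miniature; the helper names and types here (`eval_predicate`, `RowIdSet`, columns as `(row_id, value)` pairs) are illustrative stand-ins, not the actual llkv API:

```rust
use std::collections::BTreeSet;

// Illustrative stand-in for the Treemap bitmap of matching row IDs.
type RowIdSet = BTreeSet<u64>;

// Evaluate one compiled predicate against a column's (row_id, value)
// pairs, returning the set of matching row IDs (hypothetical helper).
fn eval_predicate<F: Fn(i64) -> bool>(col: &[(u64, i64)], pred: F) -> RowIdSet {
    col.iter()
        .filter(|(_, v)| pred(*v))
        .map(|(row_id, _)| *row_id)
        .collect()
}

// Combine per-predicate results with set intersection, mirroring the
// bitmap aggregation step described above.
fn intersect(a: &RowIdSet, b: &RowIdSet) -> RowIdSet {
    a.intersection(b).cloned().collect()
}

fn main() {
    let col_a = [(1u64, 10i64), (2, 20), (3, 30)];
    let col_b = [(1u64, 5i64), (2, 50), (3, 5)];
    let hits_a = eval_predicate(&col_a, |v| v >= 20); // rows {2, 3}
    let hits_b = eval_predicate(&col_b, |v| v < 10);  // rows {1, 3}
    println!("{:?}", intersect(&hits_a, &hits_b));    // rows {3}
}
```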

Sources: llkv-table/src/table.rs:490-496 llkv-table/src/table.rs:851-1100


Filter Dispatch System

The filter dispatch system provides type-specific filter implementations through the FilterDispatch trait hierarchy:

Implementation Strategy:

```mermaid
classDiagram
    class FilterDispatch {
        <<trait>>
        +type Value
        +run_filter() Vec~u64~
        +run_fused() Vec~u64~
    }

    class FilterPrimitive {
        <<trait>>
        +type Native
        +run_filter() Vec~u64~
        +run_all() Vec~u64~
        +run_filter_with_result() FilterResult
    }

    class Utf8Filter~O~ {
        +run_filter() Vec~u64~
        +run_fused() Vec~u64~
    }

    class UInt64Type
    class Int64Type
    class Float64Type
    class Date32Type
    class StringTypes

    FilterDispatch <|-- FilterPrimitive : implements
    FilterDispatch <|-- Utf8Filter : implements
    FilterPrimitive <|-- UInt64Type : specializes
    FilterPrimitive <|-- Int64Type : specializes
    FilterPrimitive <|-- Float64Type : specializes
    FilterPrimitive <|-- Date32Type : specializes
    Utf8Filter --> StringTypes : handles
```

The FilterDispatch trait at llkv-column-map/src/store/scan/filter.rs:209-273 defines the interface for type-specific filtering:

Primitive types implement FilterPrimitive, which provides the default FilterDispatch implementation at llkv-column-map/src/store/scan/filter.rs:275-298. This handles numeric types, booleans, and dates using the visitor pattern.

String types use the specialized Utf8Filter implementation at llkv-column-map/src/store/scan/filter.rs:307-504, which supports vectorized operations such as contains and fused multi-predicate evaluation.
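
The dispatch shape can be sketched as follows; this is a simplified illustration, not the real trait definitions, which carry additional methods (run_all, run_fused, and so on) and generic bounds:

```rust
// Simplified sketch of the dispatch hierarchy described above.
trait FilterDispatch {
    type Value;
    // Evaluate a predicate over (row_id, value) pairs, returning
    // the row IDs whose values satisfy it.
    fn run_filter(
        values: &[(u64, Self::Value)],
        pred: &dyn Fn(&Self::Value) -> bool,
    ) -> Vec<u64>;
}

// Stand-in for the default implementation path that primitive
// (numeric/date) types receive via FilterPrimitive.
struct PrimitiveDispatch;

impl FilterDispatch for PrimitiveDispatch {
    type Value = i64;
    fn run_filter(values: &[(u64, i64)], pred: &dyn Fn(&i64) -> bool) -> Vec<u64> {
        values.iter().filter(|(_, v)| pred(v)).map(|(r, _)| *r).collect()
    }
}

fn main() {
    let chunk = [(10u64, 1i64), (11, 7), (12, 3)];
    let hits = PrimitiveDispatch::run_filter(&chunk, &|v: &i64| *v > 2);
    println!("{:?}", hits); // row IDs whose value exceeds 2
}
```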

Sources: llkv-column-map/src/store/scan/filter.rs:209-298 llkv-column-map/src/store/scan/filter.rs:307-504


Visitor Pattern for Chunk Traversal

Filter evaluation uses the visitor pattern to traverse chunks efficiently:

The FilterVisitor<T, F> struct at llkv-column-map/src/store/scan/filter.rs:506-591 implements all visitor traits to handle different chunk formats:

  • Unsorted chunks : Processes each value individually
  • Sorted chunks : Can exploit ordering for early termination
  • With row IDs : Matches values to explicit row identifiers
  • Sorted with row IDs : Combines both optimizations

The visitor maintains internal state to build a FilterResult:

| Field | Type | Purpose |
|---|---|---|
| `predicate` | `F: FnMut(T::Native) -> bool` | Predicate closure to evaluate |
| `runs` | `Vec<FilterRun>` | Run-length encoded matches |
| `fallback_row_ids` | `Option<Vec<u64>>` | Sparse row ID list |
| `prev_row_id` | `Option<u64>` | Last seen row ID for run detection |
| `total_matches` | `usize` | Count of matching rows |

Sources: llkv-column-map/src/store/scan/filter.rs:506-648 llkv-column-map/src/store/scan/filter.rs:692-771


Chunk Metadata Pruning

The filter evaluation pipeline exploits chunk metadata to skip irrelevant data:

```mermaid
flowchart LR
    LoadDesc["Load ColumnDescriptor"]
    IterChunks["Iterate ChunkMetadata"]
    CheckMin["Check min_val_u64"]
    CheckMax["Check max_val_u64"]
    PruneDecision{{"Can prune?"}}
    SkipChunk["Skip chunk"]
    LoadAndEval["Load & evaluate chunk"]

    LoadDesc --> IterChunks
    IterChunks --> CheckMin
    CheckMin --> CheckMax
    CheckMax --> PruneDecision
    PruneDecision -->|Yes: predicate range doesn't overlap| SkipChunk
    PruneDecision -->|No: may contain matching values| LoadAndEval
    SkipChunk --> IterChunks
    LoadAndEval --> IterChunks
```

The ChunkMetadata structure stores summary statistics for each chunk:

| Field | Type | Purpose |
|---|---|---|
| `min_val_u64` | `u64` | Minimum value in chunk (for numerics) |
| `max_val_u64` | `u64` | Maximum value in chunk (for numerics) |
| `row_count` | `u32` | Number of rows in chunk |
| `chunk_pk` | `PhysicalKey` | Key for chunk data |
| `value_order_perm_pk` | `PhysicalKey` | Key for sort permutation |

Pruning logic at llkv-column-map/src/store/core.rs:679-690:
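
The referenced logic amounts to a range-overlap test over the chunk's min/max statistics. The following is an illustrative reconstruction under that assumption, not the exact source:

```rust
// Chunk-level summary statistics, mirroring the min_val_u64 /
// max_val_u64 fields of ChunkMetadata shown above.
struct ChunkStats {
    min_val_u64: u64,
    max_val_u64: u64,
}

// Returns true when the chunk may contain values in [lo, hi] and must
// be loaded; false means the chunk can be skipped entirely.
fn chunk_may_match(stats: &ChunkStats, lo: u64, hi: u64) -> bool {
    // Prune when [min, max] and [lo, hi] are disjoint.
    !(stats.max_val_u64 < lo || stats.min_val_u64 > hi)
}

fn main() {
    let stats = ChunkStats { min_val_u64: 100, max_val_u64: 200 };
    assert!(chunk_may_match(&stats, 150, 300));  // overlaps: load chunk
    assert!(!chunk_may_match(&stats, 201, 300)); // disjoint: skip chunk
    println!("pruning check ok");
}
```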

This optimization is particularly effective for:

  • Range queries : WHERE col BETWEEN x AND y
  • Equality predicates : WHERE col = value
  • Sorted data : Natural clustering improves pruning

Sources: llkv-column-map/src/store/core.rs:679-690 llkv-column-map/src/store/descriptor.rs:40-70


String Filtering with Predicate Fusion

String filtering receives special optimization through the Utf8Filter implementation which supports fused multi-predicate evaluation:

```mermaid
flowchart TB
    RunFused["Utf8Filter::run_fused()"]
    SeparatePreds["Separate predicates"]
    ContainsPreds["Contains predicates"]
    OtherPreds["Other predicates"]
    LoadChunks["Load value & row_id chunks"]
    InitBitmask["Initialize BitMask\n(all bits = 1)"]
    FilterNulls["AND with null bitmask"]
    LoopContains["For each contains pattern"]
    VectorizedContains["Arrow::contains_utf8_scalar()"]
    AndMask["AND result into bitmask"]
    LoopOther["For each remaining bit"]
    EvalOther["Evaluate other predicates"]
    CollectRowIds["Collect matching row IDs"]

    RunFused --> SeparatePreds
    SeparatePreds --> ContainsPreds
    SeparatePreds --> OtherPreds
    SeparatePreds --> LoadChunks
    LoadChunks --> InitBitmask
    InitBitmask --> FilterNulls
    FilterNulls --> LoopContains
    LoopContains --> VectorizedContains
    VectorizedContains --> AndMask
    AndMask --> LoopContains
    LoopContains -->|Done| LoopOther
    LoopOther --> EvalOther
    EvalOther --> CollectRowIds
```

Key optimization techniques:
  1. Bitwise filtering using BitMask at llkv-column-map/src/store/scan/filter.rs:32-110:

    • Stores candidate rows as packed u64 words
    • Supports efficient AND operations
    • Tracks candidate count to short-circuit early
  2. Vectorized contains at llkv-column-map/src/store/scan/filter.rs:441-465:

    • Uses Arrow’s SIMD contains_utf8_scalar() kernel
    • Processes entire chunks without row-by-row iteration
    • Returns boolean arrays that AND into the bitmask
  3. Progressive filtering at llkv-column-map/src/store/scan/filter.rs:421-469:

    • Applies vectorized predicates first to eliminate most rows
    • Only evaluates slower per-row predicates on remaining candidates
    • Short-circuits when candidate count reaches zero
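
The packed-word bitmask idea can be sketched minimally; this is an illustration of the technique, not the real BitMask type, which carries more state:

```rust
// Packed candidate bitmask over row positions, one bit per row.
struct BitMask {
    words: Vec<u64>,
    len: usize,
}

impl BitMask {
    // Start with every candidate bit set, as in the fused path.
    fn all_set(len: usize) -> Self {
        let mut words = vec![u64::MAX; (len + 63) / 64];
        // Clear the unused high bits of the last word.
        if len % 64 != 0 {
            let last = words.len() - 1;
            words[last] = (1u64 << (len % 64)) - 1;
        }
        BitMask { words, len }
    }

    // AND another predicate's per-row boolean results into the mask.
    fn and_with(&mut self, other: &[bool]) {
        for (i, &keep) in other.iter().enumerate().take(self.len) {
            if !keep {
                self.words[i / 64] &= !(1u64 << (i % 64));
            }
        }
    }

    // Count surviving candidates (used to short-circuit at zero).
    fn count(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }
}

fn main() {
    let mut mask = BitMask::all_set(5);
    mask.and_with(&[true, false, true, true, false]);
    mask.and_with(&[true, true, false, true, true]);
    println!("{} candidates remain", mask.count()); // rows 0 and 3 survive
}
```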

Example scenario: a query that combines two substring patterns with a slower per-row predicate, for instance `WHERE s LIKE '%foo%' AND s LIKE '%bar%' AND LENGTH(s) > 10` (illustrative).

The fused evaluation:

  1. Vectorizes both LIKE patterns using contains
  2. ANDs the results to get a sparse candidate set
  3. Only evaluates LENGTH() on remaining rows

Sources: llkv-column-map/src/store/scan/filter.rs:336-504 llkv-column-map/src/store/scan/filter.rs:32-110


Filter Result Encoding

Filter results use run-length encoding to efficiently represent dense matches:

The FilterResult structure at llkv-column-map/src/store/scan/filter.rs:136-183 provides two representations:

Run-length encoding (dense matches):

  • Used when matching rows are mostly sequential
  • Each FilterRun stores start_row_id and len
  • Extremely compact for range queries or sorted scans
  • Example: rows [100, 101, 102, 103, 104] → FilterRun { start_row_id: 100, len: 5 }

Sparse representation (fallback):

  • Used when matches are scattered
  • Stores explicit Vec<u64> of row IDs
  • Automatically degrades when out-of-order matches detected
  • Example: rows [100, 200, 150, 300] → [100, 200, 150, 300]

Adaptive strategy at llkv-column-map/src/store/scan/filter.rs:543-590:

The FilterVisitor::record_match() method dynamically chooses encoding:

```
If match follows previous (row_id == prev + 1):
    Extend current run
Else if match is out of order (row_id < prev):
    Convert to sparse representation
Else:
    Start new run
```

This ensures optimal encoding regardless of data distribution.
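
That decision procedure can be sketched as follows; this is a simplified model of the visitor's state, not the actual FilterVisitor implementation:

```rust
// Run-length encoded span of consecutive matching row IDs.
#[derive(Debug, PartialEq)]
struct FilterRun {
    start_row_id: u64,
    len: u64,
}

// Simplified visitor state: runs while matches stay ordered, an
// explicit sparse list once ordering breaks down.
struct Matches {
    runs: Vec<FilterRun>,
    fallback_row_ids: Option<Vec<u64>>,
    prev_row_id: Option<u64>,
}

impl Matches {
    fn new() -> Self {
        Matches { runs: Vec::new(), fallback_row_ids: None, prev_row_id: None }
    }

    fn record_match(&mut self, row_id: u64) {
        if let Some(sparse) = &mut self.fallback_row_ids {
            sparse.push(row_id); // already degraded to sparse
        } else if self.prev_row_id == Some(row_id.wrapping_sub(1)) {
            // Consecutive: extend the current run.
            self.runs.last_mut().unwrap().len += 1;
        } else if self.prev_row_id.map_or(false, |p| row_id < p) {
            // Out of order: flatten runs into an explicit row ID list.
            let mut sparse: Vec<u64> = self
                .runs
                .drain(..)
                .flat_map(|r| r.start_row_id..r.start_row_id + r.len)
                .collect();
            sparse.push(row_id);
            self.fallback_row_ids = Some(sparse);
        } else {
            // Gap forward: start a new run.
            self.runs.push(FilterRun { start_row_id: row_id, len: 1 });
        }
        self.prev_row_id = Some(row_id);
    }
}

fn main() {
    let mut m = Matches::new();
    for row in [100u64, 101, 102, 200, 150] {
        m.record_match(row);
    }
    // Row 150 arrives out of order, so encoding degrades to sparse.
    println!("{:?}", m.fallback_row_ids);
}
```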

Sources: llkv-column-map/src/store/scan/filter.rs:115-183 llkv-column-map/src/store/scan/filter.rs:543-590


MVCC Filtering Integration

Filter evaluation integrates with Multi-Version Concurrency Control (MVCC) to enforce transaction visibility:

```mermaid
flowchart TB
    FilterRowIds["Table::filter_row_ids()"]
    CollectUserData["Collect matching user data rows"]
    MVCCCheck{{"MVCC enabled?"}}
    LoadCreatedBy["Load created_by column"]
    LoadDeletedBy["Load deleted_by column"]
    FilterCreated["Filter: created_by <= txn_id"]
    FilterDeleted["Filter: deleted_by = 0 OR\ndeleted_by > txn_id"]
    IntersectSets["Intersect bitmaps"]
    ReturnVisible["Return visible rows"]

    FilterRowIds --> CollectUserData
    CollectUserData --> MVCCCheck
    MVCCCheck -->|No| ReturnVisible
    MVCCCheck -->|Yes| LoadCreatedBy
    LoadCreatedBy --> LoadDeletedBy
    LoadDeletedBy --> FilterCreated
    FilterCreated --> FilterDeleted
    FilterDeleted --> IntersectSets
    IntersectSets --> ReturnVisible
```

MVCC columns are stored in separate logical namespaces:

| Namespace | Column | Purpose |
|---|---|---|
| `TxnCreatedBy` | `created_by` | Transaction ID that created this row version |
| `TxnDeletedBy` | `deleted_by` | Transaction ID that deleted this row (0 if active) |

Visibility rules at llkv-table/src/table.rs:1047-1095:

A row version is visible to transaction T if:

  1. created_by <= T.id (row was created before or by this transaction)
  2. deleted_by = 0 OR deleted_by > T.id (row not deleted, or deleted after this transaction)
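
The two rules translate directly into a predicate over the MVCC columns. A minimal sketch, with a hypothetical `RowVersion` struct standing in for the per-row MVCC metadata:

```rust
// Per-row MVCC metadata drawn from the created_by / deleted_by columns.
struct RowVersion {
    created_by: u64,
    deleted_by: u64, // 0 means the row version is still active
}

// Visibility check for transaction `txn_id`, mirroring rules 1 and 2.
fn is_visible(row: &RowVersion, txn_id: u64) -> bool {
    row.created_by <= txn_id && (row.deleted_by == 0 || row.deleted_by > txn_id)
}

fn main() {
    let active = RowVersion { created_by: 5, deleted_by: 0 };
    let deleted = RowVersion { created_by: 5, deleted_by: 8 };
    println!("{}", is_visible(&active, 10));  // created earlier, never deleted
    println!("{}", is_visible(&deleted, 10)); // deleted by txn 8 <= 10
}
```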

Implementation:

The collect_row_ids_for_table() method applies MVCC filtering after user predicate evaluation:

  1. Evaluate user predicates on user-data columns → bitmap A
  2. Evaluate created_by <= txn_id on MVCC column → bitmap B
  3. Evaluate deleted_by = 0 OR deleted_by > txn_id on MVCC column → bitmap C
  4. Return intersection: A ∩ B ∩ C

This ensures only transaction-visible row versions are returned to the query executor.

Sources: llkv-table/src/table.rs:851-1100 llkv-table/src/table.rs:231-437


Performance Characteristics

Filter evaluation performance depends on several factors:

| Scenario | Performance | Explanation |
|---|---|---|
| Equality on indexed column | O(log N) | Uses binary search in sorted chunks |
| Range query on sorted data | O(chunks) | Metadata pruning skips most chunks |
| String contains (single) | O(N) | Vectorized SIMD comparison |
| String contains (multiple) | ~O(N) | Fused evaluation with progressive filtering |
| Complex predicates | O(N × P) | Per-row evaluation of P predicates |
| Dense matches | High efficiency | Run-length encoding reduces memory |
| Sparse matches | Moderate overhead | Explicit row ID lists |

Optimization opportunities:

  1. Chunk-level parallelism : Filter evaluation at llkv-column-map/src/store/scan/filter.rs:381-494 uses Rayon for parallel chunk processing
  2. Early termination : Metadata pruning can skip 90%+ of chunks for range queries
  3. Cache efficiency : Sequential chunk traversal has good spatial locality
  4. SIMD operations : String operations use Arrow’s vectorized kernels

Sources: llkv-column-map/src/store/scan/filter.rs:381-494 llkv-column-map/src/store/core.rs:679-690


Integration with Scan Execution

Filter evaluation feeds into the scan execution pipeline (see Scan Execution and Optimization):

The two-phase execution model:

Phase 1: Filter evaluation (this page)

  • Evaluate predicates against columns
  • Produce bitmap of matching row IDs
  • Minimal data movement (only row ID metadata)

Phase 2: Data gathering (see Scan Execution and Optimization)

  • Use row ID bitmap to fetch actual column values
  • Assemble into Arrow RecordBatch
  • Apply projections and transformations

This separation enables:

  • Predicate pushdown : Filter before gathering data
  • Projection pruning : Only fetch required columns
  • Parallel execution : Filter and gather can overlap
  • Memory efficiency : Small bitmaps instead of full data
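
The two phases can be sketched end to end; the function names and `(row_id, value)` column representation are illustrative, not the llkv API:

```rust
// Phase 1: evaluate a predicate, producing only matching row IDs
// (minimal data movement: no values leave this phase).
fn filter_phase(col: &[(u64, i64)], pred: impl Fn(i64) -> bool) -> Vec<u64> {
    col.iter().filter(|(_, v)| pred(*v)).map(|(r, _)| *r).collect()
}

// Phase 2: gather values for the selected rows from a projected column.
fn gather_phase(col: &[(u64, i64)], row_ids: &[u64]) -> Vec<i64> {
    row_ids
        .iter()
        .filter_map(|rid| col.iter().find(|(r, _)| r == rid).map(|(_, v)| *v))
        .collect()
}

fn main() {
    let filter_col = [(0u64, 1i64), (1, 9), (2, 5)];
    let project_col = [(0u64, 100i64), (1, 200), (2, 300)];
    let ids = filter_phase(&filter_col, |v| v >= 5); // rows 1 and 2
    let values = gather_phase(&project_col, &ids);
    println!("{:?}", values); // projected values for the matching rows
}
```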

Sources: llkv-table/src/table.rs:490-496 llkv-scan/src/execute.rs:1-300
