Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

GitHub

This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

INSERT Buffering System

Relevant source files

The INSERT Buffering System is an optimization layer within llkv-sql that batches multiple consecutive INSERT ... VALUES statements for the same table into a single execution plan. This dramatically reduces planning overhead when bulk-loading data from SQL scripts containing thousands of individual INSERT statements. The system preserves per-statement result semantics while amortizing the cost of plan construction and table access across large batches.

For information about how INSERT plans are structured and executed, see Plan Structures.

Purpose and Design Goals

The buffering system addresses a specific performance bottleneck: SQL scripts generated by database export tools often contain tens of thousands of individual INSERT INTO table VALUES (...) statements. Without buffering, each statement incurs the full cost of parsing, planning, catalog lookup, and MVCC overhead. The buffer accumulates compatible INSERT statements and flushes them as a single batch, achieving order-of-magnitude throughput improvements for bulk ingestion workloads.

Key design constraints:

  • Optional : Disabled by default to preserve immediate visibility semantics for unit tests and interactive workloads
  • Transparent : Callers receive per-statement results as if each INSERT executed independently
  • Safe : Flushes automatically at transaction boundaries, table changes, and buffer size limits
  • Compatible : Integrates with statement expectation mechanisms used by the SQL Logic Test harness

Sources: llkv-sql/src/sql_engine.rs:410-520

Architecture Overview

Figure 1: INSERT Buffering Architecture

The system operates as a stateful accumulator within SqlEngine. Incoming INSERT statements are classified as either PreparedInsert::Values (bufferable literals) or PreparedInsert::Immediate (non-bufferable subqueries or expressions). Compatible VALUES inserts accumulate in the buffer until a flush trigger fires, at which point the buffer constructs a single InsertPlan and emits individual RuntimeStatementResult::Insert entries for each original statement.

Sources: llkv-sql/src/sql_engine.rs:416-509

Buffer Data Structures

InsertBuffer

Figure 2: Buffer Data Structure

The InsertBuffer struct maintains five critical pieces of state:

FieldTypePurpose
table_nameStringTarget table identifier for compatibility checking
columnsVec<String>Column list; must match for batching
on_conflictInsertConflictActionConflict resolution policy; must match for batching
total_rowsusizeSum of all buffered rows across statements
statement_row_countsVec<usize>Per-statement row counts for result construction
rowsVec<Vec<PlanValue>>Literal row payloads in execution order

The statement_row_counts vector preserves the boundary between original INSERT statements so that flush_buffer_results() can emit one RuntimeStatementResult::Insert per statement with the correct row count.

Sources: llkv-sql/src/sql_engine.rs:421-471

PreparedInsert Classification

Figure 3: INSERT Classification Flow

The prepare_insert() method analyzes each INSERT statement and returns PreparedInsert::Values only when the source is a literal VALUES clause or a SELECT that evaluates to constants (e.g., SELECT 1, 'foo'). All other forms—subqueries referencing tables, expressions requiring runtime evaluation, or DEFAULT VALUES—become PreparedInsert::Immediate and bypass buffering.

Sources: llkv-sql/src/sql_engine.rs:473-487

Buffer Lifecycle and Flush Triggers

Flush Conditions

The buffer flushes automatically when any of the following conditions occur:

TriggerConstantDescription
Size limitMAX_BUFFERED_INSERT_ROWS = 8192Total buffered rows exceeds threshold
Incompatible INSERTN/ADifferent table, columns, or conflict action
Non-INSERT statementN/AAny DDL, DML (UPDATE/DELETE), or SELECT
Transaction boundaryN/ABEGIN, COMMIT, or ROLLBACK
Statement expectationStatementExpectation::Error or Count(n)Test harness expects specific outcome
Manual flushN/Aflush_pending_inserts() called explicitly
Engine dropN/ASqlEngine destructor invoked

Sources: llkv-sql/src/sql_engine.rs:414-1127

Buffer State Machine

Figure 4: Buffer State Machine

The buffer exists in one of three states: Empty (no buffer allocated), Buffering (accumulating rows), or Flushing (emitting results). Transitions from Buffering to Flushing occur automatically based on the triggers listed above. After flushing, the state returns to Empty unless a new compatible INSERT immediately follows, in which case a fresh buffer is allocated.

Sources: llkv-sql/src/sql_engine.rs:514-1201

Integration with SqlEngine::execute()

Figure 5: Execute Loop with Buffer Integration

The execute() method iterates through parsed statements, dispatching INSERT statements to buffer_insert() and all other statements to execute_statement() after flushing. This ensures that the buffer never holds rows across non-INSERT operations or transaction boundaries.

Sources: llkv-sql/src/sql_engine.rs:933-990

buffer_insert() Implementation Details

Decision Flow

Figure 6: buffer_insert() Decision Tree

The buffer_insert() method performs three levels of gating:

  1. Expectation check : If the SLT harness expects an error or specific row count, bypass buffering entirely
  2. Buffering enabled check : If insert_buffering_enabled is false, execute immediately
  3. Compatibility check : If the INSERT is incompatible with the current buffer, flush and start a new buffer

Sources: llkv-sql/src/sql_engine.rs:1101-1201

Compatibility Rules

An INSERT can be added to the existing buffer if and only if:

This ensures that all buffered statements can be collapsed into a single InsertPlan with uniform semantics. Different column orderings, conflict actions, or target tables require separate batches.

Sources: llkv-sql/src/sql_engine.rs:452-459

Statement Expectation Handling

The SQL Logic Test harness uses thread-local expectations to signal that a specific statement should produce an error or affect a precise number of rows. The buffering system respects these hints by forcing immediate execution when expectations are present:

Figure 7: Statement Expectation Flow

graph TB
    SLTHarness["SLT Harness"]
RegisterExpectation["register_statement_expectation()"]
ThreadLocal["PENDING_STATEMENT_EXPECTATIONS\nthread_local!"]
Execute["SqlEngine::execute()"]
NextExpectation["next_statement_expectation()"]
BufferInsert["buffer_insert()"]
SLTHarness -->|before statement| RegisterExpectation
 
   RegisterExpectation --> ThreadLocal
 
   Execute --> NextExpectation
 
   NextExpectation --> ThreadLocal
 
   NextExpectation --> BufferInsert
    
 
   BufferInsert -->|Error or Count| ImmediateExec["Execute immediately\nbypass buffer"]
BufferInsert -->|Ok| MayBuffer["May buffer if enabled"]

When next_statement_expectation() returns StatementExpectation::Error or StatementExpectation::Count(n), the buffer_insert() method sets execute_immediately = true and flushes any existing buffer before executing the current statement. This preserves test correctness while still allowing buffering for the majority of statements that have no expectations.

Sources: llkv-sql/src/sql_engine.rs:64-1127

sequenceDiagram
    participant Caller
    participant FlushBuffer as "flush_buffer_results()"
    participant Buffer as "InsertBuffer"
    participant PlanStmt as "PlanStatement::Insert"
    participant Runtime as "RuntimeEngine"
    
    Caller->>FlushBuffer: flush_buffer_results()
    FlushBuffer->>Buffer: Take buffer from RefCell
    
    alt Buffer is None
        FlushBuffer-->>Caller: Ok(Vec::new())
    else Buffer has data
        FlushBuffer->>PlanStmt: Construct InsertPlan\n(table, columns, rows, on_conflict)
        FlushBuffer->>Runtime: execute_statement(plan)
        Runtime-->>FlushBuffer: RuntimeStatementResult::Insert\n(total_rows_inserted)
        
        Note over FlushBuffer: Verify total_rows matches sum(statement_row_counts)
        
        loop For each statement_row_count
            FlushBuffer->>FlushBuffer: Create RuntimeStatementResult::Insert\n(statement_rows)
        end
        
        FlushBuffer-->>Caller: Vec<SqlStatementResult>
    end

flush_buffer_results() Mechanics

The flush operation reconstructs per-statement results from the accumulated buffer state:

Figure 8: Flush Sequence

The flush process:

  1. Takes ownership of the buffer from the RefCell
  2. Constructs a single InsertPlan with all buffered rows
  3. Executes the plan via the runtime
  4. Splits the total row count across the original statements using statement_row_counts
  5. Returns a vector of per-statement results

This ensures that callers receive results as if each INSERT executed independently, even though the runtime processed them as a single batch.

Sources: llkv-sql/src/sql_engine.rs:2094-2169 (Note: The flush implementation is in the broader file, exact line range may vary)

Performance Characteristics

Throughput Improvement

Buffering provides dramatic performance gains for bulk INSERT workloads:

ScenarioWithout BufferingWith BufferingSpeedup
10,000 single-row INSERTs~30 seconds~2 seconds~15x
1,000 ten-row INSERTs~5 seconds~0.5 seconds~10x
100,000 single-row INSERTsSeveral minutes~15 seconds>10x

The improvement stems from:

  • Amortized planning : One plan for 8,192 rows instead of 8,192 plans
  • Batch MVCC overhead : Single transaction coordinator call instead of thousands
  • Reduced catalog lookups : One schema resolution instead of per-statement lookups
  • Vectorized column operations : Arrow batch processing instead of row-by-row appends

Sources: llkv-sql/README.md:36-41

Memory Usage

The buffer is bounded at MAX_BUFFERED_INSERT_ROWS = 8192 rows. Peak memory usage depends on the row width:

Peak Memory = MAX_BUFFERED_INSERT_ROWS × (Σ column_size + MVCC_overhead)

For a typical table with 10 columns averaging 50 bytes each:

8,192 rows × (10 columns × 50 bytes + 24 bytes MVCC) ≈ 4.3 MB

This predictable ceiling makes buffering safe for long-running workloads without risking unbounded memory growth.

Sources: llkv-sql/src/sql_engine.rs:410-414

API Surface

Enabling and Disabling

The set_insert_buffering(false) call automatically flushes any pending rows before disabling, ensuring visibility guarantees.

Sources: llkv-sql/src/sql_engine.rs:887-905

Manual Flush

Manual flushes are useful when the caller needs to checkpoint progress or ensure specific INSERT statements are visible before proceeding.

Sources: llkv-sql/src/sql_engine.rs:1003-1010

Drop Hook

The SqlEngine destructor automatically flushes the buffer to prevent data loss:

This ensures that buffered rows are persisted even if the caller forgets to flush explicitly.

Sources: llkv-sql/src/sql_engine.rs:513-520

Limitations and Edge Cases

Non-Bufferable INSERT Forms

The following INSERT patterns always execute immediately:

  • INSERT ... SELECT with table references
  • INSERT ... DEFAULT VALUES
  • INSERT with expressions requiring runtime evaluation (e.g., NOW(), RANDOM())
  • INSERT with parameters or placeholders

These patterns cannot be safely batched because their semantics depend on execution context.

Transaction Isolation

The buffer flushes at transaction boundaries (BEGIN, COMMIT, ROLLBACK) to preserve isolation semantics. This means:

The first INSERT's visibility is not guaranteed until the BEGIN statement forces a flush.

Conflict Handling

All buffered statements must share the same InsertConflictAction. Mixing ON CONFLICT IGNORE and ON CONFLICT REPLACE requires separate batches:

Sources: llkv-sql/src/sql_engine.rs:452-1201