
This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Pager Interface and SIMD Optimization


Purpose and Scope

This document describes the storage abstraction layer that provides persistent key-value storage for the columnar data model. The Pager trait defines a generic interface for batch-oriented storage operations, while the SIMD R-Drive library provides a memory-mapped, SIMD-optimized concrete implementation.

For column-level storage operations built on top of the pager, see Column Storage and ColumnStore. For table-level abstractions, see Table Abstraction.


System Architecture

The storage system follows a layered design where the Pager provides a simple key-value abstraction that the ColumnStore builds upon. All persistence operations flow through batch APIs to minimize I/O overhead and enable atomic commits.

```mermaid
graph TB
    subgraph "Column Storage Layer"
        ColumnStore["ColumnStore<P: Pager>"]
        ColumnCatalog["ColumnCatalog"]
        ColumnDescriptor["ColumnDescriptor"]
        ChunkMetadata["ChunkMetadata"]
    end

    subgraph "Pager Abstraction Layer"
        PagerTrait["Pager Trait"]
        BatchGet["BatchGet"]
        BatchPut["BatchPut"]
        GetResult["GetResult"]
        PhysicalKey["PhysicalKey (u64)"]
    end

    subgraph "SIMD R-Drive Implementation"
        SimdRDrive["simd-r-drive 0.15.5"]
        EntryHandle["EntryHandle"]
        MemoryMap["Memory-Mapped Files"]
        SimdOps["SIMD Operations"]
    end

    ColumnStore --> PagerTrait
    ColumnStore --> BatchGet
    ColumnStore --> BatchPut
    ColumnCatalog --> PagerTrait
    ColumnDescriptor --> PagerTrait
    ChunkMetadata --> PhysicalKey

    PagerTrait --> SimdRDrive
    BatchGet --> PhysicalKey
    BatchPut --> PhysicalKey
    GetResult --> EntryHandle

    SimdRDrive --> EntryHandle
    SimdRDrive --> MemoryMap
    SimdRDrive --> SimdOps
    EntryHandle --> MemoryMap
```

Storage Layering

Sources: llkv-column-map/src/store/core.rs:1-89 Cargo.toml:85-86


The Pager Trait

The Pager trait defines the storage interface used throughout the system. It abstracts over key-value stores with batch-oriented operations and explicit memory management.

Core Operations

| Operation | Purpose | Batch Size |
| --- | --- | --- |
| `batch_get(&[BatchGet])` | Retrieve multiple keys atomically | 1-N keys |
| `batch_put(&[BatchPut])` | Write multiple keys atomically | 1-N keys |
| `alloc()` | Allocate a single physical key | 1 key |
| `alloc_many(&[usize])` | Allocate multiple physical keys | N keys |
| `free(PhysicalKey)` | Deallocate a single key | 1 key |
| `free_many(&[PhysicalKey])` | Deallocate multiple keys | N keys |

Type Parameters

The Pager trait is generic over the blob type returned from reads.

The SIMD R-Drive implementation uses EntryHandle as the blob type, which provides zero-copy access to memory-mapped regions without allocating new buffers.
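The trait's exact definition is not reproduced on this page, so the following is an illustrative reconstruction of its shape from the operations table above. The names and signatures are assumptions, not the actual llkv-storage source, and a toy in-memory pager stands in for the SIMD R-Drive backend:

```rust
use std::collections::HashMap;

// Illustrative reconstruction of the Pager trait's shape, inferred from the
// operations table above. The real trait lives in llkv-storage; names and
// signatures here are assumptions.
pub type PhysicalKey = u64;

pub enum BatchGet {
    Raw { key: PhysicalKey },
}

pub enum GetResult<B> {
    Raw { key: PhysicalKey, bytes: B },
    Missing { key: PhysicalKey },
}

pub enum BatchPut {
    Raw { key: PhysicalKey, bytes: Vec<u8> },
}

pub trait Pager {
    // The associated blob type lets an implementation return zero-copy views
    // (e.g. an EntryHandle over a memory map) instead of owned buffers.
    type Blob: AsRef<[u8]>;

    fn batch_get(&self, gets: &[BatchGet]) -> Vec<GetResult<Self::Blob>>;
    fn batch_put(&mut self, puts: Vec<BatchPut>);
    fn alloc(&mut self) -> PhysicalKey;
}

// Minimal in-memory implementation for illustration only.
pub struct MemPager {
    next: PhysicalKey,
    data: HashMap<PhysicalKey, Vec<u8>>,
}

impl MemPager {
    pub fn new() -> Self {
        // Key 0 is reserved for the catalog root, so allocation starts at 1.
        MemPager { next: 1, data: HashMap::new() }
    }
}

impl Pager for MemPager {
    type Blob = Vec<u8>; // owned bytes stand in for a zero-copy handle

    fn batch_get(&self, gets: &[BatchGet]) -> Vec<GetResult<Self::Blob>> {
        gets.iter()
            .map(|get| match get {
                BatchGet::Raw { key } => match self.data.get(key) {
                    Some(bytes) => GetResult::Raw { key: *key, bytes: bytes.clone() },
                    None => GetResult::Missing { key: *key },
                },
            })
            .collect()
    }

    fn batch_put(&mut self, puts: Vec<BatchPut>) {
        for put in puts {
            match put {
                BatchPut::Raw { key, bytes } => {
                    self.data.insert(key, bytes);
                }
            }
        }
    }

    fn alloc(&mut self) -> PhysicalKey {
        let key = self.next;
        self.next += 1;
        key
    }
}
```

The associated Blob type is what allows the SIMD R-Drive backend to return zero-copy handles while a test pager can return plain owned buffers.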

Sources: llkv-column-map/src/store/core.rs:60-68 llkv-column-map/src/store/core.rs:89-124

Physical Keys

Physical keys are 64-bit unsigned integers (u64) that uniquely identify storage locations. The system partitions the key space for different purposes:

  • Catalog Root : Key 0 stores the ColumnCatalog
  • Column Descriptors : User-allocated keys for column metadata
  • Data Chunks : User-allocated keys for serialized Arrow arrays
  • Index Structures : User-allocated keys for permutations and bitmaps
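Since only the catalog root sits at a fixed location while all other keys come from the allocator, the partitioning reduces to a single well-known constant. A minimal sketch (the constant name is illustrative, not taken from llkv-storage):

```rust
// Hedged sketch: only the catalog root is a fixed, well-known key; every
// other key is handed out by the pager's allocator at runtime.
type PhysicalKey = u64;

/// Key 0 always holds the serialized ColumnCatalog (name is illustrative).
const CATALOG_ROOT_KEY: PhysicalKey = 0;

fn is_catalog_root(key: PhysicalKey) -> bool {
    key == CATALOG_ROOT_KEY
}
```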

Sources: llkv-storage/src/constants.rs (implied), llkv-column-map/src/store/core.rs:100-124


```mermaid
sequenceDiagram
    participant CS as ColumnStore
    participant Pager as Pager
    participant SIMD as SIMD R-Drive
    participant MM as Memory Map

    CS->>Pager: batch_get([BatchGet::Raw{key: 42}, ...])
    Pager->>SIMD: batch read request
    SIMD->>MM: locate pages for keys
    MM->>SIMD: zero-copy EntryHandles
    SIMD->>Pager: Vec<GetResult>
    Pager->>CS: Results with EntryHandles
    CS->>CS: deserialize Arrow arrays
```

Batch Operations

All storage I/O uses batch operations to minimize system call overhead and enable atomic multi-key transactions.

BatchGet Flow

Sources: llkv-column-map/src/store/core.rs:100-110 llkv-column-map/src/store/core.rs:414-425

The BatchGet enum specifies requests:

| Variant | Purpose |
| --- | --- |
| `Raw { key: PhysicalKey }` | Retrieve raw bytes for a key |

The GetResult enum returns outcomes:

| Variant | Meaning |
| --- | --- |
| `Raw { key: PhysicalKey, bytes: Blob }` | Successfully retrieved bytes |
| (others implied by usage) | Key not found, or other statuses |

BatchPut Flow

Sources: llkv-column-map/src/store/core.rs:217-222 llkv-column-map/src/store/core.rs:1117-1125

The BatchPut enum specifies writes:

| Variant | Purpose |
| --- | --- |
| `Raw { key: PhysicalKey, bytes: Vec<u8> }` | Write raw bytes to a key |

Why Batching Matters

Batching provides three critical benefits:

  1. Atomicity : All keys in a batch are committed together or not at all
  2. Performance : Amortizes system call overhead across many keys
  3. Prefetching : Enables SIMD R-Drive to optimize I/O patterns

The ColumnStore::append method demonstrates this pattern: it stages hundreds of puts (data chunks, descriptors, catalog updates) and commits them atomically at the end.
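A minimal sketch of the stage-then-commit pattern, with hypothetical names and an in-memory map standing in for the pager (not llkv's actual append implementation):

```rust
use std::collections::HashMap;

// Hypothetical sketch: all writes are accumulated in memory and applied in
// a single batch_put-style call, so nothing is visible until the commit.
type PhysicalKey = u64;

pub struct Store {
    pub pages: HashMap<PhysicalKey, Vec<u8>>,
}

impl Store {
    pub fn new() -> Self {
        Store { pages: HashMap::new() }
    }

    // Stand-in for Pager::batch_put: applies every staged write at once.
    fn batch_put(&mut self, puts: Vec<(PhysicalKey, Vec<u8>)>) {
        for (key, bytes) in puts {
            self.pages.insert(key, bytes);
        }
    }

    pub fn append_columns(&mut self, chunks: &[(PhysicalKey, &[u8])]) {
        // Stage phase: serialize chunks (and, in the real store, descriptor
        // and catalog updates) into one list of pending puts.
        let mut staged: Vec<(PhysicalKey, Vec<u8>)> = Vec::new();
        for (key, bytes) in chunks {
            staged.push((*key, bytes.to_vec()));
        }

        // Commit phase: one batch write at the very end.
        self.batch_put(staged);
    }
}
```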

Sources: llkv-column-map/src/store/core.rs:1117-1125


Physical Key Allocation

The pager manages a free-list of physical keys. Allocations are persistent—once allocated, a key remains valid until explicitly freed.

```mermaid
graph TD
    subgraph "Allocation Operations"
        A1["alloc() → PhysicalKey"]
        A2["alloc_many(n) → Vec&lt;PhysicalKey&gt;"]
    end

    subgraph "Deallocation Operations"
        F1["free(key)"]
        F2["free_many(&[key])"]
    end

    subgraph "Usage Examples"
        Desc["Column Descriptor\n1 key per column"]
        Chunks["Data Chunks\nN keys per append"]
        Indexes["Indexes\n1 key per index"]
    end

    A1 --> Desc
    A2 --> Chunks
    A1 --> Indexes

    Desc -.-> F1
    Chunks -.-> F2
```

Allocation Patterns

Sources: llkv-column-map/src/store/core.rs:250-263 llkv-column-map/src/store/core.rs:989-1006

Allocation Examples

The ColumnStore::append method allocates keys in bulk.

The ColumnStore::remove_column method frees all keys associated with a column.
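A hedged sketch of a free-list allocator illustrating the alloc_many/free_many contract described above; the real allocator is internal to simd-r-drive, and these names and internals are illustrative only:

```rust
// Hypothetical free-list allocator: freed keys are recycled before new ones
// are minted, and a bulk request costs one round-trip instead of N.
type PhysicalKey = u64;

pub struct KeyAllocator {
    next: PhysicalKey,
    free_list: Vec<PhysicalKey>,
}

impl KeyAllocator {
    pub fn new() -> Self {
        // Key 0 is reserved for the catalog root.
        KeyAllocator { next: 1, free_list: Vec::new() }
    }

    // One call yields N keys, reusing freed keys first.
    pub fn alloc_many(&mut self, n: usize) -> Vec<PhysicalKey> {
        let mut keys = Vec::with_capacity(n);
        while keys.len() < n {
            match self.free_list.pop() {
                Some(recycled) => keys.push(recycled),
                None => {
                    keys.push(self.next);
                    self.next += 1;
                }
            }
        }
        keys
    }

    // Keys stay valid until returned here, then become reusable.
    pub fn free_many(&mut self, keys: &[PhysicalKey]) {
        self.free_list.extend_from_slice(keys);
    }
}
```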

Sources: llkv-column-map/src/store/core.rs:990-1005 llkv-column-map/src/store/core.rs:563-587


```mermaid
graph TB
    subgraph "User Space"
        Pager["Pager API"]
        EntryHandle["EntryHandle\n(zero-copy view)"]
    end

    subgraph "SIMD R-Drive"
        PageTable["Page Table\n(key → page mapping)"]
        FreeList["Free List\n(available keys)"]
        Allocator["Allocator\n(SIMD-optimized)"]
    end

    subgraph "Operating System"
        MMap["mmap()\nsystem call"]
        FileBackend["Backing File"]
    end

    Pager --> EntryHandle
    Pager --> PageTable
    Pager --> FreeList
    PageTable --> Allocator
    FreeList --> Allocator

    Allocator --> MMap
    MMap --> FileBackend
    EntryHandle -.references.-> MMap
```

SIMD R-Drive Implementation

The simd-r-drive crate (version 0.15.5) provides a memory-mapped key-value store optimized with SIMD instructions.

Memory-Mapped Architecture

Sources: Cargo.toml:85-86 llkv-column-map/src/store/core.rs:22-25

EntryHandle: Zero-Copy Blobs

The EntryHandle type implements AsRef<[u8]> and provides direct access to memory-mapped regions without copying data.

When deserializing Arrow arrays, the system reads directly from the mapped memory.
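A minimal sketch of what such a zero-copy handle can look like, using an Arc<Vec<u8>> to stand in for the memory-mapped file; the real EntryHandle comes from simd-r-drive and its internals differ:

```rust
use std::sync::Arc;

// Hypothetical EntryHandle-like type: a cheap view into a shared buffer
// standing in for a memory-mapped region. Field names are illustrative.
pub struct EntryHandle {
    mmap: Arc<Vec<u8>>, // shared "mapped file"
    offset: usize,
    len: usize,
}

impl EntryHandle {
    pub fn new(mmap: Arc<Vec<u8>>, offset: usize, len: usize) -> Self {
        EntryHandle { mmap, offset, len }
    }
}

impl AsRef<[u8]> for EntryHandle {
    fn as_ref(&self) -> &[u8] {
        // No copy: just a borrowed slice of the shared region.
        &self.mmap[self.offset..self.offset + self.len]
    }
}

// A deserializer can accept any AsRef<[u8]> and read the bytes in place,
// whether they come from a memory map or an owned buffer.
pub fn read_magic(blob: &impl AsRef<[u8]>) -> Option<[u8; 4]> {
    blob.as_ref().get(..4)?.try_into().ok()
}
```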

Sources: llkv-column-map/src/store/core.rs:60-89 llkv-column-map/src/store/core.rs:196-210

SIMD Optimization Benefits

The SIMD R-Drive uses vectorized instructions for:

  1. Page Scanning : Parallel search for keys in page tables
  2. Free List Management : Vectorized bitmap operations for allocation
  3. Data Movement : SIMD memcpy for large blobs

While the Rust code in this repository doesn’t directly show SIMD instructions, the external simd-r-drive dependency provides these optimizations transparently through the Pager trait.

Sources: Cargo.toml:85 Cargo.lock:671-678


Integration with ColumnStore

The ColumnStore uses the pager for all persistent state, following consistent patterns across operations.

```mermaid
graph LR
    Catalog["ColumnCatalog\n(at key 0)"]
    Catalog -->|LogicalFieldId → pk_data| DataDesc["Data Descriptor\n(at pk_data)"]
    Catalog -->|LogicalFieldId → pk_rid| RidDesc["RowID Descriptor\n(at pk_rid)"]
    DataDesc -->|linked list| Page1["Descriptor Page\nChunkMetadata[]"]
    Page1 -->|next_page_pk| Page2["Descriptor Page\nChunkMetadata[]"]
    RidDesc -->|linked list| Page3["Descriptor Page\nChunkMetadata[]"]
```

Storage Pattern: Column Descriptors

Each column has two descriptors (one for data, one for row IDs) stored at fixed physical keys tracked in the catalog.
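A hypothetical sketch of this catalog-to-descriptor mapping; the field and type names are illustrative, not llkv's actual definitions:

```rust
use std::collections::HashMap;

// Illustrative sketch: the catalog maps each logical field to the fixed
// physical keys of its data and row-id descriptors.
type PhysicalKey = u64;
type LogicalFieldId = u32;

#[derive(Clone, Copy, Debug, PartialEq)]
pub struct DescriptorKeys {
    pub pk_data: PhysicalKey, // data descriptor location
    pub pk_rid: PhysicalKey,  // row-id descriptor location
}

pub struct ColumnCatalog {
    pub columns: HashMap<LogicalFieldId, DescriptorKeys>,
}

impl ColumnCatalog {
    // Look up both descriptor keys for a column in one step.
    pub fn descriptor_keys(&self, field: LogicalFieldId) -> Option<DescriptorKeys> {
        self.columns.get(&field).copied()
    }
}
```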

Sources: llkv-column-map/src/store/core.rs:100-124 llkv-column-map/src/store/core.rs:234-344

```mermaid
graph TB
    ChunkMeta["ChunkMetadata"]
    ChunkMeta -->|chunk_pk| DataBlob["Serialized Arrow Array\n(at chunk_pk)"]
    ChunkMeta -->|value_order_perm_pk| PermBlob["Sort Permutation\n(at perm_pk)"]
    DataBlob -.->|deserialized to| ArrayRef["ArrayRef\n(in-memory)"]
```

Storage Pattern: Data Chunks

Each chunk of Arrow array data is stored at a physical key referenced by ChunkMetadata.

Sources: llkv-column-map/src/store/core.rs:988-1006 llkv-column-map/src/store/descriptor.rs (implied)

```mermaid
sequenceDiagram
    participant App as Application
    participant CS as ColumnStore
    participant Pager as Pager

    App->>CS: append(RecordBatch)

    Note over CS: Stage Phase
    CS->>CS: serialize data chunks
    CS->>CS: serialize row-id chunks
    CS->>CS: update descriptors
    CS->>CS: update catalog
    CS->>CS: accumulate BatchPut[]

    Note over CS: Commit Phase
    CS->>Pager: batch_put(all_puts)
    Pager->>Pager: atomic commit

    alt Success
        Pager-->>CS: Ok(())
        CS->>CS: increment append_epoch
        CS-->>App: Ok(())
    else Failure
        Pager-->>CS: Err(...)
        CS-->>App: Err(...)
        Note over CS: No changes visible
    end
```

Transaction Boundary Example

The append operation demonstrates how the pager enables atomic multi-key transactions.

Sources: llkv-column-map/src/store/core.rs:787-1126

The key insight: all puts are staged in memory (Vec<BatchPut>) until a single atomic commit at line 1121. This ensures that partial failures leave the store in a consistent state.


Performance Characteristics

Batch Size Recommendations

The ColumnStore follows these heuristics:

| Operation | Typical Batch Size | Justification |
| --- | --- | --- |
| Catalog Load | 1 key | Single root key at startup |
| Descriptor Load | 2 keys | Data + RowID descriptors |
| Chunk Append | 10-100 keys | Multiple columns × chunks per column |
| Chunk Rewrite | 1-10 keys | Targeted LWW updates |
| Full Scan | 100+ keys | Prefetch all chunks for a column |

Sources: llkv-column-map/src/store/core.rs:100-110 llkv-column-map/src/store/core.rs:1117-1125

Memory Mapping Benefits

Memory-mapped storage provides:

  1. Zero-Copy Deserialization : Arrow arrays reference mapped memory directly
  2. OS-Level Caching : The kernel manages page cache automatically
  3. Lazy Loading : Pages are only faulted in when accessed

The EntryHandle type enables these patterns by avoiding buffer copies during reads.

Sources: llkv-column-map/src/store/core.rs:60-85 llkv-column-map/src/store/scan/filter.rs:18

Allocation Strategies

The system uses bulk allocation to reduce pager round-trips.

Sources: llkv-column-map/src/store/core.rs:989-1006


Error Handling and Recovery

Allocation Failures

When allocation fails mid-operation, the system attempts to roll back.
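A hedged sketch of this rollback pattern with a simulated allocation failure; the names and the failure mechanism are illustrative, not llkv's actual error handling:

```rust
// Hypothetical sketch: keys acquired before a mid-operation failure are
// returned to the free list so the store stays consistent.
type PhysicalKey = u64;

pub struct FakePager {
    next: PhysicalKey,
    pub freed: Vec<PhysicalKey>,
    fail_after: usize, // simulate an allocation failure for testing
}

impl FakePager {
    pub fn new(fail_after: usize) -> Self {
        FakePager { next: 1, freed: Vec::new(), fail_after }
    }

    fn alloc(&mut self) -> Result<PhysicalKey, String> {
        if self.fail_after == 0 {
            return Err("allocation failed".to_string());
        }
        self.fail_after -= 1;
        let key = self.next;
        self.next += 1;
        Ok(key)
    }

    fn free_many(&mut self, keys: &[PhysicalKey]) {
        self.freed.extend_from_slice(keys);
    }
}

// Allocate n keys, or roll back every key acquired so far on failure.
pub fn alloc_chunks(pager: &mut FakePager, n: usize) -> Result<Vec<PhysicalKey>, String> {
    let mut acquired = Vec::new();
    for _ in 0..n {
        match pager.alloc() {
            Ok(key) => acquired.push(key),
            Err(e) => {
                // Roll back: release everything acquired so far.
                pager.free_many(&acquired);
                return Err(e);
            }
        }
    }
    Ok(acquired)
}
```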

Sources: llkv-column-map/src/store/core.rs:563-587

Batch Operation Atomicity

The pager guarantees that either all keys in a batch_put are written or none are. This prevents partial writes from corrupting the storage state. The ColumnStore relies on this guarantee to implement transactional append and update operations.

Sources: llkv-column-map/src/store/core.rs:1117-1125


Summary

The Pager interface provides a minimal, batch-oriented abstraction over persistent key-value storage. The SIMD R-Drive implementation uses memory-mapped files and vectorized operations to optimize common patterns like bulk reads and zero-copy deserialization. By staging all writes and committing atomically, the ColumnStore builds transactional semantics on top of the simple key-value model.

Key Takeaways:

  • All I/O flows through batch operations (batch_get, batch_put)
  • Physical keys (u64) uniquely identify all persistent objects
  • EntryHandle enables zero-copy access to memory-mapped data
  • Bulk allocation (alloc_many) minimizes pager round-trips
  • Atomic batch commits enable transactional storage operations

Sources: llkv-column-map/src/store/core.rs:60-1126 Cargo.toml:85-86 llkv-storage (trait definitions implied by usage)
