Benchmarking
TPC-H, Nexmark, distributed benchmarks, and the Python comparison drivers.
Benchmarks live in the krishiv-bench crate (publish = false). Three categories: in-process Criterion benches, distributed CLI binaries, and Python comparison drivers.
Criterion benches
cargo bench -p krishiv-bench
Available benches:
| Bench | What it measures |
|---|---|
| tpch_sf10 | TPC-H Q1 at scale factor 10 (single-process). |
| tpch_distributed | TPC-H Q1 in distributed mode (coordinator + executor). |
| nexmark | Nexmark streaming benchmark (auction stream + 6 queries). |
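To run one bench instead of all three, Cargo's `--bench <name>` flag selects a single bench target. The sketch below assumes the Cargo bench target names match the names in the table, which this page does not confirm; `bench_cmd` is a hypothetical helper that only prints the command and rejects unknown names.

```shell
#!/bin/sh
# Hypothetical helper: print the cargo command for one bench from the table.
# Assumption: the Cargo bench target names match the table entries.
bench_cmd() {
    case "$1" in
        tpch_sf10|tpch_distributed|nexmark)
            echo "cargo bench -p krishiv-bench --bench $1"
            ;;
        *)
            echo "unknown bench: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `bench_cmd nexmark` prints the command to copy into a terminal or a CI step.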
Generating TPC-H / Nexmark data
The bench crate does not generate data itself; the data must exist before the benches run. The tpch_sf10 bench reads from KRISHIV_TPCH_DATA_DIR; generate the data with:
# Using DuckDB's TPC-H extension (the output directory must exist before COPY runs)
mkdir -p /tmp/tpch_sf10
duckdb -c "INSTALL tpch; LOAD tpch; CALL dbgen(sf=10); COPY lineitem TO '/tmp/tpch_sf10/lineitem.parquet' (FORMAT 'parquet');"
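Because the bench expects the data to be in place, a quick preflight check can catch a missing or misconfigured directory before a long run starts. This is a minimal sketch: `check_tpch_data` is a hypothetical helper, and it assumes the bench looks for `lineitem.parquet` directly under `KRISHIV_TPCH_DATA_DIR`, matching the path used in the DuckDB command above.

```shell
#!/bin/sh
# Hypothetical preflight: succeed only if the data directory holds lineitem.parquet.
# Assumption: the bench reads lineitem.parquet directly under KRISHIV_TPCH_DATA_DIR.
check_tpch_data() {
    # Use the first argument if given, otherwise fall back to the environment variable.
    dir="${1:-${KRISHIV_TPCH_DATA_DIR:-}}"
    if [ -z "$dir" ]; then
        echo "KRISHIV_TPCH_DATA_DIR is not set" >&2
        return 1
    fi
    if [ ! -f "$dir/lineitem.parquet" ]; then
        echo "missing $dir/lineitem.parquet" >&2
        return 1
    fi
    echo "ok: $dir/lineitem.parquet"
}
```

One way to use it is `export KRISHIV_TPCH_DATA_DIR=/tmp/tpch_sf10` followed by `check_tpch_data && cargo bench -p krishiv-bench`, so the bench only starts when the file is present.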
Reading the results
Criterion prints ns/iter, p50, p99, and a stability indicator. Look for:
- Regression vs the previous run.
- Allocation count per iter (visible in the flamegraph output).
- Whether the bench is throughput-bound or latency-bound (the <chart> HTML report shows this).
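Checking for a regression against the previous run comes down to a percentage change between two timings. Criterion reports its own change estimate when results from an earlier run are present; the sketch below only illustrates the arithmetic for numbers copied by hand. `pct_change` is a hypothetical helper, not part of krishiv-bench.

```shell
#!/bin/sh
# Hypothetical helper: percentage change from a previous timing to a current one.
# Both arguments must be in the same unit (for example ns/iter); positive means slower.
pct_change() {
    awk -v old="$1" -v new="$2" 'BEGIN {
        if (old <= 0) { print "previous time must be positive" > "/dev/stderr"; exit 1 }
        printf "%.1f\n", (new - old) / old * 100
    }'
}
```

For example, `pct_change 200 230` prints `15.0` (a 15% slowdown) and `pct_change 200 180` prints `-10.0` (a 10% speedup). Whether a given change counts as a regression depends on how much noise the bench shows between identical runs.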