ProductDocumentationExamplesBlogRoadmapGitHubGet Started
Preview

Exactly-once pipeline

Build a pipeline with exactly-once delivery using a certified source/sink/checkpoint combination.

Important: "Exactly-once" in Krishiv is a property of a specific source + sink + checkpoint combination, not a global guarantee. Use this recipe only with a certified combination; otherwise prefer at-least-once with an idempotent sink.

What "exactly-once" means here

The end-to-end delivery guarantee is the weakest guarantee supplied by the source, sink, checkpoint storage, and durability profile. Exactly-once requires all four:

ComponentRequirement
SourceSupports offset/position tracking (e.g. Kafka with group.id).
SinkTransactional or two-phase. Output is committed atomically with the checkpoint.
Checkpoint storageAtomic, fenced. Object store + etcd under distributed-durable.
CoordinatorFenced with an epoch token so stale completions are rejected.

Skeleton (Kafka → Iceberg, distributed-durable)

export KRISHIV_DURABILITY_PROFILE=distributed-durable
export KRISHIV_COORDINATOR=https://coord.internal:50051
export KRISHIV_COORDINATOR_BEARER_TOKEN=...
export KRISHIV_SHUFFLE_OBJECT_STORE_URI=s3://bucket/shuffle/
-- A transactional Iceberg sink
CREATE SINK orders_eo
TYPE ICEBERG
OPTIONS (
  'catalog.uri' = 'http://catalog:8181',
  'warehouse'   = 's3://bucket/wh',
  'commit'      = 'transactional'
);

START PIPELINE orders_raw TO orders_eo
AS SELECT * FROM orders_raw;

Verify

  • Force-kill an executor mid-pipeline; the coordinator fences it and restarts from the last committed checkpoint.
  • Replay the source; the sink should not produce duplicate committed snapshots.
  • Inspect commit metadata on the Iceberg table to confirm the snapshot lineage.

See Connectors for the certified source/sink matrix and Checkpointing for the protocol details.