State Overview
Keyed state, state backends, durability profiles, and the role of state in checkpoints.
Krishiv's state is per-key, typed, and persistent. It is the substrate for windows, joins, process functions, IVM, and any other operator that needs to remember something about a key. This page is the conceptual overview; the API reference lives in State Types and in Savepoints and Migration.
Concepts
| Concept | Definition |
|---|---|
| State | A per-key, named, typed value (or list / map / reducer / broadcast). Stored in a state backend. |
| State descriptor | The schema for a state slot: name, key encoding, value type, TTL. Created at job-submit time. |
| State backend | The store that actually persists state. Currently: InMemoryStateBackend or RocksDbStateBackend, with TtlStateBackend as an optional wrapper. |
| Checkpoint | A consistent snapshot of all state for a job, written to the configured checkpoint storage. Coordinated by the coordinator. |
| Savepoint | A user-triggered, named checkpoint. Listable, deletable, restorable. |
| Durability profile | A named preset that decides which backends are in use. One of dev-local, single-node-durable, or distributed-durable. |
Durability profiles
Set the profile at the session level via KRISHIV_DURABILITY_PROFILE (env) or by choosing the matching Session factory. Each profile selects the right combination of state, shuffle, and checkpoint backends.
| Profile | State | Shuffle | Checkpoints | When to use |
|---|---|---|---|---|
| dev-local | In-memory | In-memory | Ephemeral (in-process) | Examples, tests, dev laptops. |
| single-node-durable | RocksDB (local disk) | Local disk | Local filesystem | Single-host production. Restart-durable. |
| distributed-durable | RocksDB (restored from checkpoint) | Tiered: local + object store | Object store + etcd metadata | Multi-host. Fenced, fault-tolerant. |
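As a rough illustration of how a profile name from the table above could be resolved, here is a minimal sketch using only the standard library. The `DurabilityProfile` enum, `parse`, `survives_restart`, and `profile_from_env` are hypothetical names made up for this example and are not part of the Krishiv API; only the environment variable name and the three profile strings come from this page.

```rust
use std::env;

/// Hypothetical mirror of the three profiles listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DurabilityProfile {
    DevLocal,
    SingleNodeDurable,
    DistributedDurable,
}

impl DurabilityProfile {
    /// Parse a profile name exactly as it is spelled in the table.
    pub fn parse(name: &str) -> Option<Self> {
        match name.trim() {
            "dev-local" => Some(Self::DevLocal),
            "single-node-durable" => Some(Self::SingleNodeDurable),
            "distributed-durable" => Some(Self::DistributedDurable),
            _ => None,
        }
    }

    /// Per the table, only dev-local keeps everything in process memory.
    pub fn survives_restart(self) -> bool {
        !matches!(self, Self::DevLocal)
    }
}

/// Read KRISHIV_DURABILITY_PROFILE; None when unset or unrecognized.
pub fn profile_from_env() -> Option<DurabilityProfile> {
    env::var("KRISHIV_DURABILITY_PROFILE")
        .ok()
        .and_then(|v| DurabilityProfile::parse(&v))
}
```

Returning `None` for an unknown name, rather than silently falling back to a default, is a choice made for this sketch so that a typo in the variable cannot quietly select a non-durable profile.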
State encoding
State values are stored as a single byte buffer per (operator, key, slot). The encoding is:
[8-byte LE expires_at_ms (optional, when TTL is enabled)][postcard-serialized value]
The TTL prefix is included only if TtlStateBackend::set_watermark(...) has been called. When a key is read and its expires_at_ms is in the past, the value is treated as absent and the entry is removed on the next write. This means TTL works correctly even if the event-time watermark lags the wall clock.
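The layout described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not Krishiv's implementation: the function names are hypothetical, the timestamp is assumed to be an unsigned 64-bit integer, "in the past" is assumed to mean strictly less than the reference time, and the postcard-serialized value is treated as an opaque byte slice.

```rust
/// Width of the optional TTL prefix: one little-endian 64-bit timestamp.
const TTL_PREFIX_LEN: usize = 8;

/// Prepend the expiry timestamp to an already-serialized value.
/// `payload` stands in for the postcard bytes and is not interpreted.
pub fn encode_with_ttl(expires_at_ms: u64, payload: &[u8]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(TTL_PREFIX_LEN + payload.len());
    buf.extend_from_slice(&expires_at_ms.to_le_bytes());
    buf.extend_from_slice(payload);
    buf
}

/// Split a TTL-prefixed buffer and apply the read-side expiry check.
/// Returns None if the buffer is too short or the entry has expired.
/// Assumption: an entry is expired when expires_at_ms < now_ms.
pub fn decode_with_ttl(buf: &[u8], now_ms: u64) -> Option<&[u8]> {
    if buf.len() < TTL_PREFIX_LEN {
        return None;
    }
    let (prefix, payload) = buf.split_at(TTL_PREFIX_LEN);
    let mut ts = [0u8; TTL_PREFIX_LEN];
    ts.copy_from_slice(prefix);
    let expires_at_ms = u64::from_le_bytes(ts);
    if expires_at_ms < now_ms {
        None
    } else {
        Some(payload)
    }
}
```

The sketch only covers the read path. Physical removal of an expired entry, which this page says happens on the next write, is left out.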
Storage URIs
Checkpoints and shuffle data are written through URI-typed backends:
| URI | Backend |
|---|---|
| file:///var/krishiv/ckpt or a bare path such as /var/krishiv/ckpt | LocalFsCheckpointStorage |
| s3://bucket/path | ObjectStoreCheckpointStorage (S3, GCS, ADLS, MinIO; anything object_store supports) |
| memory:// (dev only) | EphemeralCheckpointStorage |
Helper: open_checkpoint_storage_from_uri(uri) -> Arc<dyn CheckpointStorage> picks the right backend.
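To make the URI-to-backend mapping concrete, the following sketch classifies a URI by its scheme. It is not the body of open_checkpoint_storage_from_uri: `StorageKind` and `classify_checkpoint_uri` are hypothetical names, and only the three URI forms listed in the table are recognized. How the real helper treats other object-store schemes is not covered here.

```rust
/// Hypothetical tag for the three backends named in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageKind {
    LocalFs,     // corresponds to LocalFsCheckpointStorage
    ObjectStore, // corresponds to ObjectStoreCheckpointStorage
    Ephemeral,   // corresponds to EphemeralCheckpointStorage
}

/// Pick a backend kind from the URI prefix.
/// Returns None for any form that the table does not list.
pub fn classify_checkpoint_uri(uri: &str) -> Option<StorageKind> {
    if uri.starts_with("file://") || uri.starts_with('/') {
        // file:// URIs and bare absolute paths both map to the local filesystem.
        Some(StorageKind::LocalFs)
    } else if uri.starts_with("s3://") {
        Some(StorageKind::ObjectStore)
    } else if uri.starts_with("memory://") {
        Some(StorageKind::Ephemeral)
    } else {
        None
    }
}
```

For example, `classify_checkpoint_uri("/var/krishiv/ckpt")` and `classify_checkpoint_uri("file:///var/krishiv/ckpt")` both yield `Some(StorageKind::LocalFs)`.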
Migrations
When you change the type or encoding of a state value across releases, register a migration:
use krishiv_api::{register_state_migration, state_migration, apply_state_migration};

// Migrate the "my_state" slot from schema version 1 to schema version 2.
register_state_migration("my_state", 1, 2, state_migration::<OldType, NewType>(
    |old| NewType { ... }
));
Migrations run automatically when a checkpoint with an older schema version is loaded. See Savepoints and Migration.
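One way to picture version-keyed migrations is a small registry that applies registered steps in sequence until the target version is reached. The sketch below is a standalone illustration, not Krishiv's mechanism: `MigrationRegistry`, `register`, and `upgrade` are hypothetical, steps operate on raw bytes rather than typed values, and whether Krishiv chains several steps in this way is an assumption.

```rust
use std::collections::HashMap;

/// A migration step rewrites the serialized bytes of one state value.
type MigrationFn = Box<dyn Fn(&[u8]) -> Vec<u8>>;

/// Hypothetical registry keyed by (state name, source version).
#[derive(Default)]
pub struct MigrationRegistry {
    steps: HashMap<(String, u32), (u32, MigrationFn)>,
}

impl MigrationRegistry {
    /// Register a step that upgrades `state` from version `from` to `to`.
    pub fn register<F>(&mut self, state: &str, from: u32, to: u32, f: F)
    where
        F: Fn(&[u8]) -> Vec<u8> + 'static,
    {
        // Steps must move forward, otherwise `upgrade` could loop forever.
        assert!(to > from, "migration must increase the schema version");
        self.steps.insert((state.to_string(), from), (to, Box::new(f)));
    }

    /// Apply steps starting at `version` until `target` is reached.
    /// Returns None if a step is missing or `target` cannot be hit exactly.
    pub fn upgrade(&self, state: &str, mut version: u32, target: u32, bytes: &[u8]) -> Option<Vec<u8>> {
        let mut current = bytes.to_vec();
        while version < target {
            let (next, step) = self.steps.get(&(state.to_string(), version))?;
            current = step(&current);
            version = *next;
        }
        if version == target {
            Some(current)
        } else {
            None
        }
    }
}
```

Keying by source version means each release only has to register the step from its predecessor, and a value saved several versions ago is brought forward one step at a time.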
See also
- State Types: ValueState, ListState, MapState, ReducingState, BroadcastState
- Savepoints and Migration
- Queryable State
- Timers
- Execution Model: durability profiles in detail