Kinesis
Reading from Amazon Kinesis Data Streams as a streaming source.
Krishiv's Kinesis connector is a thin wrapper over the AWS SDK Kinesis client. It uses the standard Kinesis polling model: GetShardIterator obtains an iterator for each shard, and GetRecords reads batches of records from that iterator.
Registering a source
session.register_kinesis_source(
    "orders_stream",              // name the source is registered under
    schema,                       // Arrow schema
    "us-east-1",                  // AWS region
    "my-stream",                  // Kinesis stream name
    "shard-iterator-type-latest", // latest | trim-horizon | at-sequence-number | after-sequence-number
)?;
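The iterator-type strings in the comment above correspond to values of the Kinesis `ShardIteratorType` parameter. The sketch below shows one way such a string could be validated and mapped to the API value. `IteratorType` and `api_name` are hypothetical names for illustration, not part of Krishiv's API, and treating the `shard-iterator-type-` prefix as optional is an assumption.

```rust
use std::str::FromStr;

/// Start positions listed in the registration example. Hypothetical type;
/// the variants mirror the Kinesis `ShardIteratorType` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IteratorType {
    Latest,
    TrimHorizon,
    AtSequenceNumber,
    AfterSequenceNumber,
}

impl IteratorType {
    /// Value the Kinesis GetShardIterator call expects for `ShardIteratorType`.
    fn api_name(self) -> &'static str {
        match self {
            IteratorType::Latest => "LATEST",
            IteratorType::TrimHorizon => "TRIM_HORIZON",
            IteratorType::AtSequenceNumber => "AT_SEQUENCE_NUMBER",
            IteratorType::AfterSequenceNumber => "AFTER_SEQUENCE_NUMBER",
        }
    }
}

impl FromStr for IteratorType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        // Assumption: the "shard-iterator-type-" prefix is optional.
        let name = s.strip_prefix("shard-iterator-type-").unwrap_or(s);
        match name {
            "latest" => Ok(IteratorType::Latest),
            "trim-horizon" => Ok(IteratorType::TrimHorizon),
            "at-sequence-number" => Ok(IteratorType::AtSequenceNumber),
            "after-sequence-number" => Ok(IteratorType::AfterSequenceNumber),
            other => Err(format!("unknown shard iterator type: {other}")),
        }
    }
}
```

With this in place, `"shard-iterator-type-latest".parse::<IteratorType>()` yields `Ok(IteratorType::Latest)`, and an unrecognized string is rejected before any call to AWS is made.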
Offsets and checkpointing
Krishiv's checkpoint integration records the Kinesis SequenceNumber of each shard in the SourceOffset. On restart, the source resumes each shard from its checkpointed sequence number. If the checkpoint is missing or the stream was re-created, it falls back to the configured shard iterator type.
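The resume rule can be sketched as a per-shard decision. This is an illustration under stated assumptions, not Krishiv's implementation: `StartPosition` and `resume_positions` are hypothetical names, the checkpoint is modeled as a plain map from shard ID to sequence number, and resuming with an after-sequence-number iterator (so the checkpointed record is not read twice) is an assumption. Only the missing-checkpoint case is modeled; detecting a re-created stream is left out.

```rust
use std::collections::HashMap;

/// Where a shard reader should start. Hypothetical type, not Krishiv's API.
#[derive(Debug, Clone, PartialEq, Eq)]
enum StartPosition {
    /// Resume just past the checkpointed record.
    AfterSequenceNumber(String),
    /// No checkpoint for this shard: use the configured iterator type.
    Configured(String),
}

/// Picks a start position for every shard currently in the stream.
/// `checkpoint` maps shard ID to the last checkpointed SequenceNumber.
fn resume_positions(
    live_shards: &[&str],
    checkpoint: &HashMap<String, String>,
    fallback: &str,
) -> Vec<(String, StartPosition)> {
    live_shards
        .iter()
        .map(|shard| {
            let position = match checkpoint.get(*shard) {
                Some(seq) => StartPosition::AfterSequenceNumber(seq.clone()),
                None => StartPosition::Configured(fallback.to_string()),
            };
            (shard.to_string(), position)
        })
        .collect()
}
```

A shard that appears in the checkpoint resumes from its stored sequence number, while a shard with no entry (for example, one created by a reshard after the checkpoint was taken) starts from the fallback.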
Limits and quotas
- Max RecordsPerShardPerSecond: 1 000 (soft quota). Plan partitions accordingly.
- Max record size: 1 MB.
- GetRecords returns up to 10 MB or 10 000 records per call.
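As a rough sizing aid, the record-count limits above translate into simple ceiling divisions. The helpers below are a back-of-the-envelope sketch with hypothetical names; they consider record counts only and ignore the byte-based limits, so treat the results as lower bounds.

```rust
/// Record-count limits quoted in the list above.
const MAX_RECORDS_PER_SHARD_PER_SEC: u64 = 1_000;
const MAX_RECORDS_PER_GET_RECORDS: u64 = 10_000;

/// Ceiling division for positive divisors.
fn ceil_div(n: u64, d: u64) -> u64 {
    (n + d - 1) / d
}

/// Lower bound on the shards needed to sustain `records_per_sec`.
/// A stream always has at least one shard.
fn min_shards(records_per_sec: u64) -> u64 {
    ceil_div(records_per_sec, MAX_RECORDS_PER_SHARD_PER_SEC).max(1)
}

/// Lower bound on the GetRecords calls needed to drain `backlog` records
/// from one shard, ignoring the 10 MB per-call cap.
fn min_get_records_calls(backlog: u64) -> u64 {
    ceil_div(backlog, MAX_RECORDS_PER_GET_RECORDS)
}
```

For example, a producer writing 4 500 records per second needs at least 5 shards, and a 25 000-record backlog on one shard takes at least 3 GetRecords calls to drain.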
Auth
Credentials are resolved through the standard AWS SDK auth chain: environment variables first, then IRSA or the instance profile served by IMDS. Explicit aws.access_key_id / aws.secret_access_key options are honored but not recommended in production.
Preview: source-only in this release. There is no Kinesis sink.