Deploy on Kubernetes
Install the operator, declare a KrishivCluster, and verify a streaming job runs end-to-end.
The fastest path to a distributed Krishiv cluster is the krishiv-operator with a KrishivCluster custom resource. The operator reconciles that resource into a coordinator Deployment, an executor Deployment, Services, and (if your cluster supports it) an Ingress.
Prerequisites
- Kubernetes 1.27+
- cert-manager (optional; only needed for the operator's webhook)
- An S3 / GCS / ADLS bucket for checkpoints and shuffle
- A container registry you can push to
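A quick preflight can catch a cluster that is too old before you build anything. The helper below is only a sketch and is not shipped with Krishiv: `k8s_minor_ok` is a hypothetical function that takes a version string (for example the server's `gitVersion`) and checks it against the 1.27 minimum listed above.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of Krishiv).
# Returns 0 if the given Kubernetes version string is 1.27 or newer.
k8s_minor_ok() {
  local version="${1:-}"
  version="${version#v}"          # strip a leading "v": v1.29.4 -> 1.29.4
  local major="${version%%.*}"    # text before the first dot
  local rest="${version#*.}"      # text after the first dot
  local minor="${rest%%[!0-9]*}"  # leading digits only; drops ".4" or "-gke.100"
  [ -n "$major" ] && [ -n "$minor" ] || return 1
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 27 ]; }
}
```

One way to use it, assuming `jq` is installed: `k8s_minor_ok "$(kubectl version -o json | jq -r '.serverVersion.gitVersion')" || echo "cluster is older than 1.27"`.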
Step 1 — Build and push the image
docker build -t my-registry.example.com/krishiv:v0.1.0 -f Dockerfile.fast .
docker push my-registry.example.com/krishiv:v0.1.0
The same image runs as the coordinator, the executor, and the operator. The binary name determines the role.
Step 2 — Apply CRDs and the operator
kubectl apply -f k8s/operator/krishiv-crd.yaml
kubectl apply -f k8s/operator/operator-deployment.yaml
Or with Helm (if your team has a chart):
helm install krishiv-operator ./charts/krishiv-operator \
  --set image.repository=my-registry.example.com/krishiv \
  --set image.tag=v0.1.0
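Whichever install path you use, it can help to wait until the operator is actually ready before declaring a cluster. The sketch below wraps two standard kubectl commands in a function. The CRD name `krishivclusters.krishiv.io` and the Deployment name `krishiv-operator` are assumptions inferred from the manifests above, so check them against your own files.

```shell
#!/usr/bin/env bash
# Sketch: block until the operator is ready before creating a KrishivCluster.
# ASSUMPTIONS: the CRD is named krishivclusters.krishiv.io and the operator
# Deployment is named krishiv-operator. Neither name is confirmed by the docs.
wait_for_operator() {
  local crd="${1:-krishivclusters.krishiv.io}"
  local deployment="${2:-krishiv-operator}"
  local timeout="${3:-120s}"
  # The API server must accept the new kind before a KrishivCluster is applied.
  kubectl wait --for=condition=Established "crd/${crd}" --timeout="${timeout}" || return 1
  # Then wait for the operator pods themselves to finish rolling out.
  kubectl rollout status "deployment/${deployment}" --timeout="${timeout}"
}
```

Calling `wait_for_operator` with no arguments uses the assumed names. Pass your own CRD name, Deployment name, and timeout as positional arguments if they differ.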
Step 3 — Declare a KrishivCluster
apiVersion: krishiv.io/v1
kind: KrishivCluster
metadata:
  name: prod
spec:
  image: my-registry.example.com/krishiv:v0.1.0
  coordinators: 1
  executors: 4
  durabilityProfile: distributed-durable
  checkpointStorage:
    uri: s3://my-bucket/krishiv/checkpoints/
  shuffleStorage:
    uri: s3://my-bucket/krishiv/shuffle/
  auth:
    bearerTokenSecret: krishiv-bearer-token
  config:
    KRISHIV_OIDC_AUDIENCE: krishiv-prod
    KRISHIV_OIDC_JWKS_URI: https://auth.example.com/.well-known/jwks.json
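The manifest refers to a Secret named `krishiv-bearer-token`, which presumably has to exist in the same namespace as the cluster. The following is a minimal sketch of one way to create it. The key name `token` is an assumption, because this page does not say which key the operator reads, and `generate_token` and `create_bearer_secret` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: create the Secret named by auth.bearerTokenSecret.
# ASSUMPTION: the operator reads the token from a key called "token".
generate_token() {
  # 32 random bytes rendered as 64 lowercase hex characters.
  head -c 32 /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

create_bearer_secret() {
  local name="${1:-krishiv-bearer-token}"
  # Creates the Secret in the current namespace; fails if it already exists.
  kubectl create secret generic "${name}" \
    --from-literal=token="$(generate_token)"
}
```

Run `create_bearer_secret` before applying the KrishivCluster so the reference resolves on the first reconcile.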
Step 4 — Verify
kubectl get krishivcluster prod
kubectl get pods -l app=krishiv,role=coordinator
kubectl get pods -l app=krishiv,role=executor
# Port-forward in the background and check the UI
kubectl port-forward svc/krishiv-coordinator 2002:2002 &
open http://localhost:2002/ui   # macOS; use xdg-open on Linux
# Run a SQL query against the cluster
kubectl port-forward svc/krishiv-coordinator 2003:2003 &
krishiv sql --remote -c grpc://localhost:2003 --query "SELECT 1"
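The `kubectl get pods` checks above show a snapshot. In a script it is often more convenient to block until the pods report Ready. The sketch below reuses the label selectors from this step; the function name and the default timeout are assumptions. Note that `kubectl wait` returns an error if no pods match the selector yet, so give the operator a moment to create them first.

```shell
#!/usr/bin/env bash
# Sketch: wait for the coordinator and executor pods to become Ready.
# The label selectors come from the commands in this step; the default
# timeout of 180s is an arbitrary assumption.
wait_for_cluster_pods() {
  local timeout="${1:-180s}"
  local role
  for role in coordinator executor; do
    kubectl wait --for=condition=Ready pod \
      -l "app=krishiv,role=${role}" --timeout="${timeout}" || return 1
  done
}
```

For example, `wait_for_cluster_pods 300s && echo "cluster is ready"` could gate the port-forward and the SQL check.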
Step 5 — Submit a job
apiVersion: krishiv.io/v1
kind: KrishivJob
metadata:
  name: orders-per-minute
spec:
  cluster: prod
  sql: |
    CREATE SOURCE orders TYPE KAFKA
      OPTIONS ('brokers' = 'broker:9092', 'topic' = 'orders', 'group.id' = 'krishiv-app');
    CREATE SINK per_minute TYPE ICEBERG
      OPTIONS ('catalog.uri' = '...', 'warehouse' = '...', 'commit' = 'transactional', 'table' = 'orders_per_minute');
    START PIPELINE orders TO per_minute AS
      SELECT tumble_start(event_time, INTERVAL '1 minute') AS window_start,
             customer_id, SUM(amount) AS total
      FROM orders
      GROUP BY tumble_start(event_time, INTERVAL '1 minute'), customer_id;
  checkpoint:
    location: s3://my-bucket/krishiv/ckpt/orders-per-minute/
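To submit the job, save the manifest to a file and apply it like any other Kubernetes resource. The sketch below assumes the job can be addressed as `krishivjob`, by analogy with `kubectl get krishivcluster` in Step 4; the function name and the file name are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: apply a KrishivJob manifest and print the resulting resource.
# ASSUMPTION: "krishivjob" is a valid resource name for kubectl, mirroring
# the "krishivcluster" name used in Step 4.
submit_job() {
  local manifest="${1:-}"
  local name="${2:-orders-per-minute}"
  if [ -z "${manifest}" ]; then
    echo "usage: submit_job <manifest.yaml> [job-name]" >&2
    return 2
  fi
  kubectl apply -f "${manifest}" || return 1
  # Show the full resource, including whatever status the operator has written.
  kubectl get krishivjob "${name}" -o yaml
}
```

With the manifest saved as `orders-per-minute.yaml`, `submit_job orders-per-minute.yaml` applies it and prints the stored resource.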