Observability Overview
Metrics, logs, traces, and health endpoints — wired in via the krishiv_metrics crate.
Every Krishiv process — CLI, daemon, executor, even a Python session — initializes the same observability stack on startup. This page is the index; details live in the sub-pages.
Initialization
init is called from krishiv/main.rs and from every daemon entry point:
use krishiv_metrics::{init, MetricsConfig, TracerExporter};

let cfg = MetricsConfig {
    service_name: "krishiv-coordinator".into(),
    otlp_endpoint: std::env::var("OTEL_EXPORTER_OTLP_ENDPOINT").ok(),
    log_filter: "info,krishiv_scheduler=debug".into(),
    exporter: TracerExporter::Otlp,
    ..Default::default()
};
let _handle = init(cfg)?;
Defaults: service_name = "krishiv", no OTLP endpoint, and a log filter taken from RUST_LOG, falling back to info when it is unset.
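The documented defaults can be pictured with a small stand-alone sketch. SketchConfig and resolve_log_filter are hypothetical names invented for illustration; they are not the crate's MetricsConfig, and treating an empty RUST_LOG as unset is an assumption rather than documented behavior.

```rust
use std::env;

/// Hypothetical stand-in for the crate's config. Field names follow the
/// snippet above, but this is NOT the real `MetricsConfig`.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchConfig {
    pub service_name: String,
    pub otlp_endpoint: Option<String>,
    pub log_filter: String,
}

/// Resolve the log filter: use the RUST_LOG value when present, else "info".
/// Assumption: a blank value is treated the same as an unset variable.
pub fn resolve_log_filter(rust_log: Option<String>) -> String {
    rust_log
        .filter(|s| !s.trim().is_empty())
        .unwrap_or_else(|| "info".to_string())
}

impl Default for SketchConfig {
    fn default() -> Self {
        SketchConfig {
            // Documented default service name.
            service_name: "krishiv".to_string(),
            // Documented default: no OTLP endpoint.
            otlp_endpoint: None,
            // Documented default: RUST_LOG or "info".
            log_filter: resolve_log_filter(env::var("RUST_LOG").ok()),
        }
    }
}
```

Keeping the fallback in a pure function that takes the variable's value as an argument makes the rule easy to test without touching the process environment.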
Three pillars
| Pillar | Where it lives | Where you read it |
|---|---|---|
| Metrics | In-process KrishivMetrics singleton; rendered as Prometheus text. | GET /metrics, Grafana, OTLP collector |
| Logs | tracing-subscriber with env-filter and FMT or JSON output. | stdout / journald / your log shipper |
| Traces | OpenTelemetry via tracing-opentelemetry; W3C trace-context over gRPC metadata. | OTLP collector, Jaeger, Tempo |
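For readers unfamiliar with W3C trace-context, the traceparent value that travels in gRPC metadata has the shape version-traceid-parentid-flags. The following is a minimal parser sketch for version 00 only. TraceParent, parse_traceparent, and format_traceparent are illustrative names, not part of krishiv_metrics, and the crate may well delegate this work to the OpenTelemetry propagator instead.

```rust
/// Parsed fields of a W3C `traceparent` header (version 00 only).
/// Illustrative type; not part of krishiv_metrics.
#[derive(Debug, PartialEq, Eq)]
pub struct TraceParent {
    pub trace_id: String,  // 32 lowercase hex characters
    pub parent_id: String, // 16 lowercase hex characters
    pub sampled: bool,     // bit 0 of the trace-flags byte
}

/// True when `s` is exactly `len` lowercase hex characters.
fn is_lower_hex(s: &str, len: usize) -> bool {
    s.len() == len && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'))
}

/// Parse a version-00 traceparent value; returns None for anything malformed.
pub fn parse_traceparent(header: &str) -> Option<TraceParent> {
    let parts: Vec<&str> = header.trim().split('-').collect();
    if parts.len() != 4 {
        return None;
    }
    let (version, trace_id, parent_id, flags) = (parts[0], parts[1], parts[2], parts[3]);
    let well_formed = version == "00"
        && is_lower_hex(trace_id, 32)
        && is_lower_hex(parent_id, 16)
        && is_lower_hex(flags, 2);
    if !well_formed {
        return None;
    }
    // The specification treats all-zero trace and parent ids as invalid.
    let all_zero = |s: &str| s.bytes().all(|b| b == b'0');
    if all_zero(trace_id) || all_zero(parent_id) {
        return None;
    }
    let flag_bits = u8::from_str_radix(flags, 16).ok()?;
    Some(TraceParent {
        trace_id: trace_id.to_string(),
        parent_id: parent_id.to_string(),
        sampled: (flag_bits & 0x01) == 0x01,
    })
}

/// Render the header again. Only the sampled bit is preserved; any other
/// flag bits present in the input are dropped by this sketch.
pub fn format_traceparent(tp: &TraceParent) -> String {
    let flags: u8 = if tp.sampled { 1 } else { 0 };
    format!("00-{}-{}-{:02x}", tp.trace_id, tp.parent_id, flags)
}
```

A value in this shape is what current_traceparent() would be expected to return as a string when a span is active, though that is inferred from the function name and return type rather than stated on this page.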
Public surface
- krishiv_metrics::MetricsConfig / TracerExporter::{Otlp, Stdout, InMemory}
- krishiv_metrics::init(config) -> Result<MetricsHandle>
- krishiv_metrics::current_traceparent() -> Option<String> / current_tracestate()
- krishiv_metrics::global_metrics() -> &'static KrishivMetrics
- krishiv_metrics::render_prometheus() -> String
- krishiv_metrics::system_metrics() -> &'static SystemMetrics
- Typed reports: ObservabilityReport and its ReportJob, ReportTask, ReportExecutor, ReportCheckpoint, ReportShuffle, ReportStreamingState sub-types, for programmatic status surfaces.
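To make the singleton-plus-text-rendering pattern concrete, here is a self-contained sketch of a process-wide counter rendered in the Prometheus text exposition format. SketchMetrics, sketch_global_metrics, and the metric name sketch_tasks_completed_total are all hypothetical; the fields and output of the real KrishivMetrics and render_prometheus() are not described on this page.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical metrics registry with a single counter.
/// Not the real `KrishivMetrics`.
pub struct SketchMetrics {
    tasks_completed: AtomicU64,
}

impl SketchMetrics {
    pub const fn new() -> Self {
        SketchMetrics { tasks_completed: AtomicU64::new(0) }
    }

    /// Increment the counter; safe to call from any thread.
    pub fn inc_tasks_completed(&self) {
        self.tasks_completed.fetch_add(1, Ordering::Relaxed);
    }

    /// Render the counter as Prometheus text: HELP line, TYPE line, sample.
    pub fn render(&self) -> String {
        let name = "sketch_tasks_completed_total";
        let help = "Tasks completed (hypothetical metric).";
        let value = self.tasks_completed.load(Ordering::Relaxed);
        format!("# HELP {name} {help}\n# TYPE {name} counter\n{name} {value}\n")
    }
}

/// Process-wide instance, mirroring the `&'static` accessor style listed above.
static METRICS: SketchMetrics = SketchMetrics::new();

pub fn sketch_global_metrics() -> &'static SketchMetrics {
    &METRICS
}
```

An HTTP handler for GET /metrics would then only need to return the rendered string with a text/plain content type.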
Environment variables
| Variable | Effect |
|---|---|
| OTEL_EXPORTER_OTLP_ENDPOINT | Enables OTLP trace export to this endpoint. |
| KRISHIV_PRODUCTION | Set to anything truthy to fail-closed on unsafe overrides (alpha API, anonymous HTTP, manual Kafka commit). |
| KRISHIV_METRICS_PORT | Exposes a dedicated /metrics HTTP endpoint on this port in addition to the one on the coordinator HTTP server. |
| RUST_LOG | Log filter, e.g. info,krishiv_scheduler=debug,sqlx=warn. |
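As a rough illustration of how a process might interpret these variables at startup, the sketch below parses them into a typed struct. EnvSettings, is_truthy, and parse_metrics_port are hypothetical helpers, and the set of spellings accepted as truthy (1, true, yes, on) is an assumption: the table only says "anything truthy".

```rust
use std::env;

/// Assumed truthy spellings, compared case-insensitively.
/// The real rule used by Krishiv is not specified on this page.
pub fn is_truthy(raw: Option<&str>) -> bool {
    match raw.map(|s| s.trim().to_ascii_lowercase()) {
        Some(v) => matches!(v.as_str(), "1" | "true" | "yes" | "on"),
        None => false,
    }
}

/// Parse an optional port. Unset or blank means "no dedicated endpoint";
/// port 0 and non-numeric or out-of-range values are rejected.
pub fn parse_metrics_port(raw: Option<&str>) -> Result<Option<u16>, String> {
    match raw.map(str::trim) {
        None | Some("") => Ok(None),
        Some(v) => match v.parse::<u16>() {
            Ok(0) => Err("KRISHIV_METRICS_PORT must be non-zero".to_string()),
            Ok(port) => Ok(Some(port)),
            Err(e) => Err(format!("invalid KRISHIV_METRICS_PORT {v:?}: {e}")),
        },
    }
}

/// Hypothetical typed view of the variables in the table above.
#[derive(Debug, PartialEq)]
pub struct EnvSettings {
    pub otlp_endpoint: Option<String>,
    pub production: bool,
    pub metrics_port: Option<u16>,
}

/// Read the process environment once and validate it up front.
pub fn settings_from_env() -> Result<EnvSettings, String> {
    let production = env::var("KRISHIV_PRODUCTION").ok();
    let port = env::var("KRISHIV_METRICS_PORT").ok();
    Ok(EnvSettings {
        otlp_endpoint: env::var("OTEL_EXPORTER_OTLP_ENDPOINT").ok(),
        production: is_truthy(production.as_deref()),
        metrics_port: parse_metrics_port(port.as_deref())?,
    })
}
```

Rejecting a malformed port at startup, rather than silently ignoring it, matches the fail-closed spirit of KRISHIV_PRODUCTION, though whether Krishiv itself does so is not stated here.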