Preview

Avro

Reading and writing Avro files with optional Confluent schema registry.

Avro is supported as a file format. The schema is read from each file's header. If a Confluent schema registry is configured, the connector can fetch the writer's schema by ID.
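To make "read from the file header" concrete, here is a minimal std-only sketch of how an Avro object container file header is laid out: the magic bytes Obj\x01, a metadata map (which carries avro.schema and avro.codec), and a 16-byte sync marker. The function names read_long, read_bytes, and read_header are illustrative and are not part of the connector, which relies on apache_avro for this work.

```rust
use std::collections::HashMap;

/// Magic bytes that open every Avro object container file: "Obj" followed by 0x01.
const AVRO_MAGIC: [u8; 4] = [b'O', b'b', b'j', 1];

/// Decode one zigzag-encoded variable-length long, advancing `pos`.
fn read_long(buf: &[u8], pos: &mut usize) -> Option<i64> {
    let mut value: u64 = 0;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            break;
        }
        shift += 7;
        if shift > 63 {
            return None; // more than 10 bytes cannot be a valid long
        }
    }
    // Undo zigzag encoding: (n >> 1) ^ -(n & 1).
    Some(((value >> 1) as i64) ^ -((value & 1) as i64))
}

/// Read a length-prefixed byte run (Avro `string` and `bytes` share this layout).
fn read_bytes<'a>(buf: &'a [u8], pos: &mut usize) -> Option<&'a [u8]> {
    let len = read_long(buf, pos)?;
    if len < 0 {
        return None;
    }
    let end = (*pos).checked_add(len as usize)?;
    let slice = buf.get(*pos..end)?;
    *pos = end;
    Some(slice)
}

/// Parse the header: returns the metadata map and the 16-byte sync marker.
fn read_header(buf: &[u8]) -> Option<(HashMap<String, Vec<u8>>, [u8; 16])> {
    if !buf.starts_with(&AVRO_MAGIC) {
        return None;
    }
    let mut pos = 4;
    let mut meta = HashMap::new();
    loop {
        let mut count = read_long(buf, &mut pos)?;
        if count == 0 {
            break; // a zero-count block terminates the map
        }
        if count < 0 {
            // A negative count is followed by the block's byte size, unused here.
            read_long(buf, &mut pos)?;
            count = count.checked_neg()?;
        }
        for _ in 0..count {
            let key = String::from_utf8(read_bytes(buf, &mut pos)?.to_vec()).ok()?;
            let value = read_bytes(buf, &mut pos)?.to_vec();
            meta.insert(key, value);
        }
    }
    let mut sync = [0u8; 16];
    sync.copy_from_slice(buf.get(pos..pos.checked_add(16)?)?);
    Some((meta, sync))
}
```

Calling read_header on the first bytes of a file and looking up the "avro.schema" key yields the writer's schema as JSON text; "avro.codec" names the block compression codec.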

Reading

CREATE EXTERNAL TABLE events
STORED AS AVRO
LOCATION '/var/data/events/';

Or, from Rust code:

let df = session.read_avro("/var/data/events/").await?;

Writing

df.write_avro("/var/data/events_out/").await?;

The writer supports three compression options: snappy (the default), deflate, and uncompressed.

Schema registry integration

Set the KRISHIV_AVRO_REGISTRY_URL environment variable to the registry's address, for example KRISHIV_AVRO_REGISTRY_URL=http://schema-registry:8081. When reading, the connector looks up the latest schema by name. When writing, it registers the writer's schema under the subject {topic}-value or {topic}-key by default; use the subject option to override the subject name.
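The subject naming rule and the registry lookups can be sketched as follows. This is an illustration only: Part, subject_name, latest_schema_url, schema_by_id_url, and registry_base_url are hypothetical helpers, not the connector's API. The two URL shapes follow the Confluent Schema Registry REST API (/subjects/{subject}/versions/latest and /schemas/ids/{id}).

```rust
use std::env;

/// Which part of a record the schema describes (hypothetical helper type).
#[derive(Clone, Copy)]
enum Part {
    Key,
    Value,
}

/// Default subject is "{topic}-key" or "{topic}-value" unless an override is given.
fn subject_name(topic: &str, part: Part, subject_override: Option<&str>) -> String {
    match (subject_override, part) {
        (Some(subject), _) => subject.to_string(),
        (None, Part::Key) => format!("{}-key", topic),
        (None, Part::Value) => format!("{}-value", topic),
    }
}

/// Confluent REST URL for the latest schema version registered under a subject.
/// The subject is inserted as-is; a real client would percent-encode it.
fn latest_schema_url(base: &str, subject: &str) -> String {
    format!("{}/subjects/{}/versions/latest", base.trim_end_matches('/'), subject)
}

/// Confluent REST URL for a schema looked up by its registry-wide ID.
fn schema_by_id_url(base: &str, id: u32) -> String {
    format!("{}/schemas/ids/{}", base.trim_end_matches('/'), id)
}

/// Read the registry address from the environment variable named in the text.
fn registry_base_url() -> Option<String> {
    env::var("KRISHIV_AVRO_REGISTRY_URL").ok()
}
```

For example, subject_name("orders", Part::Value, None) gives "orders-value", and passing Some("custom") as the override returns "custom" regardless of the part.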

Performance

Avro parsing uses the apache_avro crate. Decoding time is dominated by string and bytes columns. For very wide schemas, enable project_columns in the read options so that columns are projected before deserialization.
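The reason projecting before deserialization helps can be shown with a small std-only sketch of Avro's binary encoding. It assumes a hypothetical record shape {id: long, payload: string, ts: long} and keeps only id and ts: the unwanted string is skipped by reading its length prefix and advancing the cursor, with no allocation or UTF-8 validation. This is not the connector's implementation, and the function names are illustrative.

```rust
/// Decode one zigzag-encoded variable-length long, advancing `pos`.
fn read_long(buf: &[u8], pos: &mut usize) -> Option<i64> {
    let mut value: u64 = 0;
    let mut shift = 0u32;
    loop {
        let byte = *buf.get(*pos)?;
        *pos += 1;
        value |= u64::from(byte & 0x7f) << shift;
        if byte & 0x80 == 0 {
            break;
        }
        shift += 7;
        if shift > 63 {
            return None; // more than 10 bytes cannot be a valid long
        }
    }
    Some(((value >> 1) as i64) ^ -((value & 1) as i64))
}

/// Skip a `string` or `bytes` field: read its length, then jump past the payload.
fn skip_bytes(buf: &[u8], pos: &mut usize) -> Option<()> {
    let len = read_long(buf, pos)?;
    if len < 0 {
        return None;
    }
    let end = (*pos).checked_add(len as usize)?;
    if end > buf.len() {
        return None; // payload would run past the end of the buffer
    }
    *pos = end;
    Some(())
}

/// Decode one {id: long, payload: string, ts: long} record, keeping only id and ts.
fn decode_projected(buf: &[u8], pos: &mut usize) -> Option<(i64, i64)> {
    let id = read_long(buf, pos)?;
    skip_bytes(buf, pos)?; // payload is not projected, so it is never materialized
    let ts = read_long(buf, pos)?;
    Some((id, ts))
}
```

Because Avro is row-oriented, skipped fields still have to be walked over, but the cost of a skipped string is one length decode rather than a copy of its contents.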

Preview: The Avro codec is feature-complete. Schema-registry integration is in the certification suite.

See also