Iceberg Schema Evolution
Add, drop, rename, and reorder columns in an Iceberg table without rewriting data.
Iceberg supports several schema-evolution operations natively — no data rewrite required. Krishiv's planner detects the change, rewrites the SQL accordingly, and the Iceberg connector commits a new metadata file.
Supported operations
| Operation | SQL form | Iceberg behavior |
|---|---|---|
| Add column | ALTER TABLE orders ADD COLUMN region VARCHAR | New metadata file; readers fill the new column with NULL. |
| Drop column | ALTER TABLE orders DROP COLUMN legacy_col | New metadata file; readers return NULL for the dropped column. |
| Rename column | ALTER TABLE orders RENAME COLUMN amt TO amount | Metadata-only change. |
| Reorder columns | ALTER TABLE orders ALTER COLUMN region AFTER customer_id | Metadata-only change. |
| Widen type | ALTER TABLE orders ALTER COLUMN amount TYPE DOUBLE (from FLOAT) | Metadata-only change if the new type is a superset. |
| Promote to required | (SET REQUIRED) | Requires backfill if any existing row is NULL. |
Add a column
ALTER TABLE orders ADD COLUMN region VARCHAR;
-- Existing queries continue to work: new column is NULL.
SELECT * FROM orders LIMIT 1; -- region is NULL
-- Update existing rows (in a maintenance job)
UPDATE orders SET region = 'us-east' WHERE user_id IN (...);
For backfill at scale, write a streaming job that derives the new column and merges it back.
Rename a column
ALTER TABLE orders RENAME COLUMN amt TO amount;
Existing queries that reference amt will break — fix them in the same deploy. Or add a temporary compatibility view:
CREATE OR REPLACE VIEW orders_legacy AS SELECT order_id, user_id, amt AS amount, status, ts FROM orders;
Widen a type
-- FLOAT to DOUBLE is metadata-only.
ALTER TABLE orders ALTER COLUMN amount TYPE DOUBLE;
-- INT to BIGINT is metadata-only too.
ALTER TABLE orders ALTER COLUMN user_id TYPE BIGINT;
-- Narrowing (DOUBLE to FLOAT) is NOT supported by Iceberg and fails.
Branches and tags (zero-copy experiments)
For risky changes, branch first:
-- Create a branch at the current main snapshot
CALL system.create_branch('orders', 'experiment', main_ref);
-- Run your migration on the branch
ALTER TABLE orders_branch ADD COLUMN region VARCHAR;
-- (or do it in Python via the Iceberg REST API)
-- Inspect the diff via time-travel
SELECT * FROM orders FOR SYSTEM_TIME AS OF 'experiment';
-- Promote or drop
CALL system.fast_forward('orders', 'experiment', main_ref);
-- or
CALL system.drop_branch('orders', 'experiment');
Compatibility modes
| Mode | Description |
|---|---|
backward (default) | Old readers can read new data. Add/drop/widen are OK; rename breaks old readers. |
forward | New readers can read old data. Drop/widen are OK; add breaks new readers. |
full | Both. Most restrictive. |
none | No compatibility checks. Use only for development. |