viirya opened a new pull request, #2635:
URL: https://github.com/apache/iceberg-rust/pull/2635
## Which issue does this PR close?
- Closes #2618.
## What changes are included in this PR?
When a column of a nested type — `list`, `map`, or a struct that itself
contains nested children — is added to the table schema after data files were
written, reading those older files fails with `unexpected target column type
List(...)`. The transformer correctly plans a `ColumnSource::Add { value: None,
.. }` for the missing column, but the helpers that materialize the all-NULL
array (`create_primitive_array_repeated` and
`create_primitive_array_single_element` in `arrow/value.rs`) only covered
primitive types plus structs with primitive-only children, each via a
hand-written per-type NULL branch.
This PR replaces all of those NULL branches with a single early return using
arrow's `new_null_array`, which constructs a typed all-NULL array for **every**
Arrow type, including arbitrarily nested ones (the timezone of timestamps and
precision/scale of decimals are part of the `DataType`, so they are preserved).
The `Some(literal)` branches — used for `initial_default` values and partition
constants — are unchanged. Net effect: the two functions shrink by ~180 lines
and the unsupported-type failure mode for NULL filling disappears entirely.
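
The shape of this change can be sketched with a toy model that uses only the standard library. It is not the code in `arrow/value.rs`: `DataType`, `NullColumn`, `null_fill_per_type`, and `null_fill_any` are hypothetical names invented for illustration. In the real change, the single path is arrow's `new_null_array`, which takes a `&DataType` and a length and returns an `ArrayRef`.

```rust
// Toy model only: these types stand in for Arrow's `DataType` and arrays.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Int64,
    Utf8,
    List(Box<DataType>),
    Map(Box<DataType>, Box<DataType>),
    Struct(Vec<(String, DataType)>),
}

/// A column in which every row is NULL, so only type and length matter.
#[derive(Debug, PartialEq)]
struct NullColumn {
    data_type: DataType,
    len: usize,
    null_count: usize,
}

fn is_primitive(dt: &DataType) -> bool {
    matches!(dt, DataType::Int32 | DataType::Int64 | DataType::Utf8)
}

/// Old shape (hypothetical): one hand-written branch per supported type.
/// Anything without a branch, such as a list or map, is rejected.
fn null_fill_per_type(dt: &DataType, len: usize) -> Result<NullColumn, String> {
    let all_null = || NullColumn { data_type: dt.clone(), len, null_count: len };
    match dt {
        DataType::Int32 | DataType::Int64 | DataType::Utf8 => Ok(all_null()),
        // Structs were handled only when every child was primitive.
        DataType::Struct(fields) if fields.iter().all(|(_, t)| is_primitive(t)) => {
            Ok(all_null())
        }
        other => Err(format!("unexpected target column type {:?}", other)),
    }
}

/// New shape (hypothetical): a single path driven by the type itself,
/// so no type can fall through to an error.
fn null_fill_any(dt: &DataType, len: usize) -> NullColumn {
    NullColumn { data_type: dt.clone(), len, null_count: len }
}

fn main() {
    // The three columns added by the evolved schema in the regression test.
    let xs = DataType::List(Box::new(DataType::Int32));
    let props = DataType::Map(Box::new(DataType::Utf8), Box::new(DataType::Int32));
    let s = DataType::Struct(vec![
        ("a".to_string(), DataType::Utf8),
        ("ys".to_string(), DataType::List(Box::new(DataType::Int64))),
    ]);

    let num_rows = 3;
    for dt in [&xs, &props, &s] {
        // The per-type helper has no branch for any of these.
        assert!(null_fill_per_type(dt, num_rows).is_err());
        // The single path keeps the requested type and marks every row NULL.
        let col = null_fill_any(dt, num_rows);
        assert_eq!(&col.data_type, dt);
        assert_eq!(col.len, num_rows);
        assert_eq!(col.null_count, num_rows);
    }
    println!("all nested columns filled with {num_rows} NULLs");
}
```

The point of the sketch is the design choice rather than the details: once NULL filling depends only on the target type and the row count, there is no per-type list to keep in sync with the set of supported types.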
## Are these changes tested?
Yes. The new regression test
`schema_evolution_adds_list_map_and_nested_struct_columns_with_nulls` in
`record_batch_transformer.rs` reads a file batch containing only `id`
against an evolved schema that adds `xs: list<int>`, `props: map<string,
int>`, and `s: struct<a: string, ys: list<long>>`. The struct's `ys` child also
exercises the nested-children path that the old `Struct` branch couldn't
handle. The test asserts that each added column comes back with the evolved
schema's Arrow type and `null_count == num_rows`.
The test fails on main with `unexpected target column type List(Int32, ...)`
— the exact error from the issue — and passes with this change. Full `iceberg`
lib suite (1313 tests) passes; clippy and rustfmt clean.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]