jordepic opened a new issue, #2617:
URL: https://github.com/apache/iceberg-rust/issues/2617
### Apache Iceberg Rust version
None
### Describe the bug
When a struct column (or a nested list or map column) has gained fields
through schema evolution, reading data files written under the older schema
fails with an Arrow cast error such as `Cast error: Casting from Utf8 to
Struct(...)`. The record-batch transformer reconciles a file's nested children
with the table schema by position rather than by Iceberg field id, so once a
nested struct adds a field, the children no longer line up and a mismatched
cast is attempted (e.g. casting a string child into a struct slot). The files
themselves are valid and can be read by Iceberg-Java/Spark.
### To Reproduce
1. Create a table with a column `s struct<a: string>` (plus other columns).
2. Write a data file.
3. Evolve the schema to `s struct<a: string, b: long>` (add a nested field),
   keeping field ids stable.
4. Read the table (the older file still has `s` with only `a`).
5. The scan errors with `Cast error: Casting from Utf8 to Struct(...)`.
### Expected behavior
Nested struct/list/map children are reconciled with the table schema by field
id (recursively), and fields present in the table schema but absent from the
file are materialized as typed NULLs, matching Iceberg's
column-projection-by-id semantics. The read should succeed.
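
To illustrate the expected behavior, here is a minimal, std-only sketch of recursive reconciliation by field id. It is not the iceberg-rust record-batch transformer and uses no Arrow types: `Ty`, `Field`, `Value`, `field`, `reconcile` and `example` are all hypothetical names invented for this illustration. Only struct nesting is modelled. List/map children and type promotion are left out, and `Value::Null` stands in for what would be a typed null array in Arrow.

```rust
use std::collections::HashMap;

/// Simplified stand-in for an Iceberg type (hypothetical, not the iceberg-rust `Type`).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Str,
    Long,
    Struct(Vec<Field>),
}

/// A schema field carrying its Iceberg field id.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    id: i32,
    name: String,
    ty: Ty,
}

/// Simplified stand-in for a single row value (a real reader works on Arrow arrays).
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Str(String),
    Long(i64),
    Struct(Vec<Value>),
}

fn field(id: i32, name: &str, ty: Ty) -> Field {
    Field { id, name: name.to_string(), ty }
}

/// Reorders `file_values` (laid out like `file_fields`) into the layout of
/// `table_fields`, matching children by field id and recursing into structs.
/// Table fields that are missing from the file become `Value::Null`.
fn reconcile(file_fields: &[Field], file_values: &[Value], table_fields: &[Field]) -> Vec<Value> {
    // Index the file's children by field id instead of relying on position.
    let by_id: HashMap<i32, usize> = file_fields
        .iter()
        .enumerate()
        .map(|(i, f)| (f.id, i))
        .collect();

    table_fields
        .iter()
        .map(|target| match by_id.get(&target.id) {
            // Field was added after the file was written: fill with NULL.
            None => Value::Null,
            Some(&i) => match (&file_fields[i].ty, &file_values[i], &target.ty) {
                // Struct on both sides: reconcile the nested children by id too.
                (Ty::Struct(src), Value::Struct(vals), Ty::Struct(dst)) => {
                    Value::Struct(reconcile(src, vals, dst))
                }
                // Leaf value: pass through unchanged (no type promotion in this sketch).
                (_, value, _) => value.clone(),
            },
        })
        .collect()
}

/// The scenario from the reproduction steps: the file has `s struct<a>`,
/// the table has `s struct<a, b>`, with stable field ids.
fn example() -> Vec<Value> {
    let file_schema = vec![field(1, "s", Ty::Struct(vec![field(2, "a", Ty::Str)]))];
    let table_schema = vec![field(
        1,
        "s",
        Ty::Struct(vec![field(2, "a", Ty::Str), field(3, "b", Ty::Long)]),
    )];
    let row = vec![Value::Struct(vec![Value::Str("x".to_string())])];
    reconcile(&file_schema, &row, &table_schema)
}
```

In this sketch, `example()` yields the old row with `b` filled in as NULL. Because matching is keyed on the id rather than on the child's index, the same function also handles a table schema whose children appear in a different order than in the file.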
### Willingness to contribute
I can contribute a fix for this bug independently
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]