IgorBerman opened a new pull request, #14744:
URL: https://github.com/apache/iceberg/pull/14744

   When pruning nested structures (lists, maps, structs), the PruneColumns 
visitor was incorrectly returning the original unpruned field when the 
container's field ID was in the selectedIds set, even when child fields had 
been pruned.
   
   This fix ensures that:
   1. In struct(): When a field is selected and has been pruned (field != 
originalField), use the pruned version instead of the original.
   2. In list(): Check for a pruned element before checking whether elementId is 
selected, so that nested pruning is applied.
   3. In map(): Similarly check for pruned value before checking selected 
keys/values.
   4. Add validatePrunedField() to verify pruned fields maintain compatibility 
with original fields (same name, ID, and repetition).
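
The compatibility check in item 4 could look roughly like the sketch below. It is illustrative only: the `Field` and `Repetition` types are hypothetical stand-ins for Parquet schema fields, and the signature and exception type are assumptions, not the code in this PR.

```java
import java.util.Objects;

public class PrunedFieldCheck {
  enum Repetition { REQUIRED, OPTIONAL, REPEATED }

  /** Hypothetical stand-in for a Parquet field; only the compared attributes are modeled. */
  static final class Field {
    final String name;
    final Integer id; // null when the field carries no ID
    final Repetition repetition;

    Field(String name, Integer id, Repetition repetition) {
      this.name = name;
      this.id = id;
      this.repetition = repetition;
    }
  }

  /** Throws if the pruned field differs from the original in name, ID, or repetition. */
  static void validatePrunedField(Field original, Field pruned) {
    if (!original.name.equals(pruned.name)) {
      throw new IllegalStateException("Name changed: " + original.name + " -> " + pruned.name);
    }
    if (!Objects.equals(original.id, pruned.id)) {
      throw new IllegalStateException("ID changed for field " + original.name);
    }
    if (original.repetition != pruned.repetition) {
      throw new IllegalStateException("Repetition changed for field " + original.name);
    }
  }

  /** Convenience wrapper that reports the result as a boolean instead of throwing. */
  static boolean isCompatible(Field original, Field pruned) {
    try {
      validatePrunedField(original, pruned);
      return true;
    } catch (IllegalStateException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    Field original = new Field("nested_list", 4, Repetition.OPTIONAL);
    Field pruned = new Field("nested_list", 4, Repetition.OPTIONAL);
    Field wrongId = new Field("nested_list", 5, Repetition.OPTIONAL);
    System.out.println(isCompatible(original, pruned));  // true
    System.out.println(isCompatible(original, wrongId)); // false
  }
}
```

The point of such a check is that pruning should only ever remove children; if the container's own name, ID, or repetition differs, something other than pruning has happened.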
   
   This enables proper column pruning for deeply nested schemas like:
   list<struct<field1, nested_list: list<struct<a, b, c, d>>>>
   
   When projecting only field1, nested_list[].a, and nested_list[].b, the fix 
ensures that fields c and d are pruned from the Parquet projection schema.
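
The ordering issue can be reproduced on this example with a small toy model. The sketch below is not the PruneColumns visitor: the `Node` class, the field IDs, and the `prunedFirst` flag are all hypothetical, and structs and lists share one code path for brevity. It assumes the selected ID set contains the container IDs as well as the leaves field1, a, and b.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;

public class PruneSketch {
  /** Toy schema node: "leaf", "struct" (named children), or "list" (one element child). */
  static final class Node {
    final int id;
    final String name;
    final String kind;
    final List<Node> children;

    Node(int id, String name, String kind, Node... children) {
      this.id = id;
      this.name = name;
      this.kind = kind;
      this.children = Arrays.asList(children);
    }
  }

  static Node withChildren(Node n, List<Node> kept) {
    return new Node(n.id, n.name, n.kind, kept.toArray(new Node[0]));
  }

  /** Returns the pruned node, or null if nothing under it is selected. */
  static Node prune(Node node, Set<Integer> selected, boolean prunedFirst) {
    if (node.kind.equals("leaf")) {
      return selected.contains(node.id) ? node : null;
    }
    // Old ordering in this toy model: a selected container returns the original unpruned.
    if (!prunedFirst && selected.contains(node.id)) {
      return node;
    }
    List<Node> kept = new ArrayList<>();
    for (Node child : node.children) {
      Node pruned = prune(child, selected, prunedFirst);
      if (pruned != null) {
        kept.add(pruned);
      }
    }
    // New ordering: a pruned version wins; the original is used only if no child was selected.
    if (!kept.isEmpty()) {
      return withChildren(node, kept);
    }
    return selected.contains(node.id) ? node : null;
  }

  static String typeString(Node n) {
    if (n.kind.equals("leaf")) {
      return n.name;
    }
    List<String> parts = new ArrayList<>();
    for (Node c : n.children) {
      boolean bare = n.kind.equals("list") || c.kind.equals("leaf");
      parts.add(bare ? typeString(c) : c.name + ": " + typeString(c));
    }
    return n.kind + "<" + String.join(", ", parts) + ">";
  }

  /** list<struct<field1, nested_list: list<struct<a, b, c, d>>>> with made-up IDs 1..9. */
  static Node example() {
    Node inner = new Node(5, "element", "struct",
        new Node(6, "a", "leaf"), new Node(7, "b", "leaf"),
        new Node(8, "c", "leaf"), new Node(9, "d", "leaf"));
    Node outer = new Node(2, "element", "struct",
        new Node(3, "field1", "leaf"),
        new Node(4, "nested_list", "list", inner));
    return new Node(1, "root", "list", outer);
  }

  public static void main(String[] args) {
    Set<Integer> selected = Set.of(1, 2, 3, 4, 5, 6, 7); // containers plus field1, a, b
    // prints list<struct<field1, nested_list: list<struct<a, b, c, d>>>>
    System.out.println(typeString(prune(example(), selected, false)));
    // prints list<struct<field1, nested_list: list<struct<a, b>>>>
    System.out.println(typeString(prune(example(), selected, true)));
  }
}
```

With the container check first, the whole original type comes back because the outer list's ID is selected. With the pruned result checked first, c and d drop out while the container IDs and names are preserved.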


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

