Re: [PR] Parquet: Fix column pruning for deeply nested fields [iceberg]

via GitHub Fri, 05 Dec 2025 13:07:15 -0800


sriharshaj commented on PR #12634:
URL: https://github.com/apache/iceberg/pull/12634#issuecomment-3618604815


   > @sriharshaj Hi, may I ask a question for my own understanding? Spark itself 
has a large code base for column pruning, and this PR changes Iceberg-internal 
functionality in the same area. What principle defines the separation between 
the two (Spark vs. Iceberg)? How did you decide that column pruning needs to be 
improved in Iceberg itself rather than in Spark? 
PS: I'm working on supporting pruning of nested fields inside arrays of 
structs, and personally I'm focusing on the Spark codebase.
   
   Hey @IgorBerman,
   My understanding is that, based on the Spark query or DataFrame logic, 
Iceberg decides which columns need to be read from the Parquet data. So I 
believe the change should happen at the Iceberg level. What do you think?
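
To make the division of labor concrete, here is a minimal toy sketch of the idea, not Iceberg's or Spark's actual code. It assumes the engine hands over a set of requested dotted field paths and the table format prunes its own nested schema to match. The `prune` function, the dict-based schema, and the field names are all hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Set

# Hypothetical toy schema: a leaf maps to a type name, a struct maps to a nested dict.
Schema = Dict[str, Any]


def prune(schema: Schema, selected: Set[str], prefix: str = "") -> Schema:
    """Keep only the fields whose dotted path was requested.

    Selecting a struct keeps its whole subtree; selecting a nested leaf
    keeps only the chain of parent structs needed to reach it.
    """
    pruned: Schema = {}
    for name, node in schema.items():
        path = f"{prefix}{name}"
        if path in selected:
            pruned[name] = node  # the whole field (or subtree) was requested
        elif isinstance(node, dict):
            child = prune(node, selected, path + ".")
            if child:  # keep the parent struct only if something below it survived
                pruned[name] = child
    return pruned


table_schema = {
    "id": "long",
    "location": {
        "city": "string",
        "geo": {"lat": "double", "lon": "double"},
    },
}

# The engine asks for two columns; only the matching leaves survive pruning.
projected = prune(table_schema, {"id", "location.geo.lat"})
print(projected)  # {'id': 'long', 'location': {'geo': {'lat': 'double'}}}
```

In this sketch the caller only names the fields it needs, while the decision about which physical columns to read is made by the code that owns the file schema.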

