sriharshaj commented on PR #12634:
URL: https://github.com/apache/iceberg/pull/12634#issuecomment-3618604815
> @sriharshaj Hi, if I may ask a question of understanding: I know that there is a large code base in Spark itself for column pruning, and now you are changing functionality internal to Iceberg that is related to the same area. What principle defines the separation between the two (Spark vs. Iceberg)? How did you decide that you need to improve column pruning in Iceberg itself and not in Spark (e.g.)?
>
> PS: I'm trying to work on supporting pruning of nested fields inside arrays of structs, and personally I'm focusing on the Spark codebase.

Hey @IgorBerman,

My understanding is that Iceberg decides which columns need to be read from the Parquet data, based on the Spark query or DataFrame logic. So I believe the change should happen at the Iceberg level. WDYT?
