shardulm94 opened a new pull request, #6327:
URL: https://github.com/apache/iceberg/pull/6327

   Closes #4604
   
   This is an alternative, and arguably simpler, implementation to #4599. The
   issue is that the ORC read path does not support projecting nested structs
   in which only partition columns are selected. As part of the ORC read path,
   we drop constant fields and metadata columns from the projected schema
   before passing it to the ORC file reader. For example:
   ```
   Schema readSchemaWithoutConstantAndMetadataFields =
       TypeUtil.selectNot(
           readSchema, Sets.union(idToConstant.keySet(), MetadataColumns.metadataFieldIds()));
   ```
   This step also drops structs that contain only partition columns, because
   they become empty once those columns are removed.
   
   #4599 tries to fix this by not dropping nested structs that contain
   partition columns, and therefore reads the partition values from the file.
   This PR takes a different approach: it preserves empty structs when dropping
   constant fields. This allows the existing constant handling in the ORC read
   path to work as expected, even for nested partition fields.
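
The difference between the two pruning behaviors can be sketched without any Iceberg classes. The following is a minimal, self-contained illustration and not the code in this PR: `Field`, `selectNot`, the `keepEmptyStructs` flag, and the sample schema are all hypothetical stand-ins chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class EmptyStructSketch {
  /** Hypothetical stand-in for a schema field; a null children list marks a primitive. */
  static final class Field {
    final int id;
    final String name;
    final List<Field> children;

    Field(int id, String name, List<Field> children) {
      this.id = id;
      this.name = name;
      this.children = children;
    }

    boolean isStruct() {
      return children != null;
    }
  }

  static Field primitive(int id, String name) {
    return new Field(id, name, null);
  }

  static Field struct(int id, String name, Field... children) {
    return new Field(id, name, Arrays.asList(children));
  }

  /** Returns a pruned copy of the field, or null if the field itself is dropped. */
  static Field selectNot(Field field, Set<Integer> excluded, boolean keepEmptyStructs) {
    if (excluded.contains(field.id)) {
      return null;
    }
    if (!field.isStruct()) {
      return field;
    }
    List<Field> kept = new ArrayList<>();
    for (Field child : field.children) {
      Field pruned = selectNot(child, excluded, keepEmptyStructs);
      if (pruned != null) {
        kept.add(pruned);
      }
    }
    // A struct that lost all of its children is dropped unless empty structs are preserved.
    if (kept.isEmpty() && !field.children.isEmpty() && !keepEmptyStructs) {
      return null;
    }
    return new Field(field.id, field.name, kept);
  }

  /** Renders a field as name{child,child,...} for easy comparison. */
  static String describe(Field field) {
    if (field == null) {
      return "<dropped>";
    }
    if (!field.isStruct()) {
      return field.name;
    }
    StringBuilder sb = new StringBuilder(field.name).append("{");
    for (int i = 0; i < field.children.size(); i++) {
      if (i > 0) {
        sb.append(",");
      }
      sb.append(describe(field.children.get(i)));
    }
    return sb.append("}").toString();
  }

  public static void main(String[] args) {
    // Assumed sample schema: "region" (id 3) plays the role of a nested partition column.
    Field root =
        struct(0, "root", primitive(1, "id"), struct(2, "location", primitive(3, "region")));
    Set<Integer> constants = new HashSet<>(Collections.singletonList(3));

    System.out.println(describe(selectNot(root, constants, false))); // prints root{id}
    System.out.println(describe(selectNot(root, constants, true))); // prints root{id,location{}}
  }
}
```

With the flag off, `location` disappears from the pruned schema, so nothing downstream can attach the constant `region` value to it. With the flag on, the empty `location` struct survives and can be populated from constants afterwards.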

