[I] [feature] Add all column projection logic [iceberg-python]

via GitHub Mon, 10 Feb 2025 09:32:53 -0800


kevinjqliu opened a new issue, #1636:
URL: https://github.com/apache/iceberg-python/issues/1636


   ### Feature Request / Improvement
   
   This issue tracks the implementation of all column projection rules.
   
   
   From the [spec](https://iceberg.apache.org/spec/#column-projection):
   Columns in Iceberg data files are selected by field id. The table schema's
   column names and order may change after a data file is written, and projection
   must be done using field ids.
   Values for field ids which are not present in a data file must be resolved
   according to the following rules:
   - [x] Return the value from partition metadata if an [Identity 
Transform](https://iceberg.apache.org/spec/#partition-transforms) exists for 
the field and the partition value is present in the partition struct on 
data_file object in the manifest. This allows for metadata only migrations of 
Hive tables. (Resolved by #1401)
   - [ ] Use schema.name-mapping.default metadata to map field id to columns 
without field id as described below and use the column if it is present.
   - [ ] Return the default value if it has a defined initial-default (See 
[Default values](https://iceberg.apache.org/spec/#default-values) section for 
more details).
   - [ ] Return null in all other cases.
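
The resolution order above can be sketched in plain Python. This is only an illustration of the fallback chain, not PyIceberg code: `Field`, `DataFile`, `resolve`, and the dictionary layouts are hypothetical names made up for this example, each "column" is reduced to a single value, and the identity-partition values are assumed to be keyed by source field id already.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Field:
    """Hypothetical stand-in for a table schema field."""
    field_id: int
    name: str
    initial_default: Optional[Any] = None


@dataclass
class DataFile:
    """Hypothetical, simplified view of one data file."""
    columns_by_id: Dict[int, Any]    # columns written with field ids
    columns_by_name: Dict[str, Any]  # columns written without field ids
    partition: Dict[int, Any]        # identity-partition values by source field id


def resolve(field: Field, data_file: DataFile,
            name_mapping: Dict[int, List[str]]) -> Any:
    """Resolve a projected field by walking the fallback rules in order."""
    # The field id is present in the file: read it directly.
    if field.field_id in data_file.columns_by_id:
        return data_file.columns_by_id[field.field_id]
    # Rule 1: identity-transform partition value from the manifest metadata.
    if field.field_id in data_file.partition:
        return data_file.partition[field.field_id]
    # Rule 2: name mapping (assumed shape: field id -> candidate column names).
    for name in name_mapping.get(field.field_id, []):
        if name in data_file.columns_by_name:
            return data_file.columns_by_name[name]
    # Rule 3: the field's initial-default, if one is defined.
    if field.initial_default is not None:
        return field.initial_default
    # Rule 4: null in all other cases.
    return None


data_file = DataFile(
    columns_by_id={1: "a"},
    columns_by_name={"legacy_col": "b"},
    partition={2: "2025-02-10"},
)
mapping = {3: ["legacy_col"]}
for f in (Field(1, "id"), Field(2, "dt"), Field(3, "renamed"),
          Field(4, "count", initial_default=0), Field(5, "missing")):
    print(f.name, resolve(f, data_file, mapping))
```

Each rule is tried only when the earlier ones yield nothing, which is why the checks are written as early returns in spec order.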



