syun64 commented on PR #506: URL: https://github.com/apache/iceberg-python/pull/506#issuecomment-1991884158
Updates from offline discussions:

1. Creating the correct Iceberg Table Schema with the desired Partition Spec from an external table (like Hive) is out of scope of this PR. Atomically creating a table and adding files will be supported through the combination of this PR and [CreateTableTransaction (WIP)](https://github.com/apache/iceberg-python/pull/498).
2. We will replace file_path-based partition inference with partition inference based on the Parquet metadata footer. Currently we only support IdentityPartitions, and we can infer the partition values from the statistics in the metadata footer (the upper and lower bounds should be equal). This will also allow us to extend partition inference to numeric Transforms (YearTransform, etc.) by applying the transforms to the lower and upper bounds.
3. Overwrites are acknowledged as a valid mode of adding files. This is out of scope of this PR, and it can be supported atomically by deleting Expression values + adding files, all within the same transaction block.
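
To illustrate point 2, here is a minimal sketch of bounds-based partition inference. It is not the code in this PR: `infer_partition_value`, `identity`, and `year_transform` are hypothetical names, and the bounds are passed in as plain Python values rather than read from a Parquet footer. The idea is that for an order-preserving transform, if the transformed lower and upper bounds of a column are equal, every value in the file maps to that same partition value. The year calculation assumes the "years from 1970" definition in the Iceberg spec. Null handling is left out.

```python
from datetime import date
from typing import Any, Callable


def identity(value: Any) -> Any:
    """Hypothetical stand-in for an identity partition transform."""
    return value


def year_transform(value: date) -> int:
    """Hypothetical stand-in for a year transform: years since 1970."""
    return value.year - 1970


def infer_partition_value(
    lower: Any,
    upper: Any,
    transform: Callable[[Any], Any] = identity,
) -> Any:
    """Infer a file's partition value from a column's lower/upper bounds.

    Assumes `transform` is order-preserving, so equal transformed bounds
    imply that every value in between maps to the same partition value.
    """
    low, high = transform(lower), transform(upper)
    if low != high:
        raise ValueError(
            f"File spans more than one partition value: {low!r} != {high!r}"
        )
    return low


# Identity partition: the raw bounds must be equal.
region = infer_partition_value("us-east", "us-east")

# Year partition: the bounds may differ as long as they fall in the same year.
year = infer_partition_value(date(2024, 1, 1), date(2024, 12, 31), year_transform)
```

A file whose bounds straddle two partition values (for example 2023-12-31 and 2024-01-01 under the year transform) raises `ValueError` in this sketch, since it cannot be assigned to a single partition.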