JonasJ-ap commented on PR #6449: URL: https://github.com/apache/iceberg/pull/6449#issuecomment-1374229027
> Regarding Delta name mapping that @findepi mentioned, looking at the spec,
>
> ```
> Write data files by using the physical name that is chosen for each column. The physical name of the column is static and can be different than the display name of the column, which is changeable.
>
> Write the 32 bit integer column identifier as part of the field_id field of the SchemaElement struct in the [Parquet Thrift specification](https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift).
>
> Track partition values and column level statistics with the physical name of the column in the transaction log.
> ```
>
> Because the column name has changed in the underlying parquet file, migrating that requires not only Iceberg name mapping configuration, but also converting the statistics retrieved from Parquet files.
>
> Sounds like something that can be added as the next step after this PR is merged.

Also, according to the [Delta Lake roadmap](https://github.com/delta-io/delta/issues/1307), `delta-standalone` currently does not support columnMapping or other features that require higher protocol versions. Maybe we can start adding support for these features once the new version of `delta-standalone` is published in the next few months.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
