danielcweeks commented on code in PR #14234: URL: https://github.com/apache/iceberg/pull/14234#discussion_r3237523722
##########
format/spec.md:
##########
@@ -675,7 +676,7 @@ The `data_file` struct consists of the following fields:
 The `partition` struct stores the tuple of partition values for each file. Its type is derived from the partition fields of the partition spec used to write the manifest file. In v2, the partition struct's field ids must match the ids from the partition spec.
 
-The column metrics maps are used when filtering to select both data and delete files. For delete files, the metrics must store bounds and counts for all deleted rows, or must be omitted. Storing metrics for deleted rows ensures that the values can be used during job planning to find delete files that must be merged during a scan.
+The v4 `content_stats` struct stores field-level metrics. Unlike the metrics maps, the type of `content_stats` is based on table metadata, like schema. Similar to the `partition` struct, the same type is used for all files tracked in a manifest.

Review Comment:
   ```suggestion
   The v4 `content_stats` struct stores field-level metrics. Unlike the metrics maps, the type of values stored in the `content_stats` struct is based on the field type in the table schema. Similar to the `partition` struct, the representation is consistent for all files tracked in a manifest.
   ```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
