advancedxy commented on code in PR #8502: URL: https://github.com/apache/iceberg/pull/8502#discussion_r1650705615
##########
core/src/main/java/org/apache/iceberg/TableMetadataParser.java:
##########
@@ -481,6 +488,13 @@ public static TableMetadata fromJson(String metadataLocation, JsonNode node) {
       statisticsFiles = ImmutableList.of();
     }
 
+    List<PartitionStatisticsFile> partitionStatisticsFiles;
+    if (node.has(PARTITION_STATISTICS)) {
+      partitionStatisticsFiles = partitionStatsFilesFromJson(node.get(PARTITION_STATISTICS));
+    } else {
+      partitionStatisticsFiles = ImmutableList.of();
+    }
+

Review Comment:
Hi @ajantha-bhat and @aokolnychyi, I have a question about this implementation, as I'm exploring adding new fields to TableMetadata.

Suppose the partition stats of table `db.table` are updated by the new version of Iceberg via UpdatePartitionStatistics. After that, an old version of the Iceberg library, or the PyIceberg client, produces a new commit to this table. As I understand it, that writer will produce TableMetadata without `PARTITION_STATISTICS`, since it knows nothing about that field, which effectively loses the info for the table.

Do you have any solutions or ideas on how to prevent such cases? I can think of a few potential approaches:

1. Upgrade the format_version whenever we need to add new fields to table metadata; all old clients will then be rejected by the version check.
2. Define a writer_version field: old clients can still read metadata produced by new clients, but writes from clients with an older writer version are rejected.
3. Move the check to the REST catalog service?

A format upgrade feels too heavy when we are only adding new fields to TableMetadata. Do you have any other ideas? I'd really appreciate your input.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
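To make the concern and idea 2 concrete, here is a minimal sketch that models table metadata as a plain map instead of using the real TableMetadataParser. Everything in it is an assumption for illustration: the class `WriterVersionSketch`, the `min-writer-version` field name, and the set of fields the "old" client knows are all hypothetical and not part of the Iceberg spec or API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names exist in Iceberg.
class WriterVersionSketch {
  // Fields an assumed "old" client knows how to serialize.
  static final Set<String> OLD_CLIENT_FIELDS = Set.of("format-version", "location", "statistics");

  // Hypothetical field name for idea 2; not part of the table spec.
  static final String MIN_WRITER_VERSION = "min-writer-version";

  // Mimics an old writer: it re-serializes only the fields it knows about,
  // so any newer field (e.g. partition-statistics) is silently dropped.
  static Map<String, Object> rewriteAsOldClient(Map<String, Object> metadata) {
    Map<String, Object> out = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : metadata.entrySet()) {
      if (OLD_CLIENT_FIELDS.contains(entry.getKey())) {
        out.put(entry.getKey(), entry.getValue());
      }
    }
    return out;
  }

  // Idea 2: reading is always allowed, but a commit is rejected when the
  // client's writer version is lower than what the metadata requires.
  static void checkCanWrite(Map<String, Object> metadata, int clientWriterVersion) {
    int required = (Integer) metadata.getOrDefault(MIN_WRITER_VERSION, 0);
    if (clientWriterVersion < required) {
      throw new IllegalStateException(
          "Writer version " + clientWriterVersion + " is older than required " + required);
    }
  }

  public static void main(String[] args) {
    Map<String, Object> metadata = new LinkedHashMap<>();
    metadata.put("format-version", 2);
    metadata.put("location", "s3://bucket/db/table");
    metadata.put("partition-statistics", List.of("stats-1.parquet"));
    metadata.put(MIN_WRITER_VERSION, 2);

    // Without any check, the old client's rewrite loses partition-statistics.
    System.out.println(rewriteAsOldClient(metadata).keySet());

    // With the check, a writer at version 1 is rejected before committing.
    try {
      checkCanWrite(metadata, 1);
    } catch (IllegalStateException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

One limitation of this sketch: a check like `checkCanWrite` only helps for clients released after such a field exists. Clients that predate it would never run the check, which is one reason a catalog-side check (idea 3) might still be needed.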
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org