rdblue commented on code in PR #9125: URL: https://github.com/apache/iceberg/pull/9125#discussion_r1402521562
##########
format/spec.md:
##########
@@ -305,6 +305,10 @@ The source column, selected by id, must be a primitive type and cannot be contai

 Partition specs capture the transform from table data to partition values. This is used to transform predicates to partition predicates, in addition to transforming data values. Deriving partition predicates from column predicates on the table data is used to separate the logical queries from physical storage: the partitioning can change and the correct partition filters are always derived from column predicates. This simplifies queries because users don’t have to supply both logical predicates and partition predicates. For more information, see Scan Planning below.

+Two partition specs are considered compatible with each other if they have the same number of fields
+and for each corresponding field, the fields have the same source column ID, transform definition
+and partition name. Writers must not create a new parition spec if there already exists a compatible partition

Review Comment:
   I think we need to make sure that this is a bit more limited. For the purpose of deduplication, we want this "compatible" definition. But if we have two specs with different partition-field-id values, they are NOT compatible. We only want this to mean that if you can use an existing spec (instead of assigning new IDs), you must do so.

##########
format/spec.md:
##########
@@ -305,6 +305,10 @@ The source column, selected by id, must be a primitive type and cannot be contai

 Partition specs capture the transform from table data to partition values. This is used to transform predicates to partition predicates, in addition to transforming data values. Deriving partition predicates from column predicates on the table data is used to separate the logical queries from physical storage: the partitioning can change and the correct partition filters are always derived from column predicates. This simplifies queries because users don’t have to supply both logical predicates and partition predicates. For more information, see Scan Planning below.

+Two partition specs are considered compatible with each other if they have the same number of fields

Review Comment:
   How about "equivalent" instead of "compatible"?
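
To make the distinction in the first comment concrete, here is a minimal Python sketch of one possible reading of it. It is not Iceberg's implementation: `PartitionField`, `PartitionSpec`, `field_matches`, and `find_reusable_spec` are hypothetical names, and treating an unassigned `partition-field-id` as `None` is an assumption made only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class PartitionField:
    source_id: int            # ID of the source column
    field_id: Optional[int]   # partition-field-id; None means "not assigned yet" (assumption)
    transform: str            # transform definition, e.g. "identity" or "bucket[16]"
    name: str                 # partition name


@dataclass(frozen=True)
class PartitionSpec:
    spec_id: int
    fields: Tuple[PartitionField, ...]


def field_matches(existing: PartitionField, wanted: PartitionField) -> bool:
    """Same source column ID, transform and name; field IDs must agree if both are set."""
    same_definition = (
        existing.source_id == wanted.source_id
        and existing.transform == wanted.transform
        and existing.name == wanted.name
    )
    # Hypothetical reading of the review comment: differing field IDs rule out a match.
    ids_agree = wanted.field_id is None or wanted.field_id == existing.field_id
    return same_definition and ids_agree


def find_reusable_spec(
    existing_specs: Sequence[PartitionSpec],
    wanted_fields: Sequence[PartitionField],
) -> Optional[PartitionSpec]:
    """Return an existing spec that can be reused instead of assigning new IDs, if any."""
    for spec in existing_specs:
        if len(spec.fields) == len(wanted_fields) and all(
            field_matches(e, w) for e, w in zip(spec.fields, wanted_fields)
        ):
            return spec
    return None
```

Under this reading, a writer would call `find_reusable_spec` before creating a spec and only assign new IDs when it returns `None`; two specs that differ only in their field IDs are never treated as the same spec.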