c-thiel opened a new pull request, #645: URL: https://github.com/apache/iceberg-rust/pull/645
Fixes https://github.com/apache/iceberg-rust/issues/550 This PR is a result of the issue mentioned above, a [Slack Discussion]( https://apache-iceberg.slack.com/archives/C03LG1D563F/p1725878870301909) and the preparation for the new MetadataBuilder https://github.com/apache/iceberg-rust/pull/587. Lets first establish, that `PartitionSpec` should have a bound schema because: * We need to bind the schema to to `PartitonSpec` to allow internal partition id handling (see @rdblue comment in slack). * It allows for consistent handling of schema related operations, such as determining the partition type without returning a `Result`. * A PartitionSpec is only valid for the schema is had been build against, and we should not imply otherwise. If we agree on this, then we have a small Problem with `TableMetadata`: It contains historic PartitionSpecs that cannot be bound against the `current_schema`. As a result, we need a type that has the same properties as `PartitionSpec` but is not bound to a schema. I thought we could use `UnboundPartitionSpec` for this at first, but it serves a different purpose and would make the very important `field_id` Optional. To solve this, I introduce a `SchemalessPartitionSpec`. Common functions between `PartitionSpec` and `SchemalessPartitionSpec` are maid available via a common trait. While the trait is public and as such allows to develop functions that take either variant as an input, I am exposing the most important attributes (i.e. `fields`) still directly, so that the trait typically doesn't need to be imported. For reference, Java solves this by force-binding potentially non-compatible schemas to a `PartitionSpec` * `TableMetadataBuilder`: https://github.com/apache/iceberg/blob/5a2c1c9c8dd8d61e08bbad713c24e6b726eee323/core/src/main/java/org/apache/iceberg/TableMetadata.java#L734 * `TableMetadataParser`: https://github.com/apache/iceberg/blob/5a2c1c9c8dd8d61e08bbad713c24e6b726eee323/core/src/main/java/org/apache/iceberg/TableMetadataParser.java#L393 Happy to hear your opinions! This really is a bit of a fiddly topic. I don't like to introduce a new type, on the other hand using the API feels more natural to me now. This can also be seen in the tests, where we had a significant number of places where we passed `PartitionSpec` alongside its `Schema` around, just to call `partition_type` eventually. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org