liurenjie1024 commented on code in PR #115: URL: https://github.com/apache/iceberg-rust/pull/115#discussion_r1421922921
########## crates/iceberg/src/spec/sort.rs: ########## @@ -88,15 +91,106 @@ impl SortOrder { pub fn is_unsorted(&self) -> bool { self.fields.is_empty() } + + /// Converts to an unbound sort order + pub fn to_unbound(&self) -> UnboundSortOrder { + UnboundSortOrder::builder() + .with_order_id(self.order_id) + .with_fields( + self.fields + .iter() + .map(|x| UnboundSortField { + source_id: x.source_id, + transform: x.transform, + direction: x.direction, + null_order: x.null_order, + }) + .collect_vec(), + ) + .build() + .unwrap() + } +} + +/// Reference to [`UnboundSortOrder`]. +pub type UnboundSortOrderRef = Arc<UnboundSortOrder>; + +/// Entry for every column that is to be sorted +#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone, TypedBuilder)] +#[serde(rename_all = "kebab-case")] +pub struct UnboundSortField { + /// A source column id from the table’s schema + pub source_id: i32, + /// A transform that is used to produce values to be sorted on from the source column. + pub transform: Transform, + /// A sort direction, that can only be either asc or desc + pub direction: SortDirection, + /// A null order that describes the order of null values when sorted. + pub null_order: NullOrder, +} + +/// Unbound sort order can be later bound to a schema. +#[derive(Debug, Serialize, Deserialize, PartialEq, Eq, Clone, Builder, Default)] +#[serde(rename_all = "kebab-case")] +#[builder(setter(prefix = "with"))] +#[builder(build_fn(skip))] +pub struct UnboundSortOrder { + /// Identifier for the SortOrder, order_id `0` is no sort order. + pub order_id: i64, + /// Details of the sort + #[builder(setter(each(name = "with_sort_field")))] + pub fields: Vec<UnboundSortField>, +} Review Comment: After checking java's implementation, I'm also wondering the necessity of a separate struct for `UnboundSortOrder`. We have `UnboundPartitionSpec` is for type safety, e.g. the `field_id` in `PartitionField` is optional in unbound state while required in bounded state. This is not the case for sort order. For sort order, the key difference of bound and unbound is the build process: it requires [compatibility check](https://github.com/apache/iceberg/blob/6504fd809199cbc88ef890cc2f5b42a2cf00e102/api/src/main/java/org/apache/iceberg/SortOrder.java#L263) for bounded, while not required for unbounded. So, I think a more appropriate way would be following: 1. We have different builder methods for bounded and unbounded. What do you think? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org