Fokko commented on code in PR #79: URL: https://github.com/apache/iceberg-rust/pull/79#discussion_r1423526180
##########
crates/iceberg/src/spec/manifest_list.rs:
##########
@@ -30,6 +30,9 @@ use self::{
 use super::{FormatVersion, StructType};
+/// The seq number when no added files are present.
+pub const UNASSIGNED_SEQ_NUMBER: i64 = -1;

Review Comment:
   In PyIceberg we set it to `0` when it is not set (v2) or unknown (v1). It is used to prune delete files that are not relevant to the data being read. In PyIceberg we first do the normal query planning by applying the partition filtering and the metrics. We then end up with a list of files for which we compute the minimal data file sequence number: https://github.com/apache/iceberg-python/blob/8c8abb5c4c258e32941110a9ce0938e1328290b3/pyiceberg/table/__init__.py#L1028-L1037

   There is an obvious fallback to `INITIAL_SEQUENCE_NUMBER`, which is `0`. If this happens, we know that we can't use this number to prune, and all the delete files that are present will be included (because the sequence number there is also greater than or equal to zero). I would suggest setting this to zero.
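
   To make the pruning argument concrete, here is a minimal Rust sketch of the logic described above. It is not the PyIceberg code and not an existing iceberg-rust API: the `Manifest` and `Content` types, the field names, and both functions are hypothetical, simplified stand-ins, and `INITIAL_SEQUENCE_NUMBER` is assumed to be `0` as suggested in the comment.

```rust
/// Assumed fallback value, mirroring the `0` suggested in the review comment.
const INITIAL_SEQUENCE_NUMBER: i64 = 0;

/// Hypothetical, simplified manifest content kind.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Content {
    Data,
    Deletes,
}

/// Hypothetical, simplified manifest list entry (not the crate's real type).
#[derive(Debug, Clone)]
struct Manifest {
    content: Content,
    /// Sequence number at which the manifest was added.
    sequence_number: i64,
    /// `None` models "not set (v2) or unknown (v1)".
    min_sequence_number: Option<i64>,
}

/// Smallest data sequence number across the data manifests that survived
/// planning. Falls back to `INITIAL_SEQUENCE_NUMBER` when a value is missing
/// or when there are no data manifests at all.
fn min_data_sequence_number(manifests: &[Manifest]) -> i64 {
    manifests
        .iter()
        .filter(|m| m.content == Content::Data)
        .map(|m| m.min_sequence_number.unwrap_or(INITIAL_SEQUENCE_NUMBER))
        .min()
        .unwrap_or(INITIAL_SEQUENCE_NUMBER)
}

/// Keep only delete manifests that could apply to the selected data files,
/// i.e. those whose sequence number is not below the minimum computed above.
fn relevant_delete_manifests(manifests: &[Manifest]) -> Vec<&Manifest> {
    let floor = min_data_sequence_number(manifests);
    manifests
        .iter()
        .filter(|m| m.content == Content::Deletes && m.sequence_number >= floor)
        .collect()
}
```

   With the fallback at `0`, a missing value simply disables pruning: every delete manifest has a sequence number greater than or equal to zero, so all of them are kept.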