c-thiel commented on code in PR #799: URL: https://github.com/apache/iceberg-rust/pull/799#discussion_r1886956770
##########
crates/iceberg/src/spec/table_metadata_builder.rs:
##########
@@ -524,6 +526,52 @@ impl TableMetadataBuilder {
         self
     }
 
+    /// Set statistics for a snapshot
+    pub fn set_statistics(mut self, statistics: StatisticsFile) -> Self {
+        self.metadata
+            .statistics
+            .insert(statistics.snapshot_id, statistics.clone());

Review Comment:
   I thought about this as well, but then followed Java. Currently, statistics and snapshots are quite separate from each other.
   If we implement your check (which I like), I think we should eventually also implement:
   * Upon deserialization, discard statistics that belong to nonexistent snapshots
   * When a snapshot is removed, delete the statistics for it as well

   Together, these would guarantee that every statistics entry refers to an existing snapshot.
   It is unclear, however, what should happen to the Puffin files in these cases. We would have coherent metadata, but probably also orphan files.

   Do we know why the check is not there in Java?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
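
For illustration only, the check discussed in the review comment and the two follow-ups it proposes could look roughly like the sketch below. It does not use the actual iceberg-rust types: `Metadata`, its `snapshots` set, the `statistics_path` field, and the method names `discard_orphan_statistics` and `remove_snapshot` are hypothetical stand-ins, and errors are plain strings. The sketch only touches metadata; the methods hand the dropped entries back to the caller because what should happen to the Puffin files is still open.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical, simplified stand-in for a statistics file entry.
#[derive(Debug, Clone, PartialEq)]
struct StatisticsFile {
    snapshot_id: i64,
    statistics_path: String,
}

/// Hypothetical, simplified stand-in for table metadata.
#[derive(Debug, Default)]
struct Metadata {
    /// IDs of the snapshots that currently exist.
    snapshots: HashSet<i64>,
    /// Statistics keyed by the snapshot they describe.
    statistics: HashMap<i64, StatisticsFile>,
}

impl Metadata {
    /// The check under discussion: reject statistics for an unknown snapshot.
    fn set_statistics(&mut self, statistics: StatisticsFile) -> Result<(), String> {
        if !self.snapshots.contains(&statistics.snapshot_id) {
            return Err(format!(
                "snapshot {} does not exist",
                statistics.snapshot_id
            ));
        }
        self.statistics.insert(statistics.snapshot_id, statistics);
        Ok(())
    }

    /// Follow-up 1: after deserialization, drop statistics whose snapshot is
    /// missing. Returns the dropped entries so the caller can decide what to
    /// do with the files they point to.
    fn discard_orphan_statistics(&mut self) -> Vec<StatisticsFile> {
        let orphan_ids: Vec<i64> = self
            .statistics
            .keys()
            .copied()
            .filter(|id| !self.snapshots.contains(id))
            .collect();
        orphan_ids
            .into_iter()
            .filter_map(|id| self.statistics.remove(&id))
            .collect()
    }

    /// Follow-up 2: removing a snapshot also removes its statistics entry.
    /// Returns the removed entry, if there was one.
    fn remove_snapshot(&mut self, snapshot_id: i64) -> Option<StatisticsFile> {
        self.snapshots.remove(&snapshot_id);
        self.statistics.remove(&snapshot_id)
    }
}
```

Returning the dropped `StatisticsFile` values rather than silently discarding them keeps the metadata coherent while leaving the orphan-file question to whoever calls these methods.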