c-thiel commented on code in PR #799:
URL: https://github.com/apache/iceberg-rust/pull/799#discussion_r1886956770


##########
crates/iceberg/src/spec/table_metadata_builder.rs:
##########
@@ -524,6 +526,52 @@ impl TableMetadataBuilder {
         self
     }
 
+    /// Set statistics for a snapshot
+    pub fn set_statistics(mut self, statistics: StatisticsFile) -> Self {
+        self.metadata
+            .statistics
+            .insert(statistics.snapshot_id, statistics.clone());

Review Comment:
   I thought about this as well, but then followed Java.
   
   Currently, statistics and snapshots are quite separate from each other.
   If we implement your check (which I like), I think we should eventually
   also implement:
   * Upon deserialization, discard statistics that belong to nonexistent
     snapshots
   * When a snapshot is removed, delete the statistics for it as well
   
   With these in place, every statistics entry would always refer to an
   existing snapshot. It is unclear, however, what should happen to the
   puffin files in these cases. We would have coherent metadata, but
   probably also orphan files.
   
   Do we know why the check is not there in Java?
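
To make the proposal concrete, here is a minimal sketch of the check plus the two follow-up behaviours listed above. It does not use the real iceberg-rust types: `Metadata`, its fields, `discard_dangling_statistics` and `remove_snapshot` are hypothetical, simplified stand-ins, and the `String` error is a placeholder for whatever error type the crate would use.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical, simplified stand-in for the real statistics file type.
#[derive(Debug, Clone, PartialEq)]
struct StatisticsFile {
    snapshot_id: i64,
    statistics_path: String,
}

// Hypothetical, simplified stand-in for table metadata:
// snapshots are reduced to their ids, statistics are keyed by snapshot id.
#[derive(Debug, Default)]
struct Metadata {
    snapshots: HashSet<i64>,
    statistics: HashMap<i64, StatisticsFile>,
}

impl Metadata {
    /// The check under discussion: reject statistics for an unknown snapshot.
    fn set_statistics(&mut self, stats: StatisticsFile) -> Result<(), String> {
        if !self.snapshots.contains(&stats.snapshot_id) {
            return Err(format!("snapshot {} does not exist", stats.snapshot_id));
        }
        self.statistics.insert(stats.snapshot_id, stats);
        Ok(())
    }

    /// Drop statistics that point at missing snapshots, e.g. right after
    /// deserialization. The dropped entries are returned so the caller can
    /// decide what to do with the underlying files.
    fn discard_dangling_statistics(&mut self) -> Vec<StatisticsFile> {
        let dangling: Vec<i64> = self
            .statistics
            .keys()
            .copied()
            .filter(|id| !self.snapshots.contains(id))
            .collect();
        dangling
            .into_iter()
            .filter_map(|id| self.statistics.remove(&id))
            .collect()
    }

    /// Remove a snapshot together with its statistics entry, if any.
    fn remove_snapshot(&mut self, snapshot_id: i64) -> Option<StatisticsFile> {
        self.snapshots.remove(&snapshot_id);
        self.statistics.remove(&snapshot_id)
    }
}
```

Both cleanup functions return the removed `StatisticsFile` entries instead of silently dropping them. That keeps the metadata coherent and leaves the open question of what to do with the puffin files to the caller.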



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
