deniskuzZ commented on PR #11669: URL: https://github.com/apache/iceberg/pull/11669#issuecomment-2505896721
@pvary, unfortunately, that won't work. I was looking for an easy way to get basic partition stats, but I missed that Iceberg only keeps the changed partitions in a SnapshotSummary. Aggregating with just the previous snapshot's values is not enough, because a partition that the previous commit did not touch has no entry there; it requires looping through all the snapshots.

````
table.newFastAppend().appendFile(FILE_A).commit();
partitions.data_bucket=0 -> added-data-files=1,added-records=1,added-files-size=10,total-records=3,total-files-size=30,total-data-files=3,total-delete-files=0,total-position-deletes=0,total-equality-deletes=0

table.newFastAppend().appendFile(FILE_B).commit();
partitions.data_bucket=1 -> added-data-files=1,added-records=1,added-files-size=10,total-records=2,total-files-size=20,total-data-files=2,total-delete-files=0,total-position-deletes=0,total-equality-deletes=0

table.newFastAppend().appendFile(FILE_A).commit();
partitions.data_bucket=0 -> added-data-files=1,added-records=1,added-files-size=10,total-records=3,total-files-size=30,total-data-files=3,total-delete-files=0,total-position-deletes=0,total-equality-deletes=0
````

Do you think it's worth doing it in SnapshotSummary, or is there some simpler/better way, such as creating or updating the partition stats Puffin file right after the commit?
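For illustration only, here is a minimal sketch of what "loop through all the snapshots" could look like. It does not use the Iceberg API: each snapshot summary is modeled as a plain `Map<String, String>` whose keys follow the `partitions.<partition>` pattern shown in the output above. The class `PartitionStatsSketch` and its methods are hypothetical names, and the sketch assumes append-only history (it sums only the `added-*` counters and ignores removals).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Iceberg: summaries are modeled as plain maps.
public class PartitionStatsSketch {
  // Key prefix as it appears in the summary output quoted in the comment.
  static final String PARTITION_PREFIX = "partitions.";

  // Parses "k1=v1,k2=v2" into a map of numeric counters.
  static Map<String, Long> parseCounters(String value) {
    Map<String, Long> counters = new LinkedHashMap<>();
    for (String pair : value.split(",")) {
      int eq = pair.indexOf('=');
      if (eq > 0) {
        counters.put(pair.substring(0, eq).trim(),
            Long.parseLong(pair.substring(eq + 1).trim()));
      }
    }
    return counters;
  }

  // Sums the added-* counters per partition across every snapshot summary.
  // Assumption: append-only history; removed-* counters are not handled.
  static Map<String, Map<String, Long>> aggregate(List<Map<String, String>> summaries) {
    Map<String, Map<String, Long>> totals = new TreeMap<>();
    for (Map<String, String> summary : summaries) {
      for (Map.Entry<String, String> entry : summary.entrySet()) {
        if (!entry.getKey().startsWith(PARTITION_PREFIX)) {
          continue; // table-level property, not a per-partition entry
        }
        String partition = entry.getKey().substring(PARTITION_PREFIX.length());
        Map<String, Long> acc = totals.computeIfAbsent(partition, k -> new TreeMap<>());
        parseCounters(entry.getValue()).forEach((name, count) -> {
          if (name.startsWith("added-")) {
            acc.merge(name, count, Long::sum);
          }
        });
      }
    }
    return totals;
  }

  public static void main(String[] args) {
    String added = "added-data-files=1,added-records=1,added-files-size=10";
    List<Map<String, String>> summaries = new ArrayList<>();
    summaries.add(Map.of("operation", "append", "partitions.data_bucket=0", added));
    summaries.add(Map.of("operation", "append", "partitions.data_bucket=1", added));
    summaries.add(Map.of("operation", "append", "partitions.data_bucket=0", added));
    System.out.println(aggregate(summaries));
  }
}
```

With the three appends from the example, `data_bucket=0` accumulates two data files and `data_bucket=1` one, which the previous snapshot's summary alone could not provide. The cost grows with the number of snapshots, which is the concern behind the question about maintaining a partition stats file at commit time instead.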