ajantha-bhat commented on PR #12629: URL: https://github.com/apache/iceberg/pull/12629#issuecomment-2867156263
@deniskuzZ:

> We can work around it with removePartitionStatistics and then compute, but that would create 2 snapshots, not sure if that is a good approach.

Iceberg doesn't create 2 snapshots for this. Updating table metadata with the partition stats is a `PendingUpdate`, not a `SnapshotUpdate`. So it just creates a new table metadata file, not a new snapshot, which makes it a lightweight operation.

@pvary: Thanks for discussing this in depth. I agree that, as of now, one interface is enough: `PartitionStatsHandler.computeAndWriteStats()`, which computes stats incrementally if previous stats are available and does a full compute otherwise. In the future we can have `UpdatePartitionStatistics.removePartitionStatistics()`, as @gaborkaszab suggested, if users want to bulk-remove stats as needed. If stats are corrupted, we already have a way to recompute them by unregistering the existing stats. So I am fine with keeping one interface for now.

> Do we know someone from Trino / Spark who can chime in with their requirements?

I didn't see much direct participation from these communities. Maybe in the future we can discuss again whether to add a new interface to force a refresh, if it is needed.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org