ajantha-bhat commented on code in PR #12450:
URL: https://github.com/apache/iceberg/pull/12450#discussion_r2099439655


##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/ComputePartitionStatsSparkAction.java:
##########
@@ -80,6 +88,16 @@ public Result execute() {
   }
 
   private Result doExecute() {
+    if (forceRefresh) {
+      LOG.info(
+          "Clearing the existing partition statistics for all snapshots to enforce force refresh");
+      UpdatePartitionStatistics clearStats = table.updatePartitionStatistics();
+      table
+          .partitionStatisticsFiles()

Review Comment:
   We don't have an underlying API for a full compute, to avoid adding too many public interfaces.
   
   If we clear the stats, the same API does a full compute. We need to clear all the stats because it computes incrementally whenever previous stats exist.
   
   Also, the need for a full refresh is very rare (only in a corruption scenario), hence there is no separate API support in the core module.
   
   More details: https://github.com/apache/iceberg/pull/12629#issuecomment-2880415678
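
   To make the clear-then-compute flow above concrete, here is a minimal, self-contained sketch. It is a toy model and uses no Iceberg classes: `ForceRefreshSketch`, `StatsStore`, `Mode`, `compute`, and `computeWithOptions` are all hypothetical names made up for illustration. The only behavior it assumes is what is described above: a single compute entry point runs incrementally when previous stats exist and runs a full compute otherwise.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: none of these types are Iceberg classes.
public class ForceRefreshSketch {
  enum Mode { FULL, INCREMENTAL }

  // Hypothetical stand-in for the stats files registered on a table, keyed by snapshot ID.
  static final class StatsStore {
    private final TreeMap<Long, String> statsBySnapshot = new TreeMap<>();

    List<Long> snapshotIdsWithStats() {
      // Copy the keys so callers can remove entries while iterating.
      return new ArrayList<>(statsBySnapshot.keySet());
    }

    void remove(long snapshotId) {
      statsBySnapshot.remove(snapshotId);
    }

    void register(long snapshotId, String file) {
      statsBySnapshot.put(snapshotId, file);
    }

    boolean isEmpty() {
      return statsBySnapshot.isEmpty();
    }
  }

  // Single compute entry point: incremental when earlier stats exist, full otherwise.
  static Mode compute(StatsStore store, long snapshotId) {
    Mode mode = store.isEmpty() ? Mode.FULL : Mode.INCREMENTAL;
    store.register(snapshotId, "stats-" + snapshotId);
    return mode;
  }

  // Force refresh needs no separate full-compute API: clear every entry, then reuse compute().
  static Mode computeWithOptions(StatsStore store, long snapshotId, boolean forceRefresh) {
    if (forceRefresh) {
      for (long id : store.snapshotIdsWithStats()) {
        store.remove(id);
      }
    }
    return compute(store, snapshotId);
  }

  public static void main(String[] args) {
    StatsStore store = new StatsStore();
    System.out.println(computeWithOptions(store, 1L, false)); // FULL
    System.out.println(computeWithOptions(store, 2L, false)); // INCREMENTAL
    System.out.println(computeWithOptions(store, 3L, true));  // FULL
  }
}
```

   In this model, stats for every snapshot have to be cleared, not only the latest: if any entry survives, `compute` still takes the incremental path.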



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
