deniskuzZ commented on PR #12629:
URL: https://github.com/apache/iceberg/pull/12629#issuecomment-2782991989

   @pvary, I don't follow: where are you proposing to make these API changes?
   
   `PartitionStatsHandler` already exposes the following API, which does the FULL recompute:
   ````
   PartitionStatisticsFile computeAndWriteStatsFile(Table table);
   /**
    * Forcefully updates the partition statistics for the table. Calculates them from scratch
    * and ignores previous stats.
    */
   PartitionStatisticsFile computeAndWriteStatsFile(Table table, long snapshotId);
   ````
   
   A new API was proposed to support an incremental strategy, and it behaved exactly as you mentioned until
   https://github.com/apache/iceberg/pull/12629/commits/0fe332d55a340017f759a342a839186e4a62831c#diff-9c0f73692192e616bd3d305c627b60dc04629b291e5d69fec30e6d8b9df7c287R192
   ````
   /**
    * If there are existing stats for the table, then find the latest one
    * and do the incremental stats calculation from there.
    * If there are no current stats, calculate them from scratch.
    */
   PartitionStatisticsFile computeAndWriteStatsFileIncremental;
   ````
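   To make the contract in that Javadoc concrete, here is a minimal, self-contained sketch of the "incremental if previous stats exist, otherwise from scratch" fallback. It does not use the Iceberg API: `Commit`, `Stats`, `computeFull`, and `computeIncremental` are hypothetical stand-ins, stats are reduced to a row count per partition, and commits are assumed to carry a monotonically increasing sequence number.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class IncrementalStatsSketch {

  // Hypothetical stand-in for a table commit: rows added per partition.
  static final class Commit {
    final long seq;
    final Map<String, Long> addedRows;

    Commit(long seq, Map<String, Long> addedRows) {
      this.seq = seq;
      this.addedRows = addedRows;
    }
  }

  // Hypothetical stand-in for a stats file: totals as of commit `seq`.
  static final class Stats {
    final long seq;
    final Map<String, Long> rowsPerPartition;

    Stats(long seq, Map<String, Long> rowsPerPartition) {
      this.seq = seq;
      this.rowsPerPartition = rowsPerPartition;
    }
  }

  // Full recompute: ignores any previous stats and replays every commit up to targetSeq.
  static Stats computeFull(List<Commit> history, long targetSeq) {
    return replay(new HashMap<>(), history, Long.MIN_VALUE, targetSeq);
  }

  // Incremental: start from the latest stats if present, otherwise fall back to a full recompute.
  static Stats computeIncremental(List<Commit> history, Optional<Stats> latest, long targetSeq) {
    if (!latest.isPresent()) {
      return computeFull(history, targetSeq);
    }
    Stats base = latest.get();
    // Copy the base totals so the previous stats object is left untouched.
    return replay(new HashMap<>(base.rowsPerPartition), history, base.seq, targetSeq);
  }

  // Adds the row counts of every commit with afterSeq < seq <= targetSeq to acc.
  private static Stats replay(
      Map<String, Long> acc, List<Commit> history, long afterSeq, long targetSeq) {
    for (Commit c : history) {
      if (c.seq > afterSeq && c.seq <= targetSeq) {
        c.addedRows.forEach((partition, rows) -> acc.merge(partition, rows, Long::sum));
      }
    }
    return new Stats(targetSeq, acc);
  }

  public static void main(String[] args) {
    List<Commit> history = List.of(
        new Commit(1, Map.of("a", 3L)),
        new Commit(2, Map.of("b", 2L)),
        new Commit(3, Map.of("a", 2L)));

    Stats previous = computeFull(history, 2);
    Stats incremental = computeIncremental(history, Optional.of(previous), 3);
    Stats fromScratch = computeIncremental(history, Optional.empty(), 3);

    System.out.println(incremental.rowsPerPartition.equals(fromScratch.rowsPerPartition));
  }
}
```

   In this sketch both paths converge on the same totals; the only difference is how many commits get replayed.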
   If you are proposing to implement these APIs in `PartitionStatsUtil`, that would be problematic, since it doesn't have a reference to the data module; see
   https://github.com/apache/iceberg/pull/12629#discussion_r2013945601
   
   

