gaborkaszab commented on PR #12629:
URL: https://github.com/apache/iceberg/pull/12629#issuecomment-2862964351

   > CALL catalog_name.system.compute_partition_stats('db.sample');  -- does incremental compute if previous stats exist, else full compute.
   
   I like this approach.
   
   About the `full_compute => true` param, could you help me understand what the 
user's motivation would be for calling this version of the procedure? There is 
already one version that decides between incremental and full compute, so this 
one seems unnecessary, unless the use case is that the user learns that some 
previous stats are broken and hence wants to do a full recompute. But if they 
know the stats are broken, I agree with Peter that they should drop the stats 
and then use the other variant of the `compute_partition_stats` procedure, 
without the `full_compute` param. Is there anything I'm missing?
   
   Following this logic, we might not want to expose 
`PartitionStatsHandler.computeAndWriteStatsFileIncremental()` as public, since 
the other `computeAndWriteStats` version should be able to figure out whether 
to do an incremental or a full compute.
   Any thoughts?
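
   To make the single-entry-point idea concrete, here is a rough sketch of the decision I have in mind. It is not the code in this PR: `PartitionStatsDispatchSketch`, `Mode`, and `chooseMode` are hypothetical names, and representing "previous stats exist" as an optional snapshot id is only an assumption for illustration.

```java
import java.util.Optional;

// Hypothetical sketch, not Iceberg code: one entry point picks the compute mode,
// so callers never need a separate "incremental" method or a full_compute flag.
final class PartitionStatsDispatchSketch {

  enum Mode { FULL, INCREMENTAL }

  // Assumption: the caller can look up the snapshot id that the latest
  // existing stats file was computed for, if there is such a file.
  static Mode chooseMode(Optional<Long> previousStatsSnapshotId) {
    // No previous stats (for example, the user dropped broken stats):
    // fall back to a full compute.
    if (previousStatsSnapshotId.isEmpty()) {
      return Mode.FULL;
    }
    // Previous stats exist: only process what changed since that snapshot.
    return Mode.INCREMENTAL;
  }

  public static void main(String[] args) {
    System.out.println(chooseMode(Optional.empty()));
    System.out.println(chooseMode(Optional.of(42L)));
  }
}
```

   With this shape, a user who suspects broken stats drops them and calls the same entry point again, which then takes the full-compute branch.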


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

