Guosmilesmile commented on PR #12979:
URL: https://github.com/apache/iceberg/pull/12979#issuecomment-2866565123

   > Do we want to have a scheduler before running the compaction? If we proceed 
on this road, then we will run the compaction after every commit. I think it 
would be better to have a `TriggerEvaluator` before the actual task, and run 
the compaction task after every X commits, or after a given time period.
   > 
   > I understand that this would mean more configuration, but we should at 
least expose a few of the scheduling possibilities to accommodate different 
use-cases.
   
   @pvary RewriteDataFiles/Builder already has scheduling-related settings, and 
TableMaintenance has a TriggerEvaluator built in. The existing options 
`scheduleOnCommitCount`, `scheduleOnDataFileCount`, `scheduleOnInterval`, and 
`scheduleOnDataFileSize` can meet the need to run the 
compaction task after every X commits or after a certain time.
   
   So I have exposed the following configs:
   
   `flink-maintenance.rewrite.schedule-on-commit-count`
   `flink-maintenance.rewrite.schedule-on-data-file-count`
   `flink-maintenance.rewrite.schedule-on-data-file-size`
   `flink-maintenance.rewrite.schedule-on-interval-second`
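
To illustrate how these four keys could gate a compaction run, here is a minimal, self-contained sketch. It is not the Iceberg implementation: the class `RewriteTriggerSketch`, the `TableChange` holder, and the "fire when any configured threshold is reached" rule are assumptions made for illustration. Only the property names come from the list above.

```java
import java.time.Duration;
import java.util.Map;

public class RewriteTriggerSketch {
  // Property keys as listed in this comment.
  static final String COMMIT_COUNT = "flink-maintenance.rewrite.schedule-on-commit-count";
  static final String DATA_FILE_COUNT = "flink-maintenance.rewrite.schedule-on-data-file-count";
  static final String DATA_FILE_SIZE = "flink-maintenance.rewrite.schedule-on-data-file-size";
  static final String INTERVAL_SECOND = "flink-maintenance.rewrite.schedule-on-interval-second";

  /** Hypothetical holder for what has accumulated since the last compaction run. */
  static final class TableChange {
    final long commits;
    final long dataFiles;
    final long dataFileBytes;
    final Duration sinceLastRun;

    TableChange(long commits, long dataFiles, long dataFileBytes, Duration sinceLastRun) {
      this.commits = commits;
      this.dataFiles = dataFiles;
      this.dataFileBytes = dataFileBytes;
      this.sinceLastRun = sinceLastRun;
    }
  }

  /** Assumed rule: trigger when ANY configured threshold is reached; unset keys are ignored. */
  static boolean shouldTrigger(Map<String, String> props, TableChange change) {
    return reached(props, COMMIT_COUNT, change.commits)
        || reached(props, DATA_FILE_COUNT, change.dataFiles)
        || reached(props, DATA_FILE_SIZE, change.dataFileBytes)
        || reached(props, INTERVAL_SECOND, change.sinceLastRun.getSeconds());
  }

  // True only when the key is configured and the observed value meets its threshold.
  private static boolean reached(Map<String, String> props, String key, long observed) {
    String raw = props.get(key);
    return raw != null && observed >= Long.parseLong(raw.trim());
  }

  public static void main(String[] args) {
    // Compact after every 10 commits, or after 600 seconds, whichever comes first.
    Map<String, String> props = Map.of(COMMIT_COUNT, "10", INTERVAL_SECOND, "600");

    TableChange afterThreeCommits = new TableChange(3, 12, 4096L, Duration.ofSeconds(30));
    TableChange afterTenCommits = new TableChange(10, 40, 16384L, Duration.ofSeconds(120));

    System.out.println(shouldTrigger(props, afterThreeCommits));
    System.out.println(shouldTrigger(props, afterTenCommits));
  }
}
```

With only the commit-count and interval keys set, the sketch prints `false` for the first change (3 commits, 30 seconds) and `true` for the second (10 commits), which is the "run after every X commits or after a given time period" behaviour asked about above.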
   
   Do these settings meet the above requirements? If my understanding is 
incorrect, please feel free to point it out.
   
   Thank you very much.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

