Guosmilesmile commented on PR #12979: URL: https://github.com/apache/iceberg/pull/12979#issuecomment-2866565123
> Do we want to have scheduler before running the compaction? If we proceed on this road, then we will run the compaction after every commit. I think it would be better to have a `TriggerEvaluator` before the actual task, and run the compaction task after every X commit, or after a given time period.
>
> I understand that this would mean more configuration, but we at least should expose few of the scheduling possibilities to accommodate different use-cases

@pvary RewriteDataFiles/Builder already has scheduling-related settings, and TableMaintenance has a built-in TriggerEvaluator, with options such as `scheduleOnCommitCount`, `scheduleOnDataFileCount`, `scheduleOnInterval`, and `scheduleOnDataFileSize`. These settings cover running the compaction task after every X commits or after a certain time period.

So I have exposed the following configs:

- `flink-maintenance.rewrite.schedule-on-commit-count`
- `flink-maintenance.rewrite.schedule-on-data-file-count`
- `flink-maintenance.rewrite.schedule-on-data-file-size`
- `flink-maintenance.rewrite.schedule-on-interval-second`

Do these settings meet the above requirements? If my understanding is incorrect, please feel free to point it out. Thank you very much.
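
For readers following the thread, the scheduling behaviour under discussion (run compaction only after every X commits, or once a time period has elapsed) can be illustrated with a minimal, self-contained sketch. This is not Iceberg's `TriggerEvaluator`; the class `CompactionTrigger`, its constructor parameters, and the `onCommit` method are hypothetical names invented for illustration, and the real implementation may combine conditions differently.

```java
import java.time.Duration;

/**
 * Hypothetical stand-in for a trigger evaluator (NOT the Iceberg class).
 * Fires when either the commit-count threshold or the interval is reached.
 */
public class CompactionTrigger {
    private final int commitCountThreshold; // analogous in spirit to schedule-on-commit-count
    private final Duration interval;        // analogous in spirit to schedule-on-interval-second
    private int commitsSinceLastRun = 0;
    private long lastRunMillis;

    public CompactionTrigger(int commitCountThreshold, Duration interval, long startMillis) {
        this.commitCountThreshold = commitCountThreshold;
        this.interval = interval;
        this.lastRunMillis = startMillis;
    }

    /** Records one commit and reports whether compaction should run now. */
    public boolean onCommit(long nowMillis) {
        commitsSinceLastRun++;
        boolean countReached = commitsSinceLastRun >= commitCountThreshold;
        boolean intervalElapsed = nowMillis - lastRunMillis >= interval.toMillis();
        if (countReached || intervalElapsed) {
            // Reset both counters so the next window starts from this run.
            commitsSinceLastRun = 0;
            lastRunMillis = nowMillis;
            return true;
        }
        return false;
    }
}
```

With a threshold of 3 commits and a 60-second interval, the first two commits do not trigger a run, the third does, and a later commit triggers again once 60 seconds have passed since the previous run, even if fewer than 3 commits arrived in between.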