herbherbherb opened a new issue, #15770: URL: https://github.com/apache/iceberg/issues/15770
### Feature Request / Improvement Adds a `CommitGate` plugin interface to `IcebergSink` that controls whether committables are emitted downstream or buffered in Flink `ListState`. This enables use cases like pausing commits during catalog maintenance operations while keeping the Flink job running. ### Query Engine Flink ### Motivation During catalog operations (schema evolution, partition evolution, table migration), it may be necessary to temporarily pause Iceberg commits. Without this capability, the only option is stopping the Flink job, which causes: - Data loss risk if state is not cleanly checkpointed - Lag accumulation during the pause window - Operational toil coordinating job restarts This adds a `CommitGate` that is checked in `IcebergWriteAggregator.prepareSnapshotPreBarrier()`. When the gate returns `false`, committables are serialized to `ListState` instead of being emitted. When the gate reopens, all buffered committables are flushed in checkpoint order, preserving exactly-once semantics. ### Changes - New: `CommitGate.java` -- `@FunctionalInterface` with a single method: `boolean isCommitAllowed(long checkpointId)` - Modified: `IcebergWriteAggregator` -- accepts optional gate, adds `ListState` for buffering, gate check + buffer/flush logic in `prepareSnapshotPreBarrier()`, state initialization in `initializeState()` - Modified: `IcebergSink.Builder` -- new `commitGate()` method, passed through to aggregator ### Compatibility - No behavioral change when the gate is not set (null default) - Buffered committables are checkpointed in `ListState`, so recovery works correctly - No changes to public API signatures of existing methods - Fully backward compatible ### Willingness to contribute - [x] I can contribute this improvement/feature independently - [x] I would be willing to contribute this improvement/feature with guidance from the Iceberg community - [ ] I cannot contribute this improvement/feature at this time -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
