herbherbherb opened a new issue, #15770:
URL: https://github.com/apache/iceberg/issues/15770

   ### Feature Request / Improvement
   
   Adds a `CommitGate` plugin interface to `IcebergSink` that controls whether 
committables are emitted downstream or buffered in Flink `ListState`. This 
enables use cases like pausing commits during catalog maintenance operations 
while keeping the Flink job running.
   
   ### Query Engine
   
   Flink
   
   ### Motivation
   
   During catalog operations (schema evolution, partition evolution, table 
migration), it may be necessary to temporarily pause Iceberg commits. Without 
this capability, the only option is stopping the Flink job, which causes:
   - Data loss risk if state is not cleanly checkpointed
   - Lag accumulation during the pause window
   - Operational toil coordinating job restarts
   
   This adds a `CommitGate` that is checked in 
`IcebergWriteAggregator.prepareSnapshotPreBarrier()`. When the gate returns 
`false`, committables are serialized to `ListState` instead of being emitted. 
When the gate reopens, all buffered committables are flushed in checkpoint 
order, preserving exactly-once semantics.
   
   ### Changes
   
   - New: `CommitGate.java` -- `@FunctionalInterface` with a single method: 
`boolean isCommitAllowed(long checkpointId)`
   - Modified: `IcebergWriteAggregator` -- accepts optional gate, adds 
`ListState` for buffering, gate check + buffer/flush logic in 
`prepareSnapshotPreBarrier()`, state initialization in `initializeState()`
   - Modified: `IcebergSink.Builder` -- new `commitGate()` method, passed 
through to aggregator
   
   ### Compatibility
   
   - No behavioral change when the gate is not set (null default)
   - Buffered committables are checkpointed in `ListState`, so recovery works 
correctly
   - No changes to public API signatures of existing methods
   - Fully backward compatible
   
   ### Willingness to contribute
   
   - [x] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to