herbherbherb opened a new issue, #15763:
URL: https://github.com/apache/iceberg/issues/15763

   ### Feature Request / Improvement
   
   Adds an `OutputFileFactoryProvider` plugin interface to `IcebergSink` that 
allows external implementations to customize how `OutputFileFactory` instances 
are created. This enables use cases like custom file naming, path rewriting, or 
alternative storage routing without subclassing internal writer factories.
   
   ### Query Engine
   
   Flink
   
   ### Motivation
   
   The current `RowDataTaskWriterFactory` creates `OutputFileFactory` 
internally with no extension point. Users who need to customize output file 
paths (e.g., for cross-region writes, custom partitioning schemes, or storage 
proxy routing) must subclass `RowDataTaskWriterFactory` and depend on its 
internal structure.
   
   The proposed change adds a plugin interface: if an `OutputFileFactoryProvider` 
is set on the builder, `RowDataTaskWriterFactory.initialize()` uses it to create 
the `OutputFileFactory`. Otherwise, the existing behavior is unchanged.
   
   ### Changes
   
   - New: `OutputFileFactoryProvider.java` -- `@FunctionalInterface` with a 
single method: `OutputFileFactory create(Table table, int taskId, int 
attemptId, FileFormat format, PartitionSpec spec)`
   - Modified: `RowDataTaskWriterFactory` -- accepts optional provider, uses it 
in `initialize()` when present
   - Modified: `IcebergSink.Builder` -- new `outputFileFactoryProvider()` 
method, passed through to writer factory
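
To make the proposed control flow concrete, here is a minimal, self-contained sketch of the null-checked provider hook. It is not the actual patch: `Table`, `PartitionSpec`, `FileFormat`, and `OutputFileFactory` are simplified stand-ins for the Iceberg classes, `WriterFactory` is a hypothetical stand-in for `RowDataTaskWriterFactory`, and the default file naming is invented for illustration. Only the `create(...)` signature is taken from the description above.

```java
import java.util.Locale;

// Self-contained sketch. Table, PartitionSpec, FileFormat, and OutputFileFactory
// below are simplified stand-ins, NOT the real org.apache.iceberg classes.
final class ProviderSketch {
  enum FileFormat { PARQUET, ORC, AVRO }

  static final class Table {
    final String location;
    Table(String location) { this.location = location; }
  }

  static final class PartitionSpec {
    final int specId;
    PartitionSpec(int specId) { this.specId = specId; }
  }

  // Stand-in: this sketch only produces a path string per output file.
  interface OutputFileFactory {
    String newOutputFilePath();
  }

  // Shape of the proposed interface, using the method signature from the issue.
  @FunctionalInterface
  interface OutputFileFactoryProvider {
    OutputFileFactory create(
        Table table, int taskId, int attemptId, FileFormat format, PartitionSpec spec);
  }

  // Hypothetical stand-in for RowDataTaskWriterFactory; only the provider branch is modeled.
  static final class WriterFactory {
    private final Table table;
    private final FileFormat format;
    private final PartitionSpec spec;
    private final OutputFileFactoryProvider provider; // null means "use the default"
    private OutputFileFactory outputFileFactory;

    WriterFactory(
        Table table, FileFormat format, PartitionSpec spec, OutputFileFactoryProvider provider) {
      this.table = table;
      this.format = format;
      this.spec = spec;
      this.provider = provider;
    }

    void initialize(int taskId, int attemptId) {
      if (provider != null) {
        // Plugin path: delegate factory creation to the user-supplied provider.
        this.outputFileFactory = provider.create(table, taskId, attemptId, format, spec);
      } else {
        // Default path: made-up naming scheme, for illustration only.
        String name = String.format(
            Locale.ROOT, "%05d-%d.%s", taskId, attemptId,
            format.name().toLowerCase(Locale.ROOT));
        String path = table.location + "/data/" + name;
        this.outputFileFactory = () -> path;
      }
    }

    OutputFileFactory outputFileFactory() {
      return outputFileFactory;
    }
  }
}
```

Because the interface has a single method, a caller could supply a lambda such as `(table, taskId, attemptId, format, spec) -> () -> "s3://other-region/" + taskId + "-" + attemptId` to redirect output paths, while passing `null` keeps the default branch.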
   
   ### Compatibility
   
   - No behavioral change when the provider is not set (null default)
   - No changes to public API signatures of existing methods
   - Fully backward compatible
   
   ### Willingness to contribute
   
   - [x] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


