bryanck opened a new pull request, #8555: URL: https://github.com/apache/iceberg/pull/8555
Currently, when a Flink sink job is started, the write operators are serialized with an initial table instance, and that instance is used for the lifetime of the job. There are cases where a table needs to be reloaded at runtime by the subtasks. For example, if a REST catalog returns expiring credentials in the table load config response, those credentials cannot later be refreshed and will eventually expire. Also, if a task restarts, it will use the original credentials created with the job, which may have long since expired. Longer term, the ability to reload a table could also be used to support schema evolution, though that is not addressed in this PR; the table schema and partition spec from the initial table instance are still used for now.

This PR updates the initialization of the Flink sink so that a table supplier is used in place of a table instance. The supplier can implement reload/refresh logic as needed. The initial supplier implementation created here simply reloads the table from the Iceberg catalog at a given interval. Because this puts additional load on the Iceberg catalog, reloading is turned off by default. Different table suppliers could be implemented in the future that use a centralized mechanism for obtaining refreshed table instances, cutting down on the extra catalog load. Other options were explored, such as using a broadcast or the delegation token manager, but each had limitations or complexities, so it was felt that a better first step would be to introduce the abstraction that allows for table reload, along with a simple initial implementation.
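The interval-based reload behavior described above can be sketched roughly as follows. This is an illustrative, self-contained sketch, not the actual API introduced by the PR: the class name, constructor, and the generic loader stand in for the real supplier, which would wrap a catalog table load such as `catalog.loadTable(identifier)`.

```java
import java.util.function.Supplier;

// Hypothetical sketch: a supplier that caches a loaded value and reloads it
// from its source once a configured interval has elapsed. In the PR, the
// loader would reload an Iceberg Table from the catalog.
class IntervalReloadSupplier<T> implements Supplier<T> {
  private final Supplier<T> loader;     // e.g. () -> catalog.loadTable(id)
  private final long reloadIntervalMs;  // <= 0 disables reload (the default behavior)
  private T cached;
  private long lastLoadMs;

  IntervalReloadSupplier(Supplier<T> loader, long reloadIntervalMs) {
    this.loader = loader;
    this.reloadIntervalMs = reloadIntervalMs;
  }

  @Override
  public synchronized T get() {
    long now = System.currentTimeMillis();
    // Load on first access; afterwards, reload only if reloading is enabled
    // and the interval has passed since the last load.
    if (cached == null || (reloadIntervalMs > 0 && now - lastLoadMs >= reloadIntervalMs)) {
      cached = loader.get();
      lastLoadMs = now;
    }
    return cached;
  }
}
```

With reload disabled (interval of zero), this behaves like today's single serialized table instance; with a positive interval, each subtask periodically picks up a fresh table, including any refreshed credentials from the catalog response.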
