pvary commented on code in PR #14358:
URL: https://github.com/apache/iceberg/pull/14358#discussion_r2439181861
##########
flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/DynamicWriteResultAggregator.java:
##########
@@ -118,6 +114,14 @@ public void prepareSnapshotPreBarrier(long checkpointId)
throws IOException {
this.lastCheckpointId = checkpointId;
+ // Pause eviction on the cache for ManifestOutputFileFactory.
+ // This will materialize a list of all output file factories required to
write delta manifests.
+ // Some of the output file factories may already be cached, and we will
reuse those. We must
+ // absolutely avoid re-creating any output file factories _during_ writing
manifests, otherwise
+ // cache eviction may reset the manifest file names for multiple
WriteResults for a given table
+ // which will overwrite already written manifest files!
+ outputFileFactories.haltEviction();
Review Comment:
I don't really like this, because this would limit our possibilities if/when
we are choosing cache implementations.
What if we just make sure that the OutputFileFactory at creating time
generates a random UUID and then adds it to the filenames, so they are not
recreated/overwritten even if the factory is recreated?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]