mxm commented on code in PR #14358:
URL: https://github.com/apache/iceberg/pull/14358#discussion_r2439289983


##########
flink/v2.1/flink/src/main/java/org/apache/iceberg/flink/sink/dynamic/DynamicWriteResultAggregator.java:
##########
@@ -118,6 +114,14 @@ public void prepareSnapshotPreBarrier(long checkpointId) 
throws IOException {
 
     this.lastCheckpointId = checkpointId;
 
+    // Pause eviction on the cache for ManifestOutputFileFactory.
+    // This will materialize a list of all output file factories required to 
write delta manifests.
+    // Some of the output file factories may already be cached, and we will 
reuse those. We must
+    // absolutely avoid re-creating any output file factories _during_ writing 
manifests, otherwise
+    // cache eviction may reset the manifest file names for multiple 
WriteResults for a given table
+    // which will overwrite already written manifest files!
+    outputFileFactories.haltEviction();

Review Comment:
   I had something like this in the initial version, but I was a bit hesitant 
to change the file name format due to potential other issues like hitting max 
file name limits. Also I tried adding a long-lived counter but I didn't like 
that we would have to maintain that externally or via a static counter in the 
factory class.
   
   I agree with your concerns though. A UUID probably is the simplest and 
cleanest approach.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to