rdblue commented on code in PR #15006:
URL: https://github.com/apache/iceberg/pull/15006#discussion_r2825064995


##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -1065,9 +1062,34 @@ private List<ManifestFile> newDeleteFilesAsManifests() {
     }
 
     if (cachedNewDeleteManifests.isEmpty()) {
+      for (Map.Entry<String, List<DeleteFile>> entry : 
dvsByReferencedFile.entrySet()) {
+        if (entry.getValue().size() > 1) {
+          LOG.warn(
+              "Attempted to commit {} duplicate DVs for data file {} in table 
{}. "
+                  + "Merging duplicates, and original DVs will be orphaned.",

Review Comment:
   Will be orphaned? I think this is a bit strong of a warning and could lead 
to incorrect issue reports.
   
   It would be nice to detect whether the puffin file is actually orphaned by 
checking whether it has other DVs. It's fine to leave a Puffin file in place if 
it has live DVs from this commit. And it's fine to delete it if we see that all 
of its DVs will be merged.
   
   We probably don't need to do all that in this commit, though. I'd probably 
leave this out and just have a warning about merging, without the part about 
orphan DVs.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to