amogh-jahagirdar commented on code in PR #15006:
URL: https://github.com/apache/iceberg/pull/15006#discussion_r2834403086


##########
core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java:
##########
@@ -1076,12 +1068,52 @@ private List<ManifestFile> newDeleteFilesAsManifests() {
       // this triggers a rewrite of all delete manifests even if there is only one new delete file
       // if there is a relevant use case in the future, the behavior can be optimized
       cachedNewDeleteManifests.clear();
+      // On cache invalidation of delete files, clear the whole summary.
+      // Since the summary contained both data files and DVs, add back the data files.
+      addedFilesSummary.clear();

Review Comment:
   I don't think there's a clean way to make the field track only data files 
without splitting the summary state into two fields, and we largely want to 
avoid adding new state to MergingSnapshotProducer. That said, splitting is 
another option: keep a separate `addedDataFilesSummary` and 
`addedDeleteFilesSummary`, and add both in `apply()`.
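
   A rough sketch of that two-field option, to make the trade-off concrete. 
This is a self-contained toy, not the actual MergingSnapshotProducer code: the 
`Summary` and `Producer` classes, their methods, and the key strings are 
hypothetical stand-ins, and the real summary builder API may differ. The point 
is only that invalidating the delete manifest cache clears the delete-side 
summary and leaves the data-side summary alone, so nothing needs to be added 
back.

```java
import java.util.HashMap;
import java.util.Map;

public class SplitSummarySketch {

  /** Hypothetical stand-in for a summary builder: a map of named counters. */
  static class Summary {
    private final Map<String, Long> counts = new HashMap<>();

    void increment(String key, long n) {
      counts.merge(key, n, Long::sum);
    }

    /** Adds every counter from {@code other} into this summary. */
    void merge(Summary other) {
      other.counts.forEach((k, v) -> counts.merge(k, v, Long::sum));
    }

    void clear() {
      counts.clear();
    }

    long get(String key) {
      return counts.getOrDefault(key, 0L);
    }
  }

  /** Hypothetical producer that keeps data and delete summaries separately. */
  static class Producer {
    private final Summary addedDataFilesSummary = new Summary();
    private final Summary addedDeleteFilesSummary = new Summary();

    void addDataFile() {
      addedDataFilesSummary.increment("added-data-files", 1);
    }

    void addDeleteFile() {
      addedDeleteFilesSummary.increment("added-delete-files", 1);
    }

    /** Only the delete-side summary is reset; data file counts are untouched. */
    void invalidateDeleteManifestCache() {
      addedDeleteFilesSummary.clear();
    }

    /** Combines both summaries at apply time. */
    Summary apply() {
      Summary combined = new Summary();
      combined.merge(addedDataFilesSummary);
      combined.merge(addedDeleteFilesSummary);
      return combined;
    }
  }

  public static void main(String[] args) {
    Producer producer = new Producer();
    producer.addDataFile();
    producer.addDataFile();
    producer.addDeleteFile();

    // Simulate the cache invalidation, then re-add the surviving delete file.
    producer.invalidateDeleteManifestCache();
    producer.addDeleteFile();

    Summary summary = producer.apply();
    System.out.println("added-data-files=" + summary.get("added-data-files"));
    System.out.println("added-delete-files=" + summary.get("added-delete-files"));
  }
}
```

   The cost is the extra field, which is the new state we were hoping to 
avoid.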



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

