danielcweeks commented on code in PR #14264:
URL: https://github.com/apache/iceberg/pull/14264#discussion_r2417576263
##########
core/src/main/java/org/apache/iceberg/BaseIncrementalChangelogScan.java:
##########
@@ -71,6 +80,12 @@ protected CloseableIterable<ChangelogScanTask> doPlanFiles(
.filter(manifest ->
changelogSnapshotIds.contains(manifest.snapshotId()))
.toSet();
+ // Build delete file index for existing deletes (before the start snapshot)
+ DeleteFileIndex existingDeleteIndex =
buildExistingDeleteIndex(fromSnapshotIdExclusive);
Review Comment:
@pvary I'm not sure the case you describe is accurate, because we would only
be producing changes for S3, not for snapshots prior to it. If this is
incremental, the scan should only report the changes observed within the range,
not changes prior to it. Since deletes only affect prior data, they would have
no effect on the results of the scan.
You are correct that equality deletes do not apply to newer data, so even
with equality deletes, they would only apply to older data that would not be
part of the scan.
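
To make the sequence-number rule I'm relying on concrete, here is a minimal sketch. It assumes the applicability rules from the table spec (equality deletes apply to data files with a strictly lower data sequence number, position deletes to lower or equal) and ignores partition matching. The class and method names are hypothetical and are not part of the Iceberg API or of this PR.

```java
public class DeleteApplicabilitySketch {

  // Assumed rule: an equality delete applies only to data files whose
  // data sequence number is strictly lower than the delete's.
  static boolean equalityDeleteApplies(long deleteSeq, long dataSeq) {
    return dataSeq < deleteSeq;
  }

  // Assumed rule: a position delete also applies to data files written
  // with the same sequence number (lower or equal).
  static boolean positionDeleteApplies(long deleteSeq, long dataSeq) {
    return dataSeq <= deleteSeq;
  }

  public static void main(String[] args) {
    long deleteSeq = 3; // hypothetical: delete file committed with sequence number 3

    // Data written earlier (sequence number 2) is affected by both delete types.
    System.out.println(equalityDeleteApplies(deleteSeq, 2));
    System.out.println(positionDeleteApplies(deleteSeq, 2));

    // Data written with the same sequence number is affected only by position deletes.
    System.out.println(equalityDeleteApplies(deleteSeq, 3));
    System.out.println(positionDeleteApplies(deleteSeq, 3));

    // Data written later (sequence number 4) is affected by neither.
    System.out.println(equalityDeleteApplies(deleteSeq, 4));
    System.out.println(positionDeleteApplies(deleteSeq, 4));
  }
}
```

This only illustrates which data files a delete can touch. Whether the affected older files belong in the incremental result is the question under discussion here.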
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]