wypoon commented on PR #9888: URL: https://github.com/apache/iceberg/pull/9888#issuecomment-2294424912
You have made changes in `BaseIncrementalChangelogScan` that produces `BaseDeletedRowsScanTask` in certain cases. However, you have not made changes in the Spark `ChangelogRowReader` to handle `DeletedRowsScanTask`s. As reading the changelog has only been implemented in Spark thus far, the changes cannot be consumed. You have added some tests to the Spark `TestCreateChangelogViewProcedure`, which actually makes use of `ChangelogRowReader` to read the changelog, but those tests currently don't actually test any `DeletedRowsScanTask`s, because the tables used in the test are not created with write modes set to `merge-on-read` (the default is `copy-on-write`); no delete files are ever created in the tests. I took my changes in https://github.com/apache/iceberg/pull/10935 and swapped out my `BaseIncrementalChangelogScan` implementation for yours (see https://github.com/apache/iceberg/pull/10954). I was interested to see how your `BaseIncrementalChangelogScan` implementation works. (I needed my other changes in order for Spark changelog reading to actually work.) I found that `testDataFilters` and `testDataRewritesAreIgnored` in `TestChangelogTable` failed for the merge-on-read case (in the former, the delete is not detected, and in the latter, the second delete is detected but not the first). In `TestChangelogReader`, `testPositionDeletes`, `testEqualityDeletes` and `testMixOfPositionAndEqualityDeletes` failed. It seems to me that your implementation is not catching all deletes that can happen. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org