wypoon commented on PR #9888:
URL: https://github.com/apache/iceberg/pull/9888#issuecomment-2294424912

   You have made changes in `BaseIncrementalChangelogScan` that produces 
`BaseDeletedRowsScanTask` in certain cases. However, you have not made changes 
in the Spark `ChangelogRowReader` to handle `DeletedRowsScanTask`s. As reading 
the changelog has only been implemented in Spark thus far, the changes cannot 
be consumed.
   You have added some tests to the Spark `TestCreateChangelogViewProcedure`, 
which actually makes use of `ChangelogRowReader` to read the changelog, but 
those tests currently don't actually test any `DeletedRowsScanTask`s, because 
the tables used in the test are not created with write modes set to 
`merge-on-read` (the default is `copy-on-write`); no delete files are ever 
created in the tests.
   
   I took my changes in https://github.com/apache/iceberg/pull/10935 and 
swapped out my `BaseIncrementalChangelogScan` implementation for yours (see 
https://github.com/apache/iceberg/pull/10954). I was interested to see how your 
`BaseIncrementalChangelogScan` implementation works. (I needed my other changes 
in order for Spark changelog reading to actually work.) I found that 
`testDataFilters` and `testDataRewritesAreIgnored` in `TestChangelogTable` 
failed for the merge-on-read case (in the former, the delete is not detected, 
and in the latter, the second delete is detected but not the first). In 
`TestChangelogReader`, `testPositionDeletes`, `testEqualityDeletes` and 
`testMixOfPositionAndEqualityDeletes` failed. It seems to me that your 
implementation is not catching all deletes that can happen.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to