rdblue commented on code in PR #11240:
URL: https://github.com/apache/iceberg/pull/11240#discussion_r1813640000
##########
format/spec.md:
##########

@@ -619,19 +627,25 @@ Data files that match the query filter must be read by the scan.

 Note that for any snapshot, all file paths marked with "ADDED" or "EXISTING" may appear at most once across all manifest files in the snapshot. If a file path appears more than once, the results of the scan are undefined. Reader implementations may raise an error in this case, but are not required to do so.

-Delete files that match the query filter must be applied to data files at read time, limited by the scope of the delete file using the following rules.
+Delete files and deletion vector metadata that match the filters must be applied to data files at read time, limited by the following scope rules.

+* A deletion vector must be applied to a data file when all of the following are true:
+  - The data file's `file_path` is equal to the deletion vector's `referenced_data_file`
+  - The data file's data sequence number is _less than or equal to_ the deletion vector's data sequence number
+  - The data file's partition (both spec and partition values) is equal [4] to the deletion vector's partition
 * A _position_ delete file must be applied to a data file when all of the following are true:
+  - The data file's `file_path` is equal to the delete file's `referenced_data_file` if it is non-null
   - The data file's data sequence number is _less than or equal to_ the delete file's data sequence number
   - The data file's partition (both spec and partition values) is equal [4] to the delete file's partition
+  - There is no deletion vector that must be applied to the data file (when added, such a vector must contain all deletes from existing position delete files)

Review Comment:

@szehon-ho, this does not apply to equality deletes.

This is a fantastic question to consider. It would be great if we could add a rule like this one for equality deletes, because writers that produce positional deletes will almost always apply equality deletes and encode them in the new DVs. However, there is at least one big issue: concurrent commits would require re-scanning data.

Imagine a writer is nearly done with a MERGE that uses DVs and has created a DV with all of the positions previously deleted by both positional and equality delete files. While that metadata update and commit are in progress, a new equality delete comes in and commits first. If we were to require that the DV of the MERGE commit replace the new equality deletes, the MERGE job would potentially need to re-scan data files in order to find the deleted row positions. That could be a very expensive operation, because the equality delete could apply to an entire partition of data.

There is also a similar cost for maintaining position deletes, but in that case we have the metadata to load and union DVs quickly, as opposed to scanning potentially hundreds of files for newly deleted records at commit time. Because position deletes are much more targeted, there are fewer false positives (one goal of this update!) and the work to merge them is lower.
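To make the scope rules in the diff concrete, here is a minimal Java sketch of the matching predicates. The `Partition`, `DataFile`, `DeletionVector`, and `PositionDeleteFile` records are hypothetical stand-ins rather than the Iceberg API; they carry just enough fields to express the rules, and partition equality is reduced to `equals()` where the spec's footnote [4] defines the exact semantics.

```java
import java.util.List;
import java.util.Objects;

// Hypothetical records -- not the Iceberg API -- carrying only the fields needed
// to express the scope rules from the spec diff above.
record Partition(int specId, List<Object> values) {}

record DataFile(String filePath, long dataSequenceNumber, Partition partition) {}

record DeletionVector(String referencedDataFile, long dataSequenceNumber, Partition partition) {}

record PositionDeleteFile(String referencedDataFile, // may be null
                          long dataSequenceNumber,
                          Partition partition) {}

class DeleteScope {
  // A deletion vector applies when it references the data file, the data file's
  // data sequence number is <= the DV's, and the partitions (spec and values) match.
  static boolean applies(DeletionVector dv, DataFile file) {
    return dv.referencedDataFile().equals(file.filePath())
        && file.dataSequenceNumber() <= dv.dataSequenceNumber()
        && Objects.equals(dv.partition(), file.partition());
  }

  // A position delete file applies under the same sequence-number and partition
  // rules, plus the referenced_data_file check when it is non-null, and only when
  // no deletion vector already applies to the data file.
  static boolean applies(PositionDeleteFile deletes, DataFile file, boolean dvAlreadyApplies) {
    boolean pathMatches = deletes.referencedDataFile() == null
        || deletes.referencedDataFile().equals(file.filePath());
    return !dvAlreadyApplies
        && pathMatches
        && file.dataSequenceNumber() <= deletes.dataSequenceNumber()
        && Objects.equals(deletes.partition(), file.partition());
  }
}
```

The `dvAlreadyApplies` flag encodes the new rule that a position delete file is ignored for a data file once a deletion vector applies to it, since such a vector must already contain all deletes from existing position delete files.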
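The cost argument in the comment can also be sketched. Superseding existing position deletes when writing a DV is a union of already-known row positions, which is metadata-only work; folding a concurrent equality delete into the DV would instead require re-reading data files to discover which positions match the delete predicate. A rough illustration, using `java.util.BitSet` as a stand-in for the bitmap a DV actually serializes:

```java
import java.util.BitSet;

// Illustration only: merging existing position deletes into a new DV is a bitmap
// union over positions the writer already knows, with no data file scan. There is
// no comparable shortcut for a concurrent equality delete, whose matching row
// positions are unknown until the data is re-read.
class DvMerge {
  static BitSet union(BitSet existingPositions, BitSet newlyDeletedPositions) {
    BitSet merged = (BitSet) existingPositions.clone();
    merged.or(newlyDeletedPositions); // pure metadata work
    return merged;
  }

  public static void main(String[] args) {
    BitSet fromOldPositionDeletes = new BitSet();
    fromOldPositionDeletes.set(3);
    fromOldPositionDeletes.set(17);

    BitSet fromThisMerge = new BitSet();
    fromThisMerge.set(42);

    // The MERGE writer can produce its DV without touching data files again.
    System.out.println(union(fromOldPositionDeletes, fromThisMerge)); // {3, 17, 42}
  }
}
```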