stevenzwu commented on code in PR #13189: URL: https://github.com/apache/iceberg/pull/13189#discussion_r2191086264
##########
format/spec.md:
##########
@@ -101,10 +101,10 @@ Inheriting the sequence number from manifest metadata allows writing a new manif
 Row-level deletes are stored in delete files.
 
-There are two ways to encode a row-level delete:
-
-* [_Position deletes_](#position-delete-files) mark a row deleted by data file path and the row position in the data file

Review Comment:

Ah, I see your point now. The `### Delete Formats` section already says there are three types of row-level deletes, which is inaccurate, and you are trying to match that here.

```
There are three types of row-level deletes:
```

In my mind, there are *two* types of row-level deletes (position and equality), but *three* delete file formats. The spec text above can probably be changed from `row-level deletes` to `delete file formats`. The section title `Delete Formats` is also not completely accurate, since the section also contains a `Delete file stats` child section.

Without changing the `Delete Formats` section (except for fixing the wording above), we could clarify this overview as follows:

```
There are two types of row-level deletes.

* _Position deletes_ mark a row deleted by data file path and the row position in the data file. Position deletes are encoded in a [_position delete file_](#position-delete-files) (V2 or below) or [_delete vector_](#deletion-vectors) (V3 or above).
* _Equality deletes_ mark a row deleted by one or more column values, like `id = 5`. Equality deletes are encoded in an [_equality delete file_](#equality-delete-files).

Like data files, delete files are tracked by partition. In general, a delete file must be applied to older data files with the same partition; see [Scan Planning](#scan-planning) for details. Column metrics can be used to determine whether a delete file's rows overlap the contents of a data file or a scan range.
```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org