aokolnychyi commented on code in PR #12058:
URL: https://github.com/apache/iceberg/pull/12058#discussion_r1926221855

##########
spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/data/vectorized/ColumnarBatchUtil.java:
##########
@@ -31,10 +31,13 @@ public class ColumnarBatchUtil {
   private ColumnarBatchUtil() {}
+  // spotless:off
   /**

Review Comment:
   I just realized the example actually doesn't fit anymore as we only compute the mapping, not the isDeleted array. Also, what if we use `<pre>` blocks to avoid disabling Spotless?

   ```
   /**
    * Builds a row ID mapping inside a batch to skip deleted rows.
    *
    * <pre>
    * Initial state
    * Data values: [v0, v1, v2, v3, v4, v5, v6, v7]
    * Row ID mapping: [0, 1, 2, 3, 4, 5, 6, 7]
    *
    * Apply position deletes
    * Position deletes: 2, 6
    * Row ID mapping: [0, 1, 3, 4, 5, 7, -, -] (6 live records)
    *
    * Apply equality deletes
    * Equality deletes: v1, v2, v3
    * Row ID mapping: [0, 4, 5, 7, -, -, -, -] (4 live records)
    * </pre>
    *
    * @param columnVectors the array of column vectors for the batch
    * @param deletes the delete filter containing delete information
    * @param rowStartPosInBatch the starting position of the row in the batch
    * @param batchSize the size of the batch
    * @return the mapping array and the number of live rows, or {@code null} if nothing is deleted
    */
   ```

   I think this example would be more descriptive as it is clear what the data values are and which records are removed by equality deletes.
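
For anyone following the thread, the two steps in the proposed Javadoc example can be reproduced with a small standalone sketch. This is only an illustration under assumptions and is not the `ColumnarBatchUtil` code from the PR: the class `RowIdMappingSketch` and the helpers `initialMapping` and `compact` are hypothetical names, deletes are modeled as plain Java sets rather than Iceberg's delete filter, and position deletes are treated as offsets within the batch (that is, `rowStartPosInBatch` is assumed to be 0).

```java
import java.util.Arrays;
import java.util.Set;
import java.util.function.IntPredicate;

// Hypothetical illustration only; not the ColumnarBatchUtil code from the PR.
final class RowIdMappingSketch {

  private RowIdMappingSketch() {}

  // Identity mapping: slot i initially points at row i.
  static int[] initialMapping(int batchSize) {
    int[] mapping = new int[batchSize];
    for (int i = 0; i < batchSize; i++) {
      mapping[i] = i;
    }
    return mapping;
  }

  // Compacts the first liveCount entries in place, keeping only the row IDs
  // for which isDeleted returns false. Returns the new live count; entries at
  // or beyond that index are stale (shown as "-" in the Javadoc example).
  static int compact(int[] mapping, int liveCount, IntPredicate isDeleted) {
    int newLiveCount = 0;
    for (int i = 0; i < liveCount; i++) {
      int rowId = mapping[i];
      if (!isDeleted.test(rowId)) {
        mapping[newLiveCount++] = rowId;
      }
    }
    return newLiveCount;
  }

  public static void main(String[] args) {
    String[] values = {"v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7"};
    Set<Integer> positionDeletes = Set.of(2, 6); // assumed batch-relative positions
    Set<String> equalityDeletes = Set.of("v1", "v2", "v3");

    int[] mapping = initialMapping(values.length);

    // Step 1: drop rows removed by position deletes.
    int live = compact(mapping, values.length, rowId -> positionDeletes.contains(rowId));
    System.out.println(Arrays.toString(Arrays.copyOf(mapping, live)) + " (" + live + " live)");

    // Step 2: drop surviving rows whose value matches an equality delete.
    live = compact(mapping, live, rowId -> equalityDeletes.contains(values[rowId]));
    System.out.println(Arrays.toString(Arrays.copyOf(mapping, live)) + " (" + live + " live)");
  }
}
```

Running `main` prints `[0, 1, 3, 4, 5, 7] (6 live)` followed by `[0, 4, 5, 7] (4 live)`, matching the live prefixes in the Javadoc example. Note that `v2` is listed in the equality deletes but was already removed by the position delete at index 2, so the second step only drops rows 1 and 3.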