Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/24143 )
Change subject: IMPALA-14755: (part 2) Implement Iceberg deletion vector reading/writing ...................................................................... Patch Set 8: (8 comments) Left a few small comments, otherwise looks great! http://gerrit.cloudera.org:8080/#/c/24143/8/be/src/exec/puffin/puffin-writer.cc File be/src/exec/puffin/puffin-writer.cc: http://gerrit.cloudera.org:8080/#/c/24143/8/be/src/exec/puffin/puffin-writer.cc@259 PS8, Line 259: if (UNLIKELY(total_blob_size <= 0)) { : return Status("Unexpected zero-size or negative size blob allocation"); : } Since bitmap is not empty this could be a DCHECK. http://gerrit.cloudera.org:8080/#/c/24143/8/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/24143/8/be/src/service/client-request-state.cc@1843 PS8, Line 1843: cat_ice_op->iceberg_data_files_fb = dml->CreateIcebergDataFilesVector(); : cat_ice_op->__isset.iceberg_data_files_fb = true; This could be: cat_ice_op->__set_iceberg_data_files_fb(dml->CreateIcebergDataFilesVector()); Other than being shorter, it's also less error-prone. Same goes to the lines below. http://gerrit.cloudera.org:8080/#/c/24143/7/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java: http://gerrit.cloudera.org:8080/#/c/24143/7/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@320 PS7, Line 320: dataFileToDV_ > You mean moving away from TIcebergDeletionVector in the FE reading part? I Yeah I mean if we convert it to FbIcebergDeletionVector anyway which contains all useful information, then we could just cache the bytebuffer of FbIcebergDeletionVector. But it could be implemented in a different ticket. http://gerrit.cloudera.org:8080/#/c/24143/8/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java: http://gerrit.cloudera.org:8080/#/c/24143/8/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@423 PS8, Line 423: new IcebergDeleteJoinNode(dataScanNode, deleteScanNode, positionJoinConjuncts); IcebergDeleteJoinNode's cardinality estimation should take the DVs' record counts into account. It currently doesn't happen when there are no position delete files, only DVs. http://gerrit.cloudera.org:8080/#/c/24143/8/fe/src/main/java/org/apache/impala/util/IcebergUtil.java File fe/src/main/java/org/apache/impala/util/IcebergUtil.java: http://gerrit.cloudera.org:8080/#/c/24143/8/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1363 PS8, Line 1363: 1 We should use a larger initial capacity, or use the default constructor. In the end we create a shrinked buffer anyway. Same goes to L1151. http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-existing-dv.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-existing-dv.test: http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-existing-dv.test@86 PS8, Line 86: ==== Can we compact this table? http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-v2-equality-upgrade.test File testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-v2-equality-upgrade.test: http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-v2-equality-upgrade.test@54 PS8, Line 54: SELECT content, file_format, referenced_data_file IS NOT NULL, content_offset IS NOT NULL You could check 'record_count' as well. http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-v2-equality-upgrade.test@67 PS8, Line 67: ==== Can we compact this table? -- To view, visit http://gerrit.cloudera.org:8080/24143 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5613c31a7aa46b94b7c70386c939c08cc68632cd Gerrit-Change-Number: 24143 Gerrit-PatchSet: 8 Gerrit-Owner: Peter Rozsa <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Tue, 14 Apr 2026 15:28:20 +0000 Gerrit-HasComments: Yes
