Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/24143 )

Change subject: IMPALA-14755: (part 2) Implement Iceberg deletion vector 
reading/writing
......................................................................


Patch Set 8:

(8 comments)

Left a few small comments, otherwise looks great!

http://gerrit.cloudera.org:8080/#/c/24143/8/be/src/exec/puffin/puffin-writer.cc
File be/src/exec/puffin/puffin-writer.cc:

http://gerrit.cloudera.org:8080/#/c/24143/8/be/src/exec/puffin/puffin-writer.cc@259
PS8, Line 259:   if (UNLIKELY(total_blob_size <= 0)) {
             :     return Status("Unexpected zero-size or negative size blob 
allocation");
             :   }
Since bitmap is not empty this could be a DCHECK.


http://gerrit.cloudera.org:8080/#/c/24143/8/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/24143/8/be/src/service/client-request-state.cc@1843
PS8, Line 1843:       cat_ice_op->iceberg_data_files_fb = 
dml->CreateIcebergDataFilesVector();
              :       cat_ice_op->__isset.iceberg_data_files_fb = true;
This could be:

 cat_ice_op->__set_iceberg_data_files_fb(dml->CreateIcebergDataFilesVector());

Other than being shorter, it's also less error-prone. Same goes to the lines 
below.


http://gerrit.cloudera.org:8080/#/c/24143/7/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/24143/7/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@320
PS7, Line 320: dataFileToDV_
> You mean moving away from TIcebergDeletionVector in the FE reading part? I
Yeah I mean if we convert it to FbIcebergDeletionVector anyway which contains 
all useful information, then we could just cache the bytebuffer of 
FbIcebergDeletionVector. But it could be implemented in a different ticket.


http://gerrit.cloudera.org:8080/#/c/24143/8/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/24143/8/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@423
PS8, Line 423:           new IcebergDeleteJoinNode(dataScanNode, 
deleteScanNode, positionJoinConjuncts);
IcebergDeleteJoinNode's cardinality estimation should take the DVs' record 
counts into account. It currently doesn't happen when there are no position 
delete files, only DVs.


http://gerrit.cloudera.org:8080/#/c/24143/8/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/24143/8/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@1363
PS8, Line 1363: 1
We should use a larger initial capacity, or use the default constructor. In the 
end we create a shrinked buffer anyway. Same goes to L1151.


http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-existing-dv.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-existing-dv.test:

http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-existing-dv.test@86
PS8, Line 86: ====
Can we compact this table?


http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-v2-equality-upgrade.test
File 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-v2-equality-upgrade.test:

http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-v2-equality-upgrade.test@54
PS8, Line 54: SELECT content, file_format, referenced_data_file IS NOT NULL, 
content_offset IS NOT NULL
You could check 'record_count' as well.


http://gerrit.cloudera.org:8080/#/c/24143/8/testdata/workloads/functional-query/queries/QueryTest/iceberg-v3-delete-v2-equality-upgrade.test@67
PS8, Line 67: ====
Can we compact this table?



--
To view, visit http://gerrit.cloudera.org:8080/24143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5613c31a7aa46b94b7c70386c939c08cc68632cd
Gerrit-Change-Number: 24143
Gerrit-PatchSet: 8
Gerrit-Owner: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Peter Rozsa <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 14 Apr 2026 15:28:20 +0000
Gerrit-HasComments: Yes

Reply via email to