Noemi Pap-Takacs has posted comments on this change. ( http://gerrit.cloudera.org:8080/23889 )
Change subject: IMPALA-12027: Support additional details for DataSink nodes in ExecSummary ...................................................................... Patch Set 5: (9 comments) http://gerrit.cloudera.org:8080/#/c/23889/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/23889/5//COMMIT_MSG@13 PS5, Line 13: is 'was' Please be careful with the tenses. When someone reads the commit message, this will already be merged. http://gerrit.cloudera.org:8080/#/c/23889/4/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java: http://gerrit.cloudera.org:8080/#/c/23889/4/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@104 PS4, Line 104: protected String getLabelDetail() { > Actually, it feels a bit redundant to add one more end-to-end test just for We would like to test the presence of table name in the label details. None of those test check for it, so it would be great to add an automated test case that checks this so that we can catch behavioral changes in the future. http://gerrit.cloudera.org:8080/#/c/23889/5/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java File fe/src/main/java/org/apache/impala/planner/MultiDataSink.java: http://gerrit.cloudera.org:8080/#/c/23889/5/fe/src/main/java/org/apache/impala/planner/MultiDataSink.java@105 PS5, Line 105: Set<String> table_sinks_set = new HashSet<String>(); : for (DataSink dsink : dataSinks_) { : if (dsink.getSinkType() == TDataSinkType.TABLE_SINK) { : table_sinks_set.add(((TableSink)dsink).getLabelDetail()); : } : } : String minString = table_sinks_set.iterator().next(); : for (String dsink_name : table_sinks_set) { : if (dsink_name.length() < minString.length()) { : minString = dsink_name; : } : } : return minString; Csaba is right that it is also an assumption that the shortest string will be the real table name. We could simply output the first table name. AFAIK the first data sink is the insertSink in both inheritors - please make sure it is. Writing a comment about this logic and testing it ensures that the output remains correct even if the implementation changes. http://gerrit.cloudera.org:8080/#/c/23889/5/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/23889/5/tests/query_test/test_observability.py@288 PS5, Line 288: is populated in runtime profile correctly in every : query state "contains the correct table details" http://gerrit.cloudera.org:8080/#/c/23889/5/tests/query_test/test_observability.py@292 PS5, Line 292: _async Doesn't have to be async http://gerrit.cloudera.org:8080/#/c/23889/5/tests/query_test/test_observability.py@294 PS5, Line 294: # After completion of the admission control phase, the coordinator would have started : # and we should get a populated exec_summary. : self.client.wait_for_admission_control(handle) No need to test this stage. http://gerrit.cloudera.org:8080/#/c/23889/5/tests/query_test/test_observability.py@297 PS5, Line 297: get_runtime_profile(handle) get_exec_summary(handle) Getting the ExecSummary is enough, we don't need to parse the entire profile. http://gerrit.cloudera.org:8080/#/c/23889/5/tests/query_test/test_observability.py@299 PS5, Line 299: assert "ExecSummary:" in profile, profile : : exec_summary = re.search(r"ExecSummary:(.*?)Errors", profile, re.DOTALL) : if exec_summary: : exec_summary = exec_summary.group(1).strip() : else: : assert 1 == 0, "ExecSummary does not exist in profile" This is unnecessary, since it is not what we want to test here. http://gerrit.cloudera.org:8080/#/c/23889/5/tests/query_test/test_observability.py@314 PS5, Line 314: Please add test cases for MultiDataSink e.g. UPDATE and MERGE operations on Iceberg table. To be able to do this in this test case, you can create an Iceberg table in line 290 instead of a Hive table by using the `stored as Iceberg` syntax and setting the `format-version` table property to `2`. -- To view, visit http://gerrit.cloudera.org:8080/23889 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I2652dd896f72c5c6bbe7e76facdede2a237808d5 Gerrit-Change-Number: 23889 Gerrit-PatchSet: 5 Gerrit-Owner: Surya Hebbar <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Surya Hebbar <[email protected]> Gerrit-Comment-Date: Wed, 08 Apr 2026 13:00:37 +0000 Gerrit-HasComments: Yes
