Copilot commented on code in PR #60228:
URL: https://github.com/apache/doris/pull/60228#discussion_r2726300456
##########
gensrc/thrift/PaloInternalService.thrift:
##########
@@ -423,7 +423,7 @@ struct TQueryOptions {
// runtime profiling to choose the most efficient algorithm for the data
pattern
183: optional bool enable_use_hybrid_sort = false;
184: optional i32 cte_max_recursion_depth;
-
+ 185: optional bool enable_aggregate_function_null_v2 = false;
Review Comment:
The PR description is missing critical information. It should include:
1. What problem does aggregate_function_null_v2 solve?
2. What are the differences between V1 and V2?
3. Why is this change needed?
4. What testing has been done, especially for multi-argument aggregate
functions?
The Issue Number field shows "close #xxx" which is a placeholder and should
reference an actual issue.
##########
be/src/pipeline/exec/aggregation_source_operator.cpp:
##########
@@ -500,9 +500,6 @@ Status
AggLocalState::merge_with_serialized_key_helper(vectorized::Block* block)
for (int i = 0; i < Base::_shared_state->aggregate_evaluators.size(); ++i)
{
auto col_id = Base::_shared_state->probe_expr_ctxs.size() + i;
auto column = block->get_by_position(col_id).column;
- if (column->is_nullable()) {
- column =
((vectorized::ColumnNullable*)column.get())->get_nested_column_ptr();
- }
Review Comment:
The removal of nullable column unwrapping logic is incompatible with the
incomplete V2 implementation. When enable_aggregate_function_null_v2 is true
and multi-argument aggregate functions are used, they will receive nullable
columns but the V1 variadic wrapper (still being used due to the bug in
helpers.h) expects non-nullable columns. This change should not have been made
until the V2 variadic implementation is complete.
##########
fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java:
##########
@@ -1461,6 +1462,9 @@ public enum IgnoreSplitType {
needForward = true)
private boolean enableParallelScan = true;
+ @VariableMgr.VarAttr(name = ENABLE_AGGREGATE_FUNCTION_NULL_V2, fuzzy =
true, needForward = true)
+ private boolean enableAggregateFunctionNullV2 = true;
Review Comment:
This PR lacks test coverage for the new aggregate_function_null_v2 feature.
Given the critical bugs identified (incomplete V2 implementation for
multi-argument functions and removed unwrapping logic), comprehensive tests are
essential. Tests should cover:
1. Single-argument aggregate functions with nullable inputs (e.g., SUM, AVG,
COUNT with nullable columns)
2. Multi-argument aggregate functions with nullable inputs (e.g., COVAR,
map_agg, min_max_by)
3. Window functions with nullable inputs
4. Both V1 and V2 implementations to ensure compatibility
5. Serialization and deserialization with nullable columns
The existing test infrastructure in be/test/vec/aggregate_functions/ should
be used to add tests for this feature.
##########
be/src/vec/aggregate_functions/helpers.h:
##########
@@ -85,6 +87,11 @@ struct creator_without_type {
using NullableT = std::conditional_t<multi_arguments,
AggregateFunctionNullVariadicInline<T, f>,
AggregateFunctionNullUnaryInline<T,
f>>;
+ template <bool multi_arguments, bool f, typename T>
+ using NullableV2T =
+ std::conditional_t<multi_arguments,
AggregateFunctionNullVariadicInline<T, f>,
Review Comment:
The V2 implementation is incomplete. The NullableV2T template alias uses
AggregateFunctionNullVariadicInline (the old V1 implementation) for
multi-argument functions instead of a V2 variadic implementation. This means
when enable_aggregate_function_null_v2 is true, multi-argument aggregate
functions will still use the V1 variadic wrapper, which expects columns to be
unwrapped (non-nullable), but the operator code changes removed the unwrapping
logic. This will cause multi-argument aggregate functions to fail or behave
incorrectly.
A AggregateFunctionNullVariadicInlineV2 class needs to be implemented in
aggregate_function_null_v2.h, similar to AggregateFunctionNullUnaryInlineV2.
```suggestion
std::conditional_t<multi_arguments,
AggregateFunctionNullVariadicInlineV2<T, f>,
```
##########
fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java:
##########
@@ -1461,6 +1462,9 @@ public enum IgnoreSplitType {
needForward = true)
private boolean enableParallelScan = true;
+ @VariableMgr.VarAttr(name = ENABLE_AGGREGATE_FUNCTION_NULL_V2, fuzzy =
true, needForward = true)
+ private boolean enableAggregateFunctionNullV2 = true;
Review Comment:
There is a default value mismatch. The Thrift definition sets
enable_aggregate_function_null_v2 to false by default (line 426 in
PaloInternalService.thrift), but the SessionVariable sets it to true by
default. This inconsistency can lead to unexpected behavior where the FE thinks
the feature is enabled but BE queries might not have it set. The defaults
should be aligned.
```suggestion
private boolean enableAggregateFunctionNullV2 = false;
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]