Copilot commented on code in PR #60228:
URL: https://github.com/apache/doris/pull/60228#discussion_r2726300456


##########
gensrc/thrift/PaloInternalService.thrift:
##########
@@ -423,7 +423,7 @@ struct TQueryOptions {
   // runtime profiling to choose the most efficient algorithm for the data 
pattern
   183: optional bool enable_use_hybrid_sort = false;
   184: optional i32 cte_max_recursion_depth;
-
+  185: optional bool enable_aggregate_function_null_v2 = false;

Review Comment:
   The PR description is missing critical information. It should include:
   1. What problem does aggregate_function_null_v2 solve?
   2. What are the differences between V1 and V2?
   3. Why is this change needed?
   4. What testing has been done, especially for multi-argument aggregate 
functions?
   
   The Issue Number field shows "close #xxx" which is a placeholder and should 
reference an actual issue.



##########
be/src/pipeline/exec/aggregation_source_operator.cpp:
##########
@@ -500,9 +500,6 @@ Status 
AggLocalState::merge_with_serialized_key_helper(vectorized::Block* block)
     for (int i = 0; i < Base::_shared_state->aggregate_evaluators.size(); ++i) 
{
         auto col_id = Base::_shared_state->probe_expr_ctxs.size() + i;
         auto column = block->get_by_position(col_id).column;
-        if (column->is_nullable()) {
-            column = 
((vectorized::ColumnNullable*)column.get())->get_nested_column_ptr();
-        }
 

Review Comment:
   The removal of nullable column unwrapping logic is incompatible with the 
incomplete V2 implementation. When enable_aggregate_function_null_v2 is true 
and multi-argument aggregate functions are used, they will receive nullable 
columns but the V1 variadic wrapper (still being used due to the bug in 
helpers.h) expects non-nullable columns. This change should not have been made 
until the V2 variadic implementation is complete.



##########
fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java:
##########
@@ -1461,6 +1462,9 @@ public enum IgnoreSplitType {
             needForward = true)
     private boolean enableParallelScan = true;
 
+    @VariableMgr.VarAttr(name = ENABLE_AGGREGATE_FUNCTION_NULL_V2, fuzzy = 
true, needForward = true)
+    private boolean enableAggregateFunctionNullV2 = true;

Review Comment:
   This PR lacks test coverage for the new aggregate_function_null_v2 feature. 
Given the critical bugs identified (incomplete V2 implementation for 
multi-argument functions and removed unwrapping logic), comprehensive tests are 
essential. Tests should cover:
   1. Single-argument aggregate functions with nullable inputs (e.g., SUM, AVG, 
COUNT with nullable columns)
   2. Multi-argument aggregate functions with nullable inputs (e.g., COVAR, 
map_agg, min_max_by)
   3. Window functions with nullable inputs
   4. Both V1 and V2 implementations to ensure compatibility
   5. Serialization and deserialization with nullable columns
   
   The existing test infrastructure in be/test/vec/aggregate_functions/ should 
be used to add tests for this feature.



##########
be/src/vec/aggregate_functions/helpers.h:
##########
@@ -85,6 +87,11 @@ struct creator_without_type {
     using NullableT = std::conditional_t<multi_arguments, 
AggregateFunctionNullVariadicInline<T, f>,
                                          AggregateFunctionNullUnaryInline<T, 
f>>;
 
+    template <bool multi_arguments, bool f, typename T>
+    using NullableV2T =
+            std::conditional_t<multi_arguments, 
AggregateFunctionNullVariadicInline<T, f>,

Review Comment:
   The V2 implementation is incomplete. The NullableV2T template alias uses 
AggregateFunctionNullVariadicInline (the old V1 implementation) for 
multi-argument functions instead of a V2 variadic implementation. This means 
when enable_aggregate_function_null_v2 is true, multi-argument aggregate 
functions will still use the V1 variadic wrapper, which expects columns to be 
unwrapped (non-nullable), but the operator code changes removed the unwrapping 
logic. This will cause multi-argument aggregate functions to fail or behave 
incorrectly.
   
   A AggregateFunctionNullVariadicInlineV2 class needs to be implemented in 
aggregate_function_null_v2.h, similar to AggregateFunctionNullUnaryInlineV2.
   ```suggestion
               std::conditional_t<multi_arguments, 
AggregateFunctionNullVariadicInlineV2<T, f>,
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/qe/SessionVariable.java:
##########
@@ -1461,6 +1462,9 @@ public enum IgnoreSplitType {
             needForward = true)
     private boolean enableParallelScan = true;
 
+    @VariableMgr.VarAttr(name = ENABLE_AGGREGATE_FUNCTION_NULL_V2, fuzzy = 
true, needForward = true)
+    private boolean enableAggregateFunctionNullV2 = true;

Review Comment:
   There is a default value mismatch. The Thrift definition sets 
enable_aggregate_function_null_v2 to false by default (line 426 in 
PaloInternalService.thrift), but the SessionVariable sets it to true by 
default. This inconsistency can lead to unexpected behavior where the FE thinks 
the feature is enabled but BE queries might not have it set. The defaults 
should be aligned.
   ```suggestion
       private boolean enableAggregateFunctionNullV2 = false;
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to