github-actions[bot] commented on code in PR #61232:
URL: https://github.com/apache/doris/pull/61232#discussion_r3014272141


##########
be/src/exprs/aggregate/aggregate_function_window.cpp:
##########
@@ -89,9 +98,45 @@ void register_aggregate_function_window_lead_lag_first_last(
         AggregateFunctionSimpleFactory& factory) {
     factory.register_function_both("lead", 
create_aggregate_function_window_lead);
     factory.register_function_both("lag", 
create_aggregate_function_window_lag);
-    factory.register_function_both("first_value", 
create_aggregate_function_window_first);
-    factory.register_function_both("last_value", 
create_aggregate_function_window_last);
-    factory.register_function_both("nth_value", 
create_aggregate_function_window_nth_value);
+    // FE rewrites first_value(k1, false) → first_value(k1), so 
argument_types.size() == 2
+    // means arg_ignore_null = true. Dispatch at registration to avoid runtime 
branching
+    // that would double template instantiations.
+    factory.register_function_both(
+            "first_value",
+            [](const std::string& name, const DataTypes& argument_types,
+               const DataTypePtr& result_type, const bool result_is_nullable,
+               const AggregateFunctionAttr& attr) -> AggregateFunctionPtr {
+                if (argument_types.size() == 2) {
+                    return create_aggregate_function_window_first_ignore_null(
+                            name, argument_types, result_type, 
result_is_nullable, attr);
+                }
+                return create_aggregate_function_window_first(name, 
argument_types, result_type,
+                                                              
result_is_nullable, attr);
+            });
+    factory.register_function_both(
+            "last_value",
+            [](const std::string& name, const DataTypes& argument_types,
+               const DataTypePtr& result_type, const bool result_is_nullable,
+               const AggregateFunctionAttr& attr) -> AggregateFunctionPtr {
+                if (argument_types.size() == 2) {
+                    return create_aggregate_function_window_last_ignore_null(
+                            name, argument_types, result_type, 
result_is_nullable, attr);
+                }
+                return create_aggregate_function_window_last(name, 
argument_types, result_type,
+                                                             
result_is_nullable, attr);
+            });
+    factory.register_function_both(
+            "nth_value",
+            [](const std::string& name, const DataTypes& argument_types,
+               const DataTypePtr& result_type, const bool result_is_nullable,
+               const AggregateFunctionAttr& attr) -> AggregateFunctionPtr {
+                if (argument_types.size() == 2) {
+                    return 
create_aggregate_function_window_nth_value_ignore_null(

Review Comment:
   **Bug (pre-existing, now more visible):** `nth_value` always takes 2 
arguments from FE (`column, N`), so `argument_types.size() == 2` is **always 
true**. This means `create_aggregate_function_window_nth_value` (the 
non-ignore-null variant) is unreachable dead code, and every `nth_value` call 
goes through the `_ignore_null` path.
   
   This is harmless today because `WindowFunctionNthValueImpl`'s second 
template parameter (`bool = false`) is unnamed and unused in the impl body - 
both variants compile to identical code. But it's logically wrong and 
misleading: the dispatch suggests nth_value supports ignore-null, but neither 
the FE nor the BE impl actually implements it.
   
   This was also broken in the old code (the old 
`create_function_lead_lag_first_last` had the same `argument_types.size() == 2` 
check). Since this PR is refactoring the dispatch, it would be good to clean 
this up:
   
   ```cpp
   // nth_value always has 2 args (column, N) from FE.
   // WindowFunctionNthValueImpl does not implement ignore-null logic,
   // so always use false.
   factory.register_function_both("nth_value",
       create_aggregate_function_window_nth_value);
   ```
   
   And remove `create_aggregate_function_window_nth_value_ignore_null` 
entirely. This eliminates dead code and avoids doubling nth_value 
instantiations unnecessarily.



##########
be/src/exprs/aggregate/aggregate_function_window_nth_value.cpp:
##########
@@ -20,8 +20,11 @@
 namespace doris {
 #include "common/compile_check_begin.h"
 
-CREATE_WINDOW_FUNCTION_WITH_NAME_AND_DATA(create_aggregate_function_window_nth_value,
 NthValueData,
-                                          WindowFunctionNthValueImpl);
+CREATE_WINDOW_FUNCTION_DIRECT(create_aggregate_function_window_nth_value, 
NthValueData,
+                              WindowFunctionNthValueImpl, false);
+
+CREATE_WINDOW_FUNCTION_DIRECT(create_aggregate_function_window_nth_value_ignore_null,
 NthValueData,
+                              WindowFunctionNthValueImpl, true);

Review Comment:
   This `_ignore_null` variant instantiates `WindowFunctionNthValueImpl<..., 
true>`, but `WindowFunctionNthValueImpl` declares its second parameter as `bool 
= false` (unnamed) and never uses it. The `true` vs `false` produces two 
distinct template instantiations with identical behavior - pure code bloat. 
This contradicts the PR's goal of reducing instantiations.
   
   Suggest removing this `_ignore_null` variant and the corresponding 
registration lambda in `aggregate_function_window.cpp`, replacing with a direct 
`factory.register_function_both("nth_value", 
create_aggregate_function_window_nth_value);`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to