github-actions[bot] commented on code in PR #64436:
URL: https://github.com/apache/doris/pull/64436#discussion_r3418604268


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AccessPathExpressionCollector.java:
##########
@@ -618,6 +618,39 @@ private Void collectArrayPathInLambda(Lambda lambda, CollectorContext context) {
         } finally {
             nameToLambdaArguments.pop();
         }
+
+        // After visiting the lambda body, for any bound array whose lambda variable
+        // was NOT referenced in the body (e.g. x -> true where x never appears),
+        // visitArrayItemSlot was never called and the array column's access path is
+        // missing. This gap is exposed when an is-null or offset-only path has been
+        // registered for the same slot — NestedColumnPruning then incorrectly prunes
+        // the complex column to null-only / offset-only instead of reading full data.
+        //
+        // Detect usage by scanning the lambda body for ArrayItemSlots matching the
+        // argument name, which is more reliable than getInputSlots() that deliberately
+        // excludes ArrayItemSlot and may falsely match outer slots.
+        //
+        // Must use a fresh context: when the body DOES reference some variables
+        // (e.g. (x,y) -> x > 0), visitArrayItemSlot mutates context.accessPathBuilder
+        // in-place (addPrefix without cleanup). A fresh context isolates the fallback
+        // path for unreferenced variables from pollution by referenced ones.
+        for (Expression argument : arguments) {
+            if (argument instanceof ArrayItemReference) {
+                String argName = ((ArrayItemReference) argument).getName();
+                boolean isReferenced = arguments.get(0)
+                        .<ArrayItemSlot>collect(e -> e instanceof ArrayItemSlot)
+                        .stream()
+                        .anyMatch(slot -> slot.getName().equals(argName));

Review Comment:
   The check at the current head is still not scoped to the lambda argument
   being tested. It now scans `ArrayItemSlot`s, but it matches only by name and
   descends into nested lambdas, where the analyzer creates distinct
   `ArrayItemReference` expr ids and allows an inner lambda to shadow the same
   name. For example, in
   `SELECT id, a IS NULL, array_count(x -> array_count(x -> x > 0, b) > 0, a)`
   the outer `x` is bound to `a` and is never referenced, while the inner `x`
   is bound to `b`. The scan sees the inner `x`, sets `isReferenced` for the
   outer argument, and skips the fallback. If `a IS NULL` has already
   registered `[a, NULL]`, the original null-only pruning hole remains. Please
   key this check to the current argument's identity, or track which
   current-frame arguments `visitArrayItemSlot` actually resolved, instead of
   matching names across nested scopes.
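
To make the shadowing problem concrete, here is a minimal, self-contained sketch of the difference between a name-only scan and an identity-based scan. It does not use the Doris classes: `LambdaScopeSketch`, `Node`, `binderId`, and both `referencedBy...` helpers are hypothetical stand-ins, and `binderId` only plays the role that the expr id plays in the analyzer. Treat it as an illustration of the idea, not as the proposed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical miniature expression tree; none of these types are Doris classes. */
final class LambdaScopeSketch {

    /** Tree node. A slot reference carries a name and the id of the argument that binds it. */
    static final class Node {
        final String name;   // variable name for slot references, null for other nodes
        final int binderId;  // id of the lambda argument this slot resolves to, -1 otherwise
        final List<Node> children;

        Node(String name, int binderId, Node... children) {
            this.name = name;
            this.binderId = binderId;
            this.children = new ArrayList<>(Arrays.asList(children));
        }

        static Node slot(String name, int binderId) {
            return new Node(name, binderId);
        }

        static Node op(Node... children) {
            return new Node(null, -1, children);
        }
    }

    /** Name-only scan: descends into nested lambdas, so a shadowing inner variable matches. */
    static boolean referencedByName(Node body, String argName) {
        if (argName.equals(body.name)) {
            return true;
        }
        for (Node child : body.children) {
            if (referencedByName(child, argName)) {
                return true;
            }
        }
        return false;
    }

    /** Identity scan: only slots that resolve to this exact binder count as references. */
    static boolean referencedById(Node body, int argId) {
        if (body.name != null && body.binderId == argId) {
            return true;
        }
        for (Node child : body.children) {
            if (referencedById(child, argId)) {
                return true;
            }
        }
        return false;
    }

    /** Body of the outer lambda in: x#1 -> array_count(x#2 -> x#2 > 0, b) > 0 */
    static Node outerBodyWithShadowing() {
        Node innerBody = Node.op(Node.slot("x", 2)); // x#2 > 0
        Node innerLambda = Node.op(innerBody);       // x#2 -> x#2 > 0
        return Node.op(innerLambda);                 // array_count(...) > 0
    }

    public static void main(String[] args) {
        Node body = outerBodyWithShadowing();
        System.out.println(referencedByName(body, "x")); // true: inner x is mistaken for outer x
        System.out.println(referencedById(body, 1));     // false: outer x#1 is never referenced
    }
}
```

In this toy model the name-only scan reports the outer `x` as referenced, which is the case where the fallback would be skipped; the identity scan reports it as unreferenced, so the fallback would still run for `a`.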



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

