github-actions[bot] commented on code in PR #64436:
URL: https://github.com/apache/doris/pull/64436#discussion_r3418604268
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AccessPathExpressionCollector.java:
##########
@@ -618,6 +618,39 @@ private Void collectArrayPathInLambda(Lambda lambda, CollectorContext context) {
} finally {
nameToLambdaArguments.pop();
}
+
+ // After visiting the lambda body, for any bound array whose lambda variable
+ // was NOT referenced in the body (e.g. x -> true where x never appears),
+ // visitArrayItemSlot was never called and the array column's access path is
+ // missing. This gap is exposed when an is-null or offset-only path has been
+ // registered for the same slot — NestedColumnPruning then incorrectly prunes
+ // the complex column to null-only / offset-only instead of reading full data.
+ //
+ // Detect usage by scanning the lambda body for ArrayItemSlots matching the
+ // argument name, which is more reliable than getInputSlots() that deliberately
+ // excludes ArrayItemSlot and may falsely match outer slots.
+ //
+ // Must use a fresh context: when the body DOES reference some variables
+ // (e.g. (x,y) -> x > 0), visitArrayItemSlot mutates context.accessPathBuilder
+ // in-place (addPrefix without cleanup). A fresh context isolates the fallback
+ // path for unreferenced variables from pollution by referenced ones.
+ for (Expression argument : arguments) {
+ if (argument instanceof ArrayItemReference) {
+ String argName = ((ArrayItemReference) argument).getName();
+ boolean isReferenced = arguments.get(0)
+ .<ArrayItemSlot>collect(e -> e instanceof ArrayItemSlot)
+ .stream()
+ .anyMatch(slot -> slot.getName().equals(argName));
Review Comment:
At the current head, this check is still not scoped to the lambda argument
being tested. It now scans `ArrayItemSlot`s, but it matches them by name only
and descends into nested lambdas. The analyzer gives each `ArrayItemReference`
a distinct expr id and lets an inner lambda shadow an outer argument of the
same name. For example, in
`SELECT id, a IS NULL, array_count(x -> array_count(x -> x > 0, b) > 0, a)`
the outer `x` is bound to `a` and is never referenced, while the inner `x` is
bound to `b`. The scan finds the inner `x`, sets `isReferenced` for the outer
argument, and skips the fallback, so if `a IS NULL` has already registered
`[a, NULL]`, the original null-only pruning hole remains. Please key this check
to the identity of the current argument, or track which current-frame arguments
`visitArrayItemSlot` actually resolved, instead of matching names across nested
scopes.
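
To make the shadowing case concrete, here is a minimal standalone sketch of
the difference between a name-keyed and an identity-keyed check. It does not
use Doris classes: `Node`, `Arg`, `SlotRef`, `LambdaNode`, and both
`referencedBy...` helpers are hypothetical stand-ins, and it assumes each slot
in a lambda body carries the id of the argument it was bound to.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the nested-lambda case; none of these types are Doris classes.
class ShadowingSketch {
    /** Generic expression node with child expressions. */
    static class Node {
        final List<Node> children;

        Node(Node... kids) {
            this.children = Arrays.asList(kids);
        }
    }

    /** Lambda argument: a user-visible name plus a unique id (assumed analogue of an expr id). */
    static final class Arg {
        final String name;
        final int id;

        Arg(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    /** Reference to a lambda argument inside a lambda body. */
    static final class SlotRef extends Node {
        final Arg target;

        SlotRef(Arg target) {
            this.target = target;
        }
    }

    /** Lambda with one argument; its only child is the body. */
    static final class LambdaNode extends Node {
        final Arg arg;

        LambdaNode(Arg arg, Node body) {
            super(body);
            this.arg = arg;
        }
    }

    /** Name-keyed scan: descends into nested lambdas and matches shadowing arguments too. */
    static boolean referencedByName(Node node, String name) {
        if (node instanceof SlotRef && ((SlotRef) node).target.name.equals(name)) {
            return true;
        }
        for (Node child : node.children) {
            if (referencedByName(child, name)) {
                return true;
            }
        }
        return false;
    }

    /** Identity-keyed scan: only slots bound to this exact argument count. */
    static boolean referencedById(Node node, int id) {
        if (node instanceof SlotRef && ((SlotRef) node).target.id == id) {
            return true;
        }
        for (Node child : node.children) {
            if (referencedById(child, id)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Models array_count(x -> array_count(x -> x > 0, b) > 0, a).
        Arg outerX = new Arg("x", 1); // bound to a, never referenced
        Arg innerX = new Arg("x", 2); // bound to b, referenced in the inner body
        Node innerBody = new Node(new SlotRef(innerX));
        Node outerBody = new Node(new LambdaNode(innerX, innerBody));

        System.out.println(referencedByName(outerBody, outerX.name)); // true: false positive
        System.out.println(referencedById(outerBody, outerX.id));     // false: outer x unused
    }
}
```

In this model the name-keyed scan reports the outer `x` as referenced, which
is the condition that would skip the fallback, while the identity-keyed scan
does not. Whether `ArrayItemSlot` exposes a usable identity in the real code
is for the patch author to confirm.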