Re: [I] CaseWhen does not work with custom implemented column expression [datafusion]

via GitHub Mon, 04 May 2026 23:49:20 -0700


gstvg commented on issue #21231:
URL: https://github.com/apache/datafusion/issues/21231#issuecomment-4377092569


   In #21323, during physical planning, lambda scope schemas are created by
   *appending* the lambda parameters to the outer schema, so the outer schema
   fields still exist at the same positions whether or not they are referenced
   (how it handles name shadowing is, I believe, not relevant to this
   discussion). Given this query:
   
   ```sql
   create table t as select [[1, 2]] as a, 2 as b;
   
   select array_transform(a, arr -> array_transform(arr, v -> v + b)) from t;
   ```
   
   The schemas would be:
   
   ```text
   [0 => a, 1 => b]
   [0 => a, 1 => b, 2 => arr]
   [0 => a, 1 => b, 2 => arr, 3 => v]
   ```
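
As a rough illustration of the append-only rule (a standalone sketch, not the code in #21323; `scope_schemas` and its arguments are hypothetical names), the scoped schemas above can be derived like this:

```python
def scope_schemas(root, lambda_params):
    """Return the schema visible at each nesting level.

    root: outer column names, in order.
    lambda_params: one list of parameter names per lambda, outermost first.
    Hypothetical helper; assumes names are unique (no shadowing).
    """
    schemas = [list(root)]
    for params in lambda_params:
        # Append, never insert: outer fields keep their positions.
        schemas.append(schemas[-1] + list(params))
    return schemas


schemas = scope_schemas(["a", "b"], [["arr"], ["v"]])
for schema in schemas:
    print("[" + ", ".join(f"{i} => {name}" for i, name in enumerate(schema)) + "]")
```

Because each level only appends, `a` and `b` keep indices 0 and 1 in every scope.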
   
   The planned query, annotated with column indices, is what tree node
   traversals would see:
   
   ```sql
   select array_transform(a@0, arr@2 -> array_transform(arr@2, v@3 -> v@3 + b@1)) from t;
   ```
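
To make the index assignment concrete, here is a toy annotator over a tuple-based expression tree. Everything in it is an assumption made for illustration: the tuple encoding, the `annotate` function, and the `"var"` kind (standing in for the separate expression type used for lambda variable references) are not DataFusion APIs.

```python
def annotate(expr, schema):
    """Render expr with name@index, resolving names against the scoped schema."""
    kind = expr[0]
    if kind in ("col", "var"):
        # Outer columns and lambda variables resolve against the same scoped
        # schema; assumes unique names, so shadowing is not handled here.
        return f"{expr[1]}@{schema.index(expr[1])}"
    if kind == "lambda":
        _, param, body = expr
        inner = schema + [param]  # the parameter is appended to the outer schema
        return f"{param}@{len(schema)} -> {annotate(body, inner)}"
    if kind == "call":
        _, name, args = expr
        return f"{name}({', '.join(annotate(a, schema) for a in args)})"
    if kind == "add":
        return f"{annotate(expr[1], schema)} + {annotate(expr[2], schema)}"
    raise ValueError(f"unknown expression kind: {kind}")


inner_lambda = ("lambda", "v", ("add", ("var", "v"), ("col", "b")))
outer_lambda = ("lambda", "arr",
                ("call", "array_transform", [("var", "arr"), inner_lambda]))
query = ("call", "array_transform", [("col", "a"), outer_lambda])

planned = annotate(query, ["a", "b"])
print(planned)
```

Note that `b` resolves to index 1 even two lambdas deep, the same index it has in the root schema.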
   
   So every scoped schema is a superset of the root schema, and every column
   reference, whether or not it is within a lambda scope, has an index that is
   valid relative to the root schema (lambda variable references use a
   different expression type).
   
   Then, as an internal optimization that is not visible externally, similar
   to the case optimization, the lambda derives a projected body that works on
   projected batches, which exclude unreferenced columns and lambda variables:
   
   ```text
   [0 => a, 1 => b]
   [0 => b, 1 => arr]
   [0 => b, 1 => v]
   ```
   
   ```sql
   select array_transform(a@0, arr@1 -> array_transform(arr@1, v@1 -> v@1 + b@0)) from t;
   ```
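
The projection step can be sketched as follows. This is a minimal illustration that assumes the kept fields retain their relative order; `project` is a hypothetical helper, and the actual implementation in #21323 may differ.

```python
def project(schema, used):
    """Keep only the referenced indices and return (projected_schema, remap).

    schema: scoped schema (list of names) for one lambda body.
    used: indices referenced by that body.
    remap: old index -> index in the projected schema.
    """
    kept = sorted(set(used))  # assumption: original relative order is preserved
    remap = {old: new for new, old in enumerate(kept)}
    return [schema[i] for i in kept], remap


# Outer lambda body references arr@2 and (through the inner lambda) b@1.
outer_schema, outer_remap = project(["a", "b", "arr"], [2, 1])
# Inner lambda body references v@3 and b@1.
inner_schema, inner_remap = project(["a", "b", "arr", "v"], [3, 1])

print(outer_schema, outer_remap)
print(inner_schema, inner_remap)
```

With these inputs, `b@1` becomes `b@0` in both bodies, while `arr@2` and `v@3` both become index 1 in their respective projected schemas, matching the rewritten query above.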
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

