gstvg commented on PR #21679:
URL: https://github.com/apache/datafusion/pull/21679#issuecomment-4323190771

   @rluvaton @comphead @LiaCastaneda @pepijnve I pushed 
https://github.com/apache/datafusion/pull/21679/changes/284b5d2be826651f9863ba09d8a789b15f7a9cc9
 to support flexible array_reduce signature, and pinged you all at the main 
entry point of the commit  
https://github.com/apache/datafusion/pull/21679#discussion_r3144259358
   
   Also pulled breaking changes from lambda capture 
(https://github.com/apache/datafusion/pull/21679/commits/5b42e48a27ca38e545a02d0ac91ec692d365dc84)
 and making lambda variable field 
optional(https://github.com/apache/datafusion/pull/21679/commits/e83107e1d10fc42139c3ddccc9f66bd42a0ac1d3),
 and removed direct support for fixed size lists in array_transform and coerced 
them to normal lists instead 
(https://github.com/apache/datafusion/pull/21679/commits/7bc1f5509eb1c39c37582e27a46f74793befc811)
   
   Finally, I noted two minor and breaking improvements but didn't want to 
delay this even more but we can still do it:
   
   1. Make logical and physical lambda able to store it's fields, and not only 
it's names:
   
   ```rust
   struct Lambda {
       pub params: LambdaParams,
       pub body: Box<Expr>,
   }
   
   pub enum LambdaParams {
       Bound(Vec<FieldRef>),
       // sql planner produces bound lambdas, a follow-on pr will add a helper 
to bound all lambdas of a expr tree and/or a logical plan
       Unbound(Vec<String>),
   }
   
   pub struct LambdaExpr {
       params: Vec<FieldRef>,
       body: Arc<dyn PhysicalExpr>,
   }
   ```
   
   This would eliminate the following two calls to lambda_parameters(it would 
only be called during sql planning), as the lambda should already contain the 
fields(unbound lambdas raises an error during planning as unbound lambda 
variable):
   
   
https://github.com/apache/datafusion/blob/c6324c4e7daf0162a320d744f6354b3a2f385f83/datafusion/physical-expr/src/higher_order_function.rs#L285-L293
   
   
https://github.com/apache/datafusion/blob/c6324c4e7daf0162a320d744f6354b3a2f385f83/datafusion/physical-expr/src/planner.rs#L408-L463
   
   For function implementors this won't change anything, but would be easier to 
write a custom physical expr planner, for example.
   
   You can see how it would look like on this commit 
https://github.com/gstvg/arrow-datafusion/commit/3a8ad01ca3ed62d2aad3633131c0041e1b04f810
   
   2. Disambiguate LambdaVariable name
   
   ```rust
   enum LambdaVariableField {
       Unbound(String),
       Bound(FieldRef),
   }
   
   struct LambdaVariable {
       pub field: LambdaVariableField,
       pub spans: Spans,
   }
   ```
   
   This would remove the ambiguity between the name property and the FieldRef 
name


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to