pepijnve commented on issue #21231:
URL: https://github.com/apache/datafusion/issues/21231#issuecomment-4378523250

   @gstvg I think the approach where the indices are kept stable in the public 
physical plan and the rewriting is an internal optimisation is very sensible. 
Very easy to reason about.
   
   This approach also aligns perfectly with the collect/project idea I sketched 
above. Would we still want to introduce methods on `PhsyicalExpr` to support 
this? We would essentially be replacing the current downcasting code with 
polymorphic method calls which could support custom column expressions provide 
we accept the 'extend/append only' model.
   
   The change you made in the lambda column capture branch where `IndexSet` was 
replaced with `BTreeSet` is an interesting one from an API design pov and is 
why I was asking questions earlier about how to collect the column indices in a 
recursive method call. I'm a bit reluctant to make the output collection type a 
part of the signature since that's up to the caller to decide. One caller may 
want to count occurrences of each index, one may not care about order while 
another does, etc. That kind of nudges in the direction of a dyn function 
callback rather than a mutable reference to some collection. Is there an 
idiomatic pattern in Rust for something this?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to