pepijnve commented on issue #21231: URL: https://github.com/apache/datafusion/issues/21231#issuecomment-4378523250
@gstvg I think the approach where the indices are kept stable in the public physical plan and the rewriting is an internal optimisation is very sensible. Very easy to reason about. This approach also aligns perfectly with the collect/project idea I sketched above. Would we still want to introduce methods on `PhsyicalExpr` to support this? We would essentially be replacing the current downcasting code with polymorphic method calls which could support custom column expressions provide we accept the 'extend/append only' model. The change you made in the lambda column capture branch where `IndexSet` was replaced with `BTreeSet` is an interesting one from an API design pov and is why I was asking questions earlier about how to collect the column indices in a recursive method call. I'm a bit reluctant to make the output collection type a part of the signature since that's up to the caller to decide. One caller may want to count occurrences of each index, one may not care about order while another does, etc. That kind of nudges in the direction of a dyn function callback rather than a mutable reference to some collection. Is there an idiomatic pattern in Rust for something this? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
