crepererum commented on code in PR #6304:
URL: https://github.com/apache/arrow-rs/pull/6304#discussion_r1732449253
##########
arrow-select/src/filter.rs:
##########
@@ -166,7 +166,24 @@ pub fn prep_null_mask_filter(filter: &BooleanArray) ->
BooleanArray {
/// assert_eq!(c, &Int32Array::from(vec![5, 8]));
/// ```
pub fn filter(values: &dyn Array, predicate: &BooleanArray) ->
Result<ArrayRef, ArrowError> {
- let predicate = FilterBuilder::new(predicate).build();
+ let mut filter_builder = FilterBuilder::new(predicate);
+
+ fn multiple_arrays(data_type: &DataType) -> bool {
Review Comment:
could you move this method to the top-level? I know that Rust allows that
but I think this feature confuses people more than it helps, esp. since this
does NOT create a closure, i.e. variable capture is not possible.
##########
arrow-select/src/filter.rs:
##########
@@ -166,7 +166,24 @@ pub fn prep_null_mask_filter(filter: &BooleanArray) ->
BooleanArray {
/// assert_eq!(c, &Int32Array::from(vec![5, 8]));
/// ```
pub fn filter(values: &dyn Array, predicate: &BooleanArray) ->
Result<ArrayRef, ArrowError> {
- let predicate = FilterBuilder::new(predicate).build();
+ let mut filter_builder = FilterBuilder::new(predicate);
+
+ fn multiple_arrays(data_type: &DataType) -> bool {
+ match data_type {
+ DataType::Struct(fields) => {
+ fields.len() > 1 || fields.len() == 1 &&
multiple_arrays(fields[0].data_type())
+ }
+ DataType::Union(fields, UnionMode::Sparse) => !fields.is_empty(),
+ _ => false,
+ }
+ }
+
+ if multiple_arrays(values.data_type()) {
+ filter_builder = filter_builder.optimize();
Review Comment:
what happens if you call `optimize` if the data type does NOT have multiple
arrays? I'm wondering if we really need this branch or if we could optimize
unconditionally. Maybe if "optimize" doesn't really "optimize" in all cases, we
should fix that?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]