zanmato1984 opened a new issue, #44393:
URL: https://github.com/apache/arrow/issues/44393

   ### Describe the enhancement requested
   
   We discussed the solution for #41094, and the conclusion is that the "special 
form" is the way to go. Comment 
https://github.com/apache/arrow/issues/41094#issuecomment-2087716483 gives a 
thorough description of how special forms work.
   
   To summarize: a special form evaluates some of its subexpressions under 
masks obtained from its other subexpressions. For example, in 
`if cond then expr1 else expr2`, the result of `cond` is the mask that 
controls which rows go to `expr1` and which go to `expr2`. Another example is 
logical `and`/`or`: the result of each subexpression becomes part of the mask 
under which the remaining subexpressions are evaluated (boolean short-circuit).
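
To make the short-circuit case concrete, here is a minimal sketch in plain C++ (standard library only, not Arrow code). `Predicate` and `MaskedAnd` are hypothetical names introduced for illustration, and nulls are ignored. Each operand of the `and` is evaluated only on rows that all earlier operands left true:

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for a boolean subexpression evaluated on a single row.
using Predicate = std::function<bool(std::size_t row)>;

// Evaluates `p0 and p1 and ...` with short-circuit semantics: the running
// result doubles as the mask, so an operand never sees a row that an earlier
// operand already made false.
std::vector<bool> MaskedAnd(std::size_t num_rows,
                            const std::vector<Predicate>& operands) {
  std::vector<bool> mask(num_rows, true);
  for (const auto& operand : operands) {
    for (std::size_t row = 0; row < num_rows; ++row) {
      if (mask[row]) mask[row] = operand(row);
    }
  }
  return mask;
}
```

With operands `x != 0` and `10 / x > 2`, the division is never attempted on rows where `x` is zero, which is the behavior an unmasked evaluation cannot provide.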
   
   One way to implement special forms is to have **every** expression execute 
its kernel selectively, respecting a selection vector (the rows the kernel 
should execute on) or an equivalent boolean mask. Unfortunately, this isn't 
practical because we can't afford to change every (scalar) compute function to 
support a selection vector/mask all at once. So we must take an adaptive 
approach that allows functions to be selection vector/mask agnostic. To do so, a 
special form should 1) take the rows specific to each branch; 2) invoke the 
function of each branch on its group of rows; 3) combine the results of all the 
branches by scattering each row to its original position in the input.
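
The three steps can be sketched for `if cond then expr1 else expr2` as follows. This is an illustration in plain C++ over `std::vector`, not the Arrow implementation: `Kernel`, `TakeRows`, `ScatterRows`, and `IfElse` are hypothetical names, both branches are assumed to read a single `int64` input column, and nulls are ignored.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a selection-agnostic scalar kernel: it produces
// one output for every row it is handed and knows nothing about masks.
using Kernel =
    std::function<std::vector<int64_t>(const std::vector<int64_t>&)>;

// Step 1: gather the rows that belong to one branch.
std::vector<int64_t> TakeRows(const std::vector<int64_t>& values,
                              const std::vector<std::size_t>& rows) {
  std::vector<int64_t> out;
  out.reserve(rows.size());
  for (std::size_t row : rows) out.push_back(values[row]);
  return out;
}

// Step 3: write branch_result[i] back to its original position rows[i].
void ScatterRows(const std::vector<int64_t>& branch_result,
                 const std::vector<std::size_t>& rows,
                 std::vector<int64_t>* out) {
  for (std::size_t i = 0; i < rows.size(); ++i) {
    (*out)[rows[i]] = branch_result[i];
  }
}

std::vector<int64_t> IfElse(const std::vector<bool>& cond,
                            const std::vector<int64_t>& input,
                            const Kernel& then_kernel,
                            const Kernel& else_kernel) {
  // Split the row indices by the mask.
  std::vector<std::size_t> then_rows, else_rows;
  for (std::size_t row = 0; row < cond.size(); ++row) {
    (cond[row] ? then_rows : else_rows).push_back(row);
  }
  std::vector<int64_t> out(input.size());
  // Step 2 happens inside each call: the kernel only sees its own rows.
  ScatterRows(then_kernel(TakeRows(input, then_rows)), then_rows, &out);
  ScatterRows(else_kernel(TakeRows(input, else_rows)), else_rows, &out);
  return out;
}
```

Because each kernel receives a dense, already-selected batch, existing kernels can be reused unchanged; the cost is the extra gather and scatter passes around every branch.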
   
   So far we have the vector functions `filter`/`take` to do 1), but there isn't 
a handy utility to do 3).
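
One possible contract for such a utility is the inverse of a mask-based filter. The sketch below is an assumption about what it could look like, written against `std::vector` rather than Arrow arrays. `FilterByMask` and `ScatterByMask` are hypothetical names, and `std::nullopt` stands in for a null in positions the mask did not select.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Keeps values[i] where mask[i] is true (the role `filter` plays in step 1).
std::vector<int64_t> FilterByMask(const std::vector<int64_t>& values,
                                  const std::vector<bool>& mask) {
  assert(values.size() == mask.size());
  std::vector<int64_t> out;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i]) out.push_back(values[i]);
  }
  return out;
}

// Hypothetical inverse: places the k-th value at the k-th true position of
// the mask and leaves every other position empty. The output has the length
// of the mask, i.e. the length of the original input.
std::vector<std::optional<int64_t>> ScatterByMask(
    const std::vector<int64_t>& values, const std::vector<bool>& mask) {
  std::vector<std::optional<int64_t>> out(mask.size());
  std::size_t next = 0;
  for (std::size_t i = 0; i < mask.size(); ++i) {
    if (mask[i]) {
      assert(next < values.size());
      out[i] = values[next++];
    }
  }
  assert(next == values.size());  // every value must have been placed
  return out;
}
```

Scattering the output of `FilterByMask` with the same mask restores the selected values to their original positions, so the results of several branches with disjoint masks could then be merged position by position.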
   
   ### Component(s)
   
   C++

