nathanb9 opened a new issue, #23213:
URL: https://github.com/apache/datafusion/issues/23213

   **Problem**
   
   When a query computes several uncorrelated scalar-aggregate subqueries over 
the same table, DataFusion scans that table once per subquery.
   
   **Proposed rewrite**
   
   When two or more such subqueries share one source, fuse them into a single 
scan + aggregate, moving each subquery's predicate into a `FILTER (WHERE ...)` 
clause on its own aggregate:
   
   ```sql
   -- Before: two scans of t
   SELECT (SELECT count(*) FROM t WHERE a < 10),
          (SELECT avg(x)   FROM t WHERE a >= 10);
   
   -- After: one scan of t
   SELECT count(*) FILTER (WHERE a < 10),
          avg(x)   FILTER (WHERE a >= 10)
   FROM t;
   ```
   
   The source filter becomes the OR of the branch predicates (for the example 
above, `a < 10 OR a >= 10`), and each scalar subquery is replaced by a reference 
to the corresponding output of the merged aggregate.
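
As a rough sanity check of the rewrite described above, the sketch below runs both query shapes against a small in-memory table and compares the results. It uses Python's bundled SQLite as a stand-in engine, not DataFusion, and assumes SQLite 3.30 or newer for aggregate `FILTER` support. The table contents are made up for illustration, including a row with a NULL `a` that neither branch predicate matches. The `WHERE a < 10 OR a >= 10` clause is one way to spell out the OR-ed source filter; it does not appear in the issue's "After" query.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for DataFusion here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, x REAL)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 2.0), (5, 4.0), (10, 6.0), (20, 8.0), (None, 100.0)],
)

# Before: two uncorrelated scalar-aggregate subqueries, each reading t.
before = conn.execute(
    """
    SELECT (SELECT count(*) FROM t WHERE a < 10),
           (SELECT avg(x)   FROM t WHERE a >= 10)
    """
).fetchone()

# After: a single pass over t, with each predicate moved into FILTER and
# the source filter set to the OR of the branch predicates.
after = conn.execute(
    """
    SELECT count(*) FILTER (WHERE a < 10),
           avg(x)   FILTER (WHERE a >= 10)
    FROM t
    WHERE a < 10 OR a >= 10
    """
).fetchone()

print(before, after)
```

Both forms return exactly one row even when no input row matches a branch, because an aggregate without `GROUP BY` always yields a single row, which is why the scalar subqueries can be replaced by column references.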


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

