walterddr opened a new issue, #11689: URL: https://github.com/apache/pinot/issues/11689
Currently:

- A query such as
```
SELECT distinct a, b FROM tbl LIMIT 10
```
  computes the entire distinct set of `(a, b)` values on the leaf stage, reshuffles by hash key and dedups in the intermediate stage, and only then keeps 10 records at the very last stage.
- A similar but much more subtle optimization applies to group-by with an order-by on the group key and a limit:
```
SELECT a, SUM(b) FROM tbl GROUP BY a ORDER BY a DESC LIMIT 10
```
  A good proposal is to push the sorted limit all the way down to the leaf stage and keep only the limited rows before sending data out.
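To make the two cases concrete, here is a minimal Python sketch of what limit pushdown to the leaf stage could look like. It is not Pinot code: the function names (`leaf_distinct`, `merge_distinct`, `leaf_group_by_top_keys`, `merge_group_by`) and the in-memory "segments" are hypothetical and exist only to illustrate the idea under the assumption that each leaf sees a disjoint slice of `tbl`.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Row = Tuple[int, int]  # (a, b); hypothetical two-column table


def leaf_distinct(rows: Iterable[Row], limit: int) -> Set[Row]:
    """Leaf stage for DISTINCT ... LIMIT n: stop once n distinct rows are found."""
    seen: Set[Row] = set()
    if limit <= 0:
        return seen
    for row in rows:
        seen.add(row)
        if len(seen) >= limit:
            break  # no need to scan the rest of the segment
    return seen


def merge_distinct(partials: Iterable[Set[Row]], limit: int) -> Set[Row]:
    """Intermediate stage: dedup across leaves, again stopping at the limit."""
    merged: Set[Row] = set()
    if limit <= 0:
        return merged
    for part in partials:
        for row in part:
            merged.add(row)
            if len(merged) >= limit:
                return merged
    return merged


def leaf_group_by_top_keys(rows: Iterable[Row], limit: int) -> Dict[int, int]:
    """Leaf stage for GROUP BY a ORDER BY a DESC LIMIT n.

    Aggregates SUM(b) per key, then ships only the n largest keys.
    """
    sums: Dict[int, int] = defaultdict(int)
    for a, b in rows:
        sums[a] += b
    top_keys = sorted(sums, reverse=True)[:limit]
    return {k: sums[k] for k in top_keys}


def merge_group_by(partials: Iterable[Dict[int, int]], limit: int) -> List[Tuple[int, int]]:
    """Final stage: combine partial sums, sort by key descending, apply the limit."""
    total: Dict[int, int] = defaultdict(int)
    for part in partials:
        for a, s in part.items():
            total[a] += s
    return [(a, total[a]) for a in sorted(total, reverse=True)[:limit]]
```

The group-by variant stays correct only because the sort key is the group key itself: any key in the global top `n` is also in the top `n` of every leaf that contains it, so none of its partial sums are dropped. The same trimming would not be safe for an ordering on the aggregate, such as `ORDER BY SUM(b)`.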