[GitHub] [incubator-pinot] Jackie-Jiang opened a new issue #5925: Optimize query filtering on the result of another query

GitBox Tue, 25 Aug 2020 19:24:11 -0700


Jackie-Jiang opened a new issue #5925:
URL: https://github.com/apache/incubator-pinot/issues/5925



   Example query:
   `SELECT COUNT(*) FROM table1 WHERE id1 IN (SELECT id2 FROM table2 WHERE ...) ...`
   
   A naive solution is to send the sub-query first, gather its result, and 
then use that result to construct the main query. This works when the 
sub-query result is small (< 100), but when the result becomes large 
(> 1000), the cost of ser/de, query compilation and query processing becomes 
too high.
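
   For illustration, the naive approach can be sketched as below. This is 
only a sketch, not Pinot code: `run_query` is a hypothetical callable that 
stands in for sending a query and gathering its result, the ids are assumed 
to be integers, and the sub-query text is passed in by the caller.

```python
from typing import Callable, Iterable, List


def build_main_query(ids: Iterable[int]) -> str:
    """Inline the sub-query result into the main query as an IN list."""
    id_list = ", ".join(str(i) for i in ids)
    if not id_list:
        # An empty IN list is not valid SQL; the caller has to handle this case.
        raise ValueError("sub-query returned no ids")
    return f"SELECT COUNT(*) FROM table1 WHERE id1 IN ({id_list})"


def naive_two_step(run_query: Callable[[str], List[int]], sub_query: str) -> str:
    """Run the sub-query first, then construct the main query from its result.

    `run_query` is a hypothetical stand-in for a query client.
    """
    ids = run_query(sub_query)
    return build_main_query(ids)


# The main query text grows linearly with the number of ids returned,
# which is where the ser/de and compilation cost comes from.
small = build_main_query(range(100))
large = build_main_query(range(10_000))
print(len(small), len(large))
```

   Every id has to be serialized into the query text and parsed again, so 
the cost scales with the size of the sub-query result.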
   
   In order to optimize this query, we need to reduce the number of ids to 
process. We can rely on partitioning to achieve that. When all the segments 
for a partition are on a single server, we can solve the query for that 
partition on the server side, without sending the result of the sub-query to 
the broker and merging it there.
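
   The idea can be simulated in memory as follows. This is a sketch under 
assumptions not stated above: both tables are assumed to be partitioned on 
the id column with the same partition function (here a hypothetical modulo 
over `NUM_PARTITIONS`), so matching ids always land in the same partition, 
and each partition is assumed to live on exactly one server.

```python
from typing import Dict, List

NUM_PARTITIONS = 4  # hypothetical partition count shared by both tables


def partition_of(value: int) -> int:
    """Hypothetical partition function applied to id1 and id2 alike."""
    return value % NUM_PARTITIONS


def split_by_partition(ids: List[int]) -> Dict[int, List[int]]:
    """Group ids by partition, mimicking one server per partition."""
    parts: Dict[int, List[int]] = {p: [] for p in range(NUM_PARTITIONS)}
    for value in ids:
        parts[partition_of(value)].append(value)
    return parts


def server_side_count(table1_ids: List[int], table2_ids: List[int]) -> int:
    """Count table1 rows whose id1 appears among the given id2 values."""
    matching = set(table2_ids)
    return sum(1 for value in table1_ids if value in matching)


def partitioned_count(table1_ids: List[int], table2_ids: List[int]) -> int:
    """Solve each partition locally; only per-partition counts are merged."""
    t1_parts = split_by_partition(table1_ids)
    t2_parts = split_by_partition(table2_ids)
    return sum(
        server_side_count(t1_parts[p], t2_parts[p]) for p in range(NUM_PARTITIONS)
    )


table1 = [1, 2, 2, 3, 5, 8, 13]
table2 = [2, 3, 4, 8]
print(partitioned_count(table1, table2))
```

   In this sketch each simulated server only ever sees the ids of its own 
partition, and the only values that would cross the network are the 
per-partition counts.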


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
