shauryachats opened a new pull request, #14981:
URL: https://github.com/apache/pinot/pull/14981

   A new configuration to control the size of result holders for the 
multi-stage engine (MSE) is necessary to avoid repeated resizing and 
rehashing when grouping on high-cardinality columns (e.g., UUIDs).
   
   A simple query that needs it:
   ```
   SELECT
     count(*)
   FROM
     table_A
   WHERE
     (
       user_uuid NOT IN (
         SELECT
           user_uuid
         FROM
           table_B
       )
     )
   LIMIT
     100 option(useMultistageEngine=true, timeoutMs=120000, useColocatedJoin=true, maxRowsInJoin=40000000)
   ```
   
   Here, a group-by step runs on `user_uuid` for `table_B` before the 
colocated join with `table_A`, and `user_uuid` has high cardinality.
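
To illustrate why the holder size matters, here is a minimal sketch that uses a plain `java.util.HashMap` as a stand-in for a group-by result holder. It is not Pinot code: the class `PresizedGroupBy` and the helpers `capacityFor` and `countByKey` are hypothetical names made up for this example, and Pinot's actual result holders are its own data structures. The point is only that sizing the table for the expected number of groups up front avoids the repeated resize-and-rehash passes that a small default table goes through as distinct keys keep arriving.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Apache Pinot.
public class PresizedGroupBy {

    // Initial capacity that lets a HashMap hold `expectedGroups` entries
    // without resizing, given the default load factor of 0.75.
    static int capacityFor(int expectedGroups) {
        return (int) Math.ceil(expectedGroups / 0.75);
    }

    // Counts occurrences per key (a tiny group-by), using a map that is
    // pre-sized for the expected number of distinct keys.
    static Map<String, Long> countByKey(Iterable<String> keys, int expectedGroups) {
        Map<String, Long> counts = new HashMap<>(capacityFor(expectedGroups));
        for (String key : keys) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> userUuids = Arrays.asList("u1", "u2", "u1", "u3");
        // Assumption for the sketch: the caller knows roughly how many
        // distinct keys to expect (3 here).
        System.out.println(countByKey(userUuids, 3));
    }
}
```

With a fixed small default, a map that ends up holding millions of distinct UUIDs has to double and rehash many times on the way there; a configurable size lets the operator trade memory for fewer of those passes.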
   
   More details in the following issue: 
https://github.com/apache/pinot/issues/14685


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

