shauryachats opened a new pull request, #14981: URL: https://github.com/apache/pinot/pull/14981
A new configuration to control the size of result holders for the multi-stage engine (MSE) is needed to avoid resizing and rehashing operations when grouping on high-cardinality columns (e.g., UUIDs).

A simple query that needs it:

```
SELECT count(*)
FROM table_A
WHERE (
  user_uuid NOT IN (
    SELECT user_uuid FROM table_B
  )
)
LIMIT 100
option(useMultistageEngine=true, timeoutMs=120000, useColocatedJoin = true, maxRowsInJoin = 40000000)
```

Here a group-by step runs on `user_uuid`, a high-cardinality column, for `table_B` before the colocated join with `table_A`.

More details in the following issue: https://github.com/apache/pinot/issues/14685
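To illustrate the resizing and rehashing cost described above, here is a minimal Java sketch that pre-sizes a hash-based holder for an expected number of groups. It is not Pinot's implementation: `PresizedGroupByDemo`, `capacityFor`, and `countByKey` are hypothetical names, and `java.util.HashMap` only stands in for a result holder.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not Pinot code.
public class PresizedGroupByDemo {
    // Default load factor used by java.util.HashMap.
    static final float LOAD_FACTOR = 0.75f;

    // Smallest initial capacity that lets a HashMap hold
    // expectedGroups entries without triggering a resize.
    static int capacityFor(int expectedGroups) {
        return (int) Math.ceil(expectedGroups / (double) LOAD_FACTOR);
    }

    // Counts occurrences per key, a stand-in for a group-by result holder.
    // The map is sized up front, so no rehash happens while inserting
    // as long as the number of distinct keys stays within expectedGroups.
    static Map<String, Long> countByKey(List<String> keys, int expectedGroups) {
        Map<String, Long> counts = new HashMap<>(capacityFor(expectedGroups), LOAD_FACTOR);
        for (String key : keys) {
            counts.merge(key, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("u1", "u2", "u1", "u3");
        System.out.println(countByKey(keys, 3));
    }
}
```

With a default-sized map, inserting millions of distinct UUIDs would force the table to double and rehash many times. Sizing it from an expected group count trades some upfront memory for avoiding that work, which is the trade-off a size configuration exposes.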