shauryachats commented on issue #14685:
URL: https://github.com/apache/pinot/issues/14685#issuecomment-2625985589

   > I wouldn't add too many configs for the same purpose because there is no 
way to make user understand them.
   
   I agree with the sentiment, but in this case, I feel a new config should be 
introduced for the reasons I will be expanding below.
   
   > I agree we should cap initialResultHolderCapacity with numGroupsLimit if 
we are not already doing so, and we might already cap it since we are 
configuring the max value of it.
   
   I agree, and we can do it in a separate PR since it is outside the scope of 
this current change.
   
   > What concerns me of setting result holder size directly to groups limit is 
that, if user override numGroupsLimit to a very large value (a lot of users 
simply overrides it to Integer.MAX_VALUE), server will directly run out of 
memory.
   
   This is precisely the reason we need a separate config to enable reserving 
the hash map of the `GroupIdGenerator` so that the user can choose the tradeoff 
between latency and memory (for low QPS and high cardinality such as our use 
case, enabling this would have a significant improvement, whereas for high QPS 
cases the users can choose not to toggle this). 
   
   Another alternative that I suggested above is to introduce a separate config 
solely responsible for setting the initial size of the `GroupIdGenerator` so we 
decouple this from `numGroupsLimit` and the users who are setting 
`numGroupsLimit` to `Integer.MAX_VALUE` can set other values for initial size 
of `GroupIdGenerator` to denote the cardinality.
   
   Until we have some type of column stats for the group by columns, which can 
help us determine at runtime what the estimated cardinality of group by columns 
is, we might have to manually rely on this config.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to