shauryachats commented on issue #14685: URL: https://github.com/apache/pinot/issues/14685#issuecomment-2625985589
> I wouldn't add too many configs for the same purpose because there is no way to make user understand them. I agree with the sentiment, but in this case, I feel a new config should be introduced for the reasons I will be expanding below. > I agree we should cap initialResultHolderCapacity with numGroupsLimit if we are not already doing so, and we might already cap it since we are configuring the max value of it. I agree, and we can do it in a separate PR since it is outside the scope of this current change. > What concerns me of setting result holder size directly to groups limit is that, if user override numGroupsLimit to a very large value (a lot of users simply overrides it to Integer.MAX_VALUE), server will directly run out of memory. This is precisely the reason we need a separate config to enable reserving the hash map of the `GroupIdGenerator` so that the user can choose the tradeoff between latency and memory (for low QPS and high cardinality such as our use case, enabling this would have a significant improvement, whereas for high QPS cases the users can choose not to toggle this). Another alternative that I suggested above is to introduce a separate config solely responsible for setting the initial size of the `GroupIdGenerator` so we decouple this from `numGroupsLimit` and the users who are setting `numGroupsLimit` to `Integer.MAX_VALUE` can set other values for initial size of `GroupIdGenerator` to denote the cardinality. Until we have some type of column stats for the group by columns, which can help us determine at runtime what the estimated cardinality of group by columns is, we might have to manually rely on this config. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org