gortiz commented on PR #11695: URL: https://github.com/apache/pinot/pull/11695#issuecomment-1738615579
> Thanks for raising this and providing the readings! Curious how you decide to configure the sliding window to be 15 minutes? What is the side effect of that?

I chose 15 minutes because our histograms and timers provide several windowed rate values, such as oneMinuteRate, fiveMinuteRate and fifteenMinuteRate, the last being the longest. To return a correct value for that rate, we need to keep at least the results from the last 15 minutes.

The side effect is the memory the metric requires. `SlidingTimeWindowArrayReservoir` requires 128 bits per value stored. Therefore, with `metric_frequency` in measures per second and `time_window_size` in seconds, each histogram will require:

```
Size = metric_frequency * time_window_size * 128 / 8 Bytes
```

Which means that

| measures per second | time window (mins) | size (MB) |
| -- | -- | -- |
| 10 | 1 | 0.0096 |
| 10 | 5 | 0.048 |
| 10 | 15 | 0.144 |
| 100 | 1 | 0.096 |
| 100 | 5 | 0.48 |
| 100 | 15 | 1.44 |
| 1000 | 1 | 0.96 |
| 1000 | 5 | 4.8 |
| 1000 | 15 | 14.4 |
| 10000 | 1 | 9.6 |
| 10000 | 5 | 48 |
| 100000 | 15 | 1440 |

This is one of the problems with the `SlidingTimeWindowArrayReservoir` implementation: its size on heap depends on the number of measures per second. If that rate is not controlled (as may happen in Pinot), its footprint has no upper bound.

HdrHistogram doesn't have this problem. Instead, in HdrHistogram you define the min and max expected values (which may also be problematic in our case) and the precision you want to have. [From the HdrHistogram documentation](https://github.com/HdrHistogram/HdrHistogram/blob/master/README.md):

> For example, a Histogram could be configured to track the counts of observed integer values between 0 and 3,600,000,000 while maintaining a value precision of 3 significant digits across that range. Value quantization within the range will thus be no larger than 1/1,000th (or 0.1%) of any value.
> This example Histogram could be used to track and analyze the counts of observed response times ranging between 1 microsecond and 1 hour in magnitude, while maintaining a value resolution of 1 microsecond up to 1 millisecond, a resolution of 1 millisecond (or better) up to one second, and a resolution of 1 second (or better) up to 1,000 seconds. At its maximum tracked value (1 hour), it would still maintain a resolution of 3.6 seconds (or better).

What do you think? Should we decrease the `SlidingTimeWindowArrayReservoir` time window to something like 5 mins? Should we add HdrHistogram? In the latter case it would be nice to add new methods to our metric registry to let calling code configure the min and max values.
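
For reference, the sizing estimate from this comment can be reproduced with a few lines of plain Java. This is only a rough sketch: the class and method names (`ReservoirFootprint`, `estimateBytes`, `estimateMegabytes`) are hypothetical and not part of Pinot or the metrics library. It assumes the 128 bits per stored value stated above, a steady measurement rate, and 1 MB = 1,000,000 bytes.

```java
public class ReservoirFootprint {
    // Assumption from the comment: each stored value takes 128 bits (16 bytes).
    public static final long BYTES_PER_VALUE = 128 / 8;

    /** Estimated bytes retained once the window is full at a steady rate. */
    public static long estimateBytes(long measuresPerSecond, long windowMinutes) {
        long windowSeconds = windowMinutes * 60;
        return measuresPerSecond * windowSeconds * BYTES_PER_VALUE;
    }

    /** Same estimate in decimal megabytes (1 MB = 1,000,000 bytes). */
    public static double estimateMegabytes(long measuresPerSecond, long windowMinutes) {
        return estimateBytes(measuresPerSecond, windowMinutes) / 1_000_000.0;
    }

    public static void main(String[] args) {
        long[] rates = {10, 100, 1_000, 10_000};
        long[] windowsInMinutes = {1, 5, 15};
        // Print one line per (rate, window) combination.
        for (long rate : rates) {
            for (long window : windowsInMinutes) {
                System.out.printf("%d measures/s, %d min -> %.4f MB%n",
                    rate, window, estimateMegabytes(rate, window));
            }
        }
    }
}
```

The estimate grows linearly with both the rate and the window, so shrinking the window from 15 to 5 minutes would cut the worst case by a factor of three but would still leave it unbounded in the rate.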