I have investigated the problem and found that monitoring has changed 
significantly since 3.7 (the version in which I got the exception in 
com.codahale.metrics.servlets.MetricsServlet). Since version 3.9 it is enough 
to change the behavior of DecayingEstimatedHistogramReservoir; 
EstimatedHistogram can stay unchanged. Modifying 
DecayingEstimatedHistogramReservoir is safe because, unlike 
EstimatedHistogram, DecayingEstimatedHistogramReservoir is not used for 
Cassandra's internal needs.
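
A minimal sketch of the clamping behavior discussed in this thread, for 
illustration only. It is not the Cassandra code: the class ClampingHistogram 
and all of its method names are hypothetical, decaying is left out, and the 
bucket layout (sorted upper bounds plus one overflow slot) is an assumption. 
Values above the largest bound are counted in the overflow slot and reported 
as the maximum trackable value instead of triggering an exception.

```java
import java.util.Arrays;

/** Hypothetical sketch, not Cassandra code: a fixed-bucket histogram that clamps overflow. */
final class ClampingHistogram {
    private final long[] upperBounds; // sorted, inclusive upper bound of each bucket
    private final long[] counts;      // counts[i] pairs with upperBounds[i]; last slot is overflow

    ClampingHistogram(long[] sortedUpperBounds) {
        if (sortedUpperBounds.length == 0) {
            throw new IllegalArgumentException("at least one bucket is required");
        }
        this.upperBounds = sortedUpperBounds.clone();
        this.counts = new long[sortedUpperBounds.length + 1];
    }

    long maxTrackableValue() {
        return upperBounds[upperBounds.length - 1];
    }

    void update(long value) {
        int idx = Arrays.binarySearch(upperBounds, value);
        if (idx < 0) {
            idx = -idx - 1; // insertion point: first bucket whose bound is >= value
        }
        counts[idx]++;      // idx == upperBounds.length is the overflow slot
    }

    /** Returns the bucket bound for percentile p in [0, 1]; overflow is clamped, never thrown. */
    long percentile(double p) {
        if (p < 0.0 || p > 1.0) {
            throw new IllegalArgumentException("p must be in [0, 1]");
        }
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        if (total == 0) {
            return 0; // no events recorded at all
        }
        long rank = Math.max(1, (long) Math.ceil(p * total));
        long seen = 0;
        for (int i = 0; i < upperBounds.length; i++) {
            seen += counts[i];
            if (seen >= rank) {
                return upperBounds[i];
            }
        }
        // The rank falls into the overflow slot: report the largest trackable value.
        return maxTrackableValue();
    }

    /** Largest recorded bucket bound; the maximum trackable value if anything overflowed. */
    long max() {
        if (counts[upperBounds.length] > 0) {
            return maxTrackableValue();
        }
        for (int i = upperBounds.length - 1; i >= 0; i--) {
            if (counts[i] > 0) {
                return upperBounds[i];
            }
        }
        return 0;
    }
}
```

With bounds of 1h, 2h and 3h expressed in seconds, recording a single 4h 
latency makes percentile(0.999) and max() both report 3h, which matches the 
3h/4h example discussed in the quoted reply below.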

I also found the resolution of issue CASSANDRA-12185 very strange: nothing was 
done to prevent the IllegalStateException, yet the issue is closed. Should I 
reopen #12185, or submit a pull request under a new issue?


Best regards,
Bukhtoyarov Vladimir
email jseco...@mail.ru
skype live:fanat-tdd
Github: https://github.com/vladimir-bukhtoyarov
mobile +79618096798

>Wednesday, 19 October 2016, 21:12 +03:00, from Владимир Бухтояров 
><jseco...@mail.ru.INVALID>:
>
>Null (zero) values in a snapshot are useless for problem analysis, because 
>it is impossible to distinguish the case where there were no events from the 
>case where events were dispatched too slowly. I do not see anything wrong 
>with returning the 99.9th percentile as 3h when the histogram is configured 
>with a 3h maximum and some latency is 4h.
>
>
>Best regards,
>Bukhtoyarov Vladimir
>email  jseco...@mail.ru
>skype live:fanat-tdd
>Github:  https://github.com/vladimir-bukhtoyarov
>mobile  +79618096798
>
>>Wednesday, 19 October 2016, 20:17 +03:00, from Ken Hancock 
>><ken.hanc...@schange.com>:
>>
>>I would suggest metrics should return null values instead of false values.
>>
>>On Wed, Oct 19, 2016 at 12:21 PM, Владимир Бухтояров <
>> jseco...@mail.ru.invalid > wrote:
>>
>>>
>>> Hi all,
>>>
>>> I want to fix https://issues.apache.org/jira/browse/CASSANDRA-11063
>>> This issue is very painful for me, because when something is slow it
>>> is impossible to capture metrics and save them to a monitoring database
>>> for later investigation. Moreover, when one histogram throws an exception,
>>> many metrics exporters (for example MetricsServlet) are unable to export
>>> metrics for the whole MetricRegistry, so when an overflow happens in one
>>> histogram I have no historical data at all.
>>>
>>> I propose to implement the following changes:
>>> 1. DecayingEstimatedHistogramReservoir and EstimatedHistogram will
>>> return the maximum trackable value instead of Long.MAX_VALUE.
>>> 2. DecayingEstimatedHistogramReservoir and EstimatedHistogram will
>>> never throw IllegalStateException; instead, they will use the maximum
>>> trackable value as a regular value in percentile and average calculations.
>>> 3. If anybody wants to keep the old behavior (preferring a crash to
>>> inaccurate reporting), I can add a configuration parameter that preserves
>>> it. I can even leave the old behavior as the default; for my
>>> needs it is enough to have some option to avoid crashes.
>>>
>>>
>>> Best regards,
>>> Bukhtoyarov Vladimir
>>> email  jseco...@mail.ru
>>> skype live:fanat-tdd
>>> Github:  https://github.com/vladimir-bukhtoyarov
>>>
>
