zjxxzjwang opened a new pull request, #25594:
URL: https://github.com/apache/pulsar/pull/25594

   
   Fixes 
   
   ### Motivation
   
   In `DataSketchesSummaryLogger#registerEvent`, the `valueMillis` (a `double`) 
was previously cast to `long` before being added to `sumAdder`.
   
   This caused **precision loss** in the `_sum` metric exposed to Prometheus:
   
   1. **Truncation of fractional parts**: Every event's sub-millisecond latency 
was silently discarded. For example, `1.8ms` was recorded as `1ms`.
   2. **Sub-millisecond latencies become zero**: When latency is less than 1ms 
(e.g., 500 microseconds = 0.5ms), the cast to `long` produces `0`, meaning 
these events contribute nothing to the sum.
   3. **Cumulative error**: In high-throughput scenarios, the accumulated 
truncation error grows continuously, causing the `_sum` metric to be 
significantly lower than the actual total latency.
   
   This leads to inaccurate `_sum` values in Prometheus Summary metrics, which 
in turn causes incorrect average latency calculations (`sum / count`).
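
To make the effect concrete, here is a minimal, self-contained sketch that sums the same latencies both ways. The class and method names (`SumTruncationDemo`, `truncatedSum`, `preciseSum`) and the sample values are illustrative assumptions, not code from Pulsar.

```java
import java.util.concurrent.atomic.DoubleAdder;
import java.util.concurrent.atomic.LongAdder;

// Illustrative demo only; none of these names exist in Pulsar.
final class SumTruncationDemo {

    /** Sums latencies with a (long) cast on every value, as described above. */
    static long truncatedSum(double[] latenciesMillis) {
        LongAdder sum = new LongAdder();
        for (double valueMillis : latenciesMillis) {
            sum.add((long) valueMillis); // fractional milliseconds are dropped here
        }
        return sum.sum();
    }

    /** Sums the same latencies without the cast. */
    static double preciseSum(double[] latenciesMillis) {
        DoubleAdder sum = new DoubleAdder();
        for (double valueMillis : latenciesMillis) {
            sum.add(valueMillis);
        }
        return sum.sum();
    }

    public static void main(String[] args) {
        // One 1.8 ms event and three sub-millisecond events.
        double[] latenciesMillis = {1.8, 0.5, 0.5, 0.25};
        int count = latenciesMillis.length;

        long truncated = truncatedSum(latenciesMillis);
        double precise = preciseSum(latenciesMillis);

        System.out.println("truncated sum = " + truncated
                + ", average = " + (double) truncated / count);
        System.out.println("precise sum = " + precise
                + ", average = " + precise / count);
    }
}
```

With these sample values the truncated sum is 1 ms, while the true total is about 3.05 ms, so the `sum / count` average drops from roughly 0.76 ms to 0.25 ms.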
   
   ### Modifications
   
   Replace `LongAdder` with `DoubleAdder` for `sumAdder` in 
`DataSketchesSummaryLogger` to preserve the full `double` precision of 
`valueMillis`, eliminating the lossy `(long)` cast.
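
The shape of the change can be sketched roughly as follows. This is a simplified stand-in, not the actual `DataSketchesSummaryLogger`: the class name `SummaryLoggerSketch`, the `registerEvent(long, TimeUnit)` signature, the `countAdder` field, and the microsecond-to-millisecond conversion are assumptions made for illustration, and the quantile sketch handling is omitted.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.DoubleAdder;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, simplified stand-in for the logger; not the Pulsar class itself.
final class SummaryLoggerSketch {
    private final LongAdder countAdder = new LongAdder();
    // Previously a LongAdder, which forced a (long) cast on every value.
    private final DoubleAdder sumAdder = new DoubleAdder();

    /** Assumed signature: a latency and its unit, converted to fractional milliseconds. */
    void registerEvent(long eventLatency, TimeUnit unit) {
        double valueMillis = unit.toMicros(eventLatency) / 1000.0;
        countAdder.increment();
        sumAdder.add(valueMillis); // full double precision, no cast
    }

    long getCount() {
        return countAdder.sum();
    }

    /** Returns double now that the sum is accumulated in a DoubleAdder. */
    double getSum() {
        return sumAdder.sum();
    }
}
```

In this sketch, registering a 500 microsecond event adds 0.5 to the sum instead of 0.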
   
   ### Verifying this change
   
   This change is a small fix for a data precision issue, and the existing 
tests should continue to pass. The one visible change is that `getSum()` now 
returns `double` instead of `long`. This is compatible with all call sites, 
since they already accept `double` values (e.g., the 
`MetricFamilySamples.Sample` constructor and the `writeMetric` methods).
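
The compatibility argument rests on Java's implicit widening from `long` to `double`: a call site whose parameter is a `double` compiles with either return type. The sketch below illustrates this with a hypothetical `formatSample` helper standing in for the real call sites. It does not reproduce the Prometheus or Pulsar signatures.

```java
// Illustrative only; formatSample is a hypothetical stand-in for a call site
// that takes a double, such as the ones mentioned above.
final class CallSiteCompatibilityDemo {

    static String formatSample(String name, double value) {
        return name + " " + value;
    }

    static long oldGetSum() {
        return 2L; // return type before the change
    }

    static double newGetSum() {
        return 2.3; // return type after the change
    }

    public static void main(String[] args) {
        // A long argument is widened to double implicitly.
        System.out.println(formatSample("latency_sum", oldGetSum()));
        // A double argument is passed through unchanged.
        System.out.println(formatSample("latency_sum", newGetSum()));
    }
}
```

A caller that stored the result in a `long` variable would stop compiling after such a change, so the argument holds only as long as every call site takes a `double`, as the description states.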
   
   ### Does this pull request potentially affect one of the following parts?
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [x] The metrics
   - [ ] Anything that affects deployment
   
   ### Documentation
   
   - [ ] `doc-required`
   - [x] `doc-not-needed`
   - [ ] `doc`
   - [ ] `doc-complete`

