Hi Aditya,

AK1_5: Changed paused-partitions-count to paused-partitions.

Thanks for your suggestions.

Kind Regards,
PoAn

> On Apr 10, 2026, at 12:00 AM, Aditya Kousik <[email protected]> wrote:
>
> Thanks, overall LGTM.
>
> AK1_5: I see paused-partitions-count and paused-partitions both representing
> int gauges. Can we unify into simply paused-partitions? *-Count can also
> point users towards a windowed/cumulative count, which isn’t the case here.
>
> -Aditya
>
> On 2026/04/09 12:43:10 PoAn Yang wrote:
>> Hi Chia-Ping, Aditya, and Sahil,
>>
>> Thanks for your suggestions.
>>
>> chia_00, AK1, SD2: Aligning the naming is better. Changed all metrics to
>> use the prefix `paused-partitions`.
>>
>> chia_01: After checking the code again, it's better to follow current
>> per-partition metrics like records-lag.
>> I changed the per-partition paused-partitions* metrics to use INFO level.
>>
>> AK2: Moved paused-partitions-count to the consumer-coordinator-metrics
>> group.
>>
>> AK3: Added paused-partitions-rate/paused-partitions-total at both the
>> consumer and per-partition levels.
>> Since existing per-partition metrics use INFO level, I changed these to
>> use INFO as well.
>>
>> AK4: Added a note about cardinality to the consumer-fetch-manager-metrics
>> paragraph.
>>
>> SD1: Changed to use -1 as the default value
>> for paused-partitions-duration-seconds.
>>
>> SD3: Mentioned in Proposed Changes that per-partition metrics are reset on
>> partition reassignment.
>>
>> Kind regards,
>> PoAn
>>
>> On Tue, Apr 7, 2026 at 12:17 PM Sahil Devgon <[email protected]> wrote:
>>
>>> Hello PoAn,
>>> Thanks for the KIP. I have a few comments that we may consider adding to
>>> the KIP:
>>> 1. One thing I noticed is for partition-paused-time-ms, returning 0 when a
>>> partition is not paused could be slightly ambiguous since it's the same
>>> value a freshly paused partition would return. Would you consider returning
>>> -1 to indicate "not paused" (consistent with how partition-paused uses
>>> 0/1)? Or if 0 is preferred, a clear doc note would go a long way in
>>> preventing false positives in monitoring setups.
>>>
>>> 2. Adding to Chia-Ping and Aditya's naming suggestions,
>>> partition-paused-time-ms reads as "time in milliseconds" but semantically
>>> it measures elapsed duration since pause. A name like
>>> paused-partition-duration-ms/paused-partition-duration-seconds would better
>>> communicate intent and align with naming conventions used in other Kafka
>>> duration metrics (e.g., records-lag, fetch-latency-avg).
>>>
>>> 3. The test plan mentions verifying that the pause timestamp is "reset on
>>> partition reassignment"; it would be helpful to also describe this
>>> behavior explicitly in the Proposed Changes section, not just the test
>>> plan. For example, calling out that the pause state is cleared on
>>> reassignment regardless of prior pause status would make the spec feel
>>> complete. This is especially relevant for rebalance-heavy workloads where
>>> partitions move around frequently.
>>>
>>> Best,
>>> Sahil Devgon
>>>
>>> On Mon, Apr 6, 2026 at 4:34 PM PoAn Yang <[email protected]> wrote:
>>>
>>>> Hello everyone,
>>>>
>>>> I would like to start a discussion thread on KIP-1304. In this KIP, we
>>>> plan to add new consumer metrics about paused partitions.
>>>>
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1304%3A+Add+consumer+metric+about+paused+partitions
>>>>
>>>> Please take a look and feel free to share any thoughts.
>>>>
>>>> Thanks,
>>>> PoAn
>>>
>>
