Thanks, overall LGTM.

AK1_5: I see paused-partitions-count and paused-partitions both representing
int gauges. Can we unify them into simply paused-partitions? A *-count suffix can
also lead users to expect a windowed/cumulative count, which isn’t the case here.

-Aditya

On 2026/04/09 12:43:10 PoAn Yang wrote:
> Hi Chia-Ping, Aditya, and Sahil,
>
> Thanks for your suggestions.
>
> chia_00, AK1, SD2: Aligning the naming is better. I changed all metrics to
> use the `paused-partitions` prefix.
>
> chia_01: After checking the code again, it's better to follow the current
> per-partition metrics like records-lag.
> I changed the per-partition paused-partitions* metrics to use INFO level.
>
> AK2: Changed paused-partitions-count to use the consumer-coordinator-metrics group.
>
> AK3: Added paused-partitions-rate/paused-partitions-total at both the
> consumer and per-partition levels.
> Since existing per-partition metrics use INFO level, I changed these to use
> INFO as well.
>
> AK4: Added a note about cardinality to the consumer-fetch-manager-metrics
> paragraph.
>
> SD1: Changed to use -1 as the default value
> for paused-partitions-duration-seconds.
>
> SD3: Mentioned in Proposed Changes that per-partition metrics are reset on
> partition reassignment.
>
> Kind regards,
> PoAn
>
> On Tue, Apr 7, 2026 at 12:17 PM Sahil Devgon <[email protected]> wrote:
>
> > Hello PoAn,
> > Thanks for the KIP, I have a few comments that we may consider adding to
> > the KIP:
> > 1. One thing I noticed is for partition-paused-time-ms, returning 0 when a
> > partition is not paused could be slightly ambiguous since it's the same
> > value a freshly paused partition would return. Would you consider returning
> > -1 to indicate "not paused" (consistent with how partition-paused uses
> > 0/1)? Or if 0 is preferred, a clear doc note would go a long way in
> > preventing false positives in monitoring setups.
> >
> > 2. Adding to Chia-Ping and Aditya's naming suggestions,
> > partition-paused-time-ms reads as "time in milliseconds" but semantically
> > it measures elapsed duration since pause. A name like
> > paused-partition-duration-ms/paused-partition-duration-seconds would better
> > communicate intent and align with naming conventions used in other Kafka
> > duration metrics (e.g., records-lag, fetch-latency-avg).
> >
> > 3. The test plan mentions verifying that the pause timestamp is "reset on
> > partition reassignment"; it would be helpful to also describe this
> > behavior explicitly in the Proposed Changes section, not just the test
> > plan. For example, calling out that the pause state is cleared on
> > reassignment regardless of prior pause status would make the spec feel
> > complete. This is especially relevant for rebalance-heavy workloads where
> > partitions move around frequently.
> >
> > Best,
> > Sahil Devgon
> >
> > On Mon, Apr 6, 2026 at 4:34 PM PoAn Yang <[email protected]> wrote:
> >
> > > Hello everyone,
> > >
> > > I would like to start a discussion thread on KIP-1304. In this KIP, we
> > > plan to add new consumer metrics about paused partitions.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1304%3A+Add+consumer+metric+about+paused+partitions
> > >
> > > Please take a look and feel free to share any thoughts.
> > >
> > > Thanks,
> > > PoAn
> >
>
