Hi Chia-Ping, Aditya, and Sahil,

Thanks for your suggestions.

chia_00, AK1, SD2: Aligning the naming is better. I changed all metrics to
use the `paused-partitions` prefix.

chia_01: After checking the code again, it's better to follow the existing
per-partition metrics such as records-lag.
I changed the per-partition paused-partitions* metrics to use the INFO level.

AK2: Moved paused-partitions-count to the consumer-coordinator-metrics group.

AK3: Added paused-partitions-rate/paused-partitions-total at both the
consumer and per-partition levels.
Since the existing per-partition metrics use the INFO level, I changed these
to INFO as well.

AK4: Added a note about cardinality to the consumer-fetch-manager-metrics
paragraph.

SD1: Changed paused-partitions-duration-seconds to use -1 as the default
value.
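
As a rough sketch of the intended semantics only (the class, constant, and
method names below are hypothetical and are not taken from the KIP or the
Kafka code base), -1 lets a monitor tell "not paused" apart from "paused
just now", which would both read 0 otherwise:

```java
import java.util.OptionalLong;

// Hypothetical helper, not part of Kafka: models the value the gauge would report.
public class PausedDurationGauge {
    // Assumed sentinel for "partition is not paused".
    static final long NOT_PAUSED = -1L;

    // pausedAtMs is empty when the partition is not currently paused.
    static long durationSeconds(OptionalLong pausedAtMs, long nowMs) {
        if (!pausedAtMs.isPresent()) {
            return NOT_PAUSED;
        }
        // Clamp at 0 so a clock adjustment can never produce the sentinel.
        return Math.max(0L, (nowMs - pausedAtMs.getAsLong()) / 1000L);
    }

    public static void main(String[] args) {
        System.out.println(durationSeconds(OptionalLong.empty(), 5_000L));    // not paused
        System.out.println(durationSeconds(OptionalLong.of(5_000L), 5_000L)); // just paused
        System.out.println(durationSeconds(OptionalLong.of(1_000L), 5_000L)); // paused for 4 s
    }
}
```

With this shape, an alert on "paused for longer than N seconds" can simply
ignore negative values.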

SD3: Mentioned in Proposed Changes that per-partition metrics are reset on
partition reassignment.

Kind regards,
PoAn

On Tue, Apr 7, 2026 at 12:17 PM Sahil Devgon <[email protected]> wrote:

> Hello PoAn,
> Thanks for the KIP, I have a few comments that we may consider adding to
> the KIP:
> 1. One thing I noticed is for partition-paused-time-ms, returning 0 when a
> partition is not paused could be slightly ambiguous since it's the same
> value a freshly paused partition would return. Would you consider returning
> -1 to indicate "not paused" (consistent with how partition-paused uses
> 0/1)? Or if 0 is preferred, a clear doc note would go a long way in
> preventing false positives in monitoring setups.
>
> 2. Adding to Chia-Ping and Aditya's naming suggestions,
> partition-paused-time-ms reads as "time in milliseconds" but semantically
> it measures elapsed duration since pause. A name like
> paused-partition-duration-ms/paused-partition-duration-seconds would better
> communicate intent and align with naming conventions used in other Kafka
> duration metrics (e.g., records-lag, fetch-latency-avg).
>
> 3. The test plan mentions verifying that the pause timestamp is "reset on
> partition reassignment"; it would be helpful to also describe this
> behavior explicitly in the Proposed Changes section, not just the test
> plan. For example, calling out that the pause state is cleared on
> reassignment regardless of prior pause status would make the spec feel
> complete. This is especially relevant for rebalance-heavy workloads where
> partitions move around frequently.
>
> Best,
> Sahil Devgon
>
> On Mon, Apr 6, 2026 at 4:34 PM PoAn Yang <[email protected]> wrote:
>
> > Hello everyone,
> >
> > I would like to start a discussion thread on KIP-1304. In this KIP, we
> > plan to add new consumer metrics about paused partitions.
> >
> >
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1304%3A+Add+consumer+metric+about+paused+partitions
> >
> > Please take a look and feel free to share any thoughts.
> >
> > Thanks,
> > PoAn
>
