sanghyeok An created KAFKA-20418:
------------------------------------

             Summary: Consider adding metrics for pending transaction markers 
and oldest transaction age
                 Key: KAFKA-20418
                 URL: https://issues.apache.org/jira/browse/KAFKA-20418
             Project: Kafka
          Issue Type: Improvement
            Reporter: sanghyeok An
            Assignee: sanghyeok An


When transaction handling becomes slow, it is difficult to tell whether the 
delay is coming from the transaction state log append path, the post-EndTxn 
marker completion path, or transactions remaining in coordinator state longer 
than expected.

The broker already exposes some transaction-related metrics, but it is still 
hard to answer questions such as:
 * how many transactions are currently waiting for marker completion
 * whether pending marker backlog is growing or aging
 * whether transactions are staying in a given state for unusually long periods

Adding a small set of metrics in this area could improve operability by making 
it easier to identify transaction backlog and long-lived transactions in the 
coordinator.

Suggested metrics:
 * pending-marker-count
 * pending-marker-oldest-age-ms
 * oldest-transaction-age-ms{state}
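
As a rough illustration of what these three gauges could report, here is a 
minimal Java sketch. It is not the coordinator's actual code: the class 
TxnMetricsSketch, the nested types TrackedTxn and PendingMarker, the simplified 
TxnState enum, and all field names are hypothetical and exist only for this 
example. It assumes the coordinator can supply, for each transaction, the time 
it entered its current state and, for each pending marker, the time it was 
enqueued.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these types are Kafka classes.
class TxnMetricsSketch {
    // Simplified stand-in for the coordinator's transaction states.
    enum TxnState { ONGOING, PREPARE_COMMIT, PREPARE_ABORT }

    // A transaction held in coordinator state, with the time it entered that state.
    static final class TrackedTxn {
        final String transactionalId;
        final TxnState state;
        final long stateEnteredMs;
        TrackedTxn(String transactionalId, TxnState state, long stateEnteredMs) {
            this.transactionalId = transactionalId;
            this.state = state;
            this.stateEnteredMs = stateEnteredMs;
        }
    }

    // A transaction whose markers have not all been acknowledged yet.
    static final class PendingMarker {
        final String transactionalId;
        final long enqueuedMs;
        PendingMarker(String transactionalId, long enqueuedMs) {
            this.transactionalId = transactionalId;
            this.enqueuedMs = enqueuedMs;
        }
    }

    private final List<TrackedTxn> txns = new ArrayList<>();
    private final List<PendingMarker> pendingMarkers = new ArrayList<>();

    void track(TrackedTxn txn) { txns.add(txn); }
    void enqueueMarker(PendingMarker marker) { pendingMarkers.add(marker); }

    // pending-marker-count: transactions currently waiting for marker completion.
    int pendingMarkerCount() { return pendingMarkers.size(); }

    // pending-marker-oldest-age-ms: age of the oldest pending marker, 0 if none.
    long pendingMarkerOldestAgeMs(long nowMs) {
        return pendingMarkers.stream()
                .mapToLong(m -> nowMs - m.enqueuedMs)
                .max()
                .orElse(0L);
    }

    // oldest-transaction-age-ms{state}: per state, the longest time any
    // transaction has spent in that state; 0 for states with no transactions.
    Map<TxnState, Long> oldestTransactionAgeMs(long nowMs) {
        Map<TxnState, Long> oldest = new EnumMap<>(TxnState.class);
        for (TxnState s : TxnState.values()) {
            oldest.put(s, 0L);
        }
        for (TrackedTxn t : txns) {
            oldest.merge(t.state, nowMs - t.stateEnteredMs, Math::max);
        }
        return oldest;
    }

    public static void main(String[] args) {
        TxnMetricsSketch sketch = new TxnMetricsSketch();
        sketch.track(new TrackedTxn("txn-a", TxnState.ONGOING, 1_000L));
        sketch.track(new TrackedTxn("txn-b", TxnState.PREPARE_COMMIT, 4_000L));
        sketch.enqueueMarker(new PendingMarker("txn-b", 4_000L));
        long nowMs = 10_000L;
        System.out.println(sketch.pendingMarkerCount());
        System.out.println(sketch.pendingMarkerOldestAgeMs(nowMs));
        System.out.println(sketch.oldestTransactionAgeMs(nowMs));
    }
}
```

In this sketch the clock is passed in as nowMs so the values are easy to 
reason about. A growing pending-marker-count together with a rising 
pending-marker-oldest-age-ms would point at the marker completion path, while 
a high oldest-transaction-age-ms for ONGOING with no pending markers would 
point at long-lived transactions instead.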

These metrics could be useful in scenarios such as:
 * transaction completion appears slow even though request handling itself is 
not obviously delayed
 * marker propagation is backed up due to inter-broker issues or broker-side 
delays
 * some transactions remain in ONGOING or PREPARE_* states for much longer than 
expected
 * operators need to distinguish transaction state append issues from marker 
completion issues or long-lived transaction state

 

The coordinator already tracks transaction state and pending markers 
internally, so exposing related metrics may be feasible and useful for broker 
operations.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
