sanghyeok An created KAFKA-20418:
------------------------------------
Summary: Consider adding metrics for pending transaction markers
and oldest transaction age
Key: KAFKA-20418
URL: https://issues.apache.org/jira/browse/KAFKA-20418
Project: Kafka
Issue Type: Improvement
Reporter: sanghyeok An
Assignee: sanghyeok An
When transaction handling becomes slow, it is difficult to tell whether the
delay is coming from the transaction state log append path, the post-EndTxn
marker completion path, or transactions remaining in coordinator state longer
than expected.
The broker already exposes some transaction-related metrics, but it is still
hard to answer questions such as:
* how many transactions are currently waiting for marker completion
* whether the pending-marker backlog is growing or aging
* whether transactions are staying in a given state for unusually long periods
Adding a small set of metrics in this area could improve operability by making
it easier to identify transaction backlog and long-lived transactions in the
coordinator. Suggested metrics:
* pending-marker-count
* pending-marker-oldest-age-ms
* oldest-transaction-age-ms{state}
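As a rough illustration of what these three gauges might report, here is a minimal Java sketch. It does not use the broker's actual classes: TxnMetricsSketch, Txn, its fields, and the convention that an empty gauge reads 0 are all assumptions made for this example, and the State enum lists only the states named in this issue.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class TxnMetricsSketch {
    // Hypothetical subset of coordinator states, for illustration only.
    enum State { ONGOING, PREPARE_COMMIT, PREPARE_ABORT }

    // Hypothetical snapshot of one transaction as the coordinator might track it.
    static final class Txn {
        final State state;
        final long stateEnteredMs;        // when the transaction entered its current state
        final long markerPendingSinceMs;  // negative means no marker is pending

        Txn(State state, long stateEnteredMs, long markerPendingSinceMs) {
            this.state = state;
            this.stateEnteredMs = stateEnteredMs;
            this.markerPendingSinceMs = markerPendingSinceMs;
        }

        boolean markerPending() {
            return markerPendingSinceMs >= 0;
        }
    }

    // pending-marker-count: transactions still waiting for marker completion.
    static long pendingMarkerCount(List<Txn> txns) {
        return txns.stream().filter(Txn::markerPending).count();
    }

    // pending-marker-oldest-age-ms: age of the longest-waiting marker, 0 if none.
    static long pendingMarkerOldestAgeMs(List<Txn> txns, long nowMs) {
        return txns.stream()
                .filter(Txn::markerPending)
                .mapToLong(t -> nowMs - t.markerPendingSinceMs)
                .max()
                .orElse(0L);
    }

    // oldest-transaction-age-ms{state}: per state, the longest time any
    // transaction has spent in that state; 0 if no transaction is in it.
    static Map<State, Long> oldestTransactionAgeMs(List<Txn> txns, long nowMs) {
        Map<State, Long> oldest = new EnumMap<>(State.class);
        for (State s : State.values()) {
            oldest.put(s, 0L);
        }
        for (Txn t : txns) {
            oldest.merge(t.state, nowMs - t.stateEnteredMs, Math::max);
        }
        return oldest;
    }

    public static void main(String[] args) {
        long nowMs = 10_000L;
        List<Txn> txns = Arrays.asList(
                new Txn(State.ONGOING, 4_000L, -1L),
                new Txn(State.PREPARE_COMMIT, 9_000L, 9_200L),
                new Txn(State.PREPARE_COMMIT, 7_000L, 7_500L));

        System.out.println("pending-marker-count=" + pendingMarkerCount(txns));
        System.out.println("pending-marker-oldest-age-ms=" + pendingMarkerOldestAgeMs(txns, nowMs));
        System.out.println("oldest-transaction-age-ms=" + oldestTransactionAgeMs(txns, nowMs));
    }
}
```

In this sketch the values are computed on demand from a snapshot, which is how a gauge would typically be evaluated at scrape time. Whether the real coordinator would compute them lazily or maintain them incrementally is left open here.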
These metrics could be useful in scenarios such as:
* transaction completion appears slow even though request handling itself is
not obviously delayed
* marker propagation is backed up due to inter-broker issues or broker-side
delays
* some transactions remain in ONGOING or PREPARE_* states for much longer than
expected
* operators need to distinguish transaction state append issues from marker
completion issues or long-lived transaction state
The coordinator already tracks transaction state and pending markers
internally, so exposing related metrics may be feasible and useful for broker
operations.