[PR] [improve][ml] Warn and emit metric when cursor ack state exceeds persist limits [pulsar]

via GitHub Fri, 17 Apr 2026 04:53:54 -0700


ng-galien opened a new pull request, #25548:
URL: https://github.com/apache/pulsar/pull/25548


   Fixes #25540
   
   ### Motivation
   
   When the number of individually deleted message ranges in a cursor exceeds 
`managedLedgerMaxUnackedRangesToPersist` (default 10 000), or the number of 
batch deleted indexes exceeds `managedLedgerMaxBatchDeletedIndexToPersist` 
(default 10 000), the excess ack state is silently truncated during cursor 
persistence. On broker restart, those acks are lost and already-acknowledged 
messages are redelivered.
   
   Today there is no signal when this crossing happens. The only way to detect 
it is to actively monitor `totalNonContiguousDeletedMessagesRange` via the 
Admin API or Prometheus and compare it manually to the configured limit. Issue 
#25540 and the maintainer's follow-up comments describe the concrete impact 
(consumer saturation after restart, compounding redeliveries) and ask for both 
a log and a broker-level metric.
   
   ### Modifications
   
   **Feature (commit 1) — warn and emit metric on overflow:**
   
   - `ManagedCursorImpl.buildIndividualDeletedMessageRanges()` and 
`buildBatchEntryDeletionIndexInfoList()` emit a `WARN` log exactly once per 
crossing (edge-detected via an `AtomicBoolean`), including tuning advice: 
`managedLedgerMaxUnackedRangesToPersist`, 
`managedLedgerMaxBatchDeletedIndexToPersist`, 
`managedLedgerPersistIndividualAckAsLongArray`, 
`managedCursorInfoCompressionType`.
   - Two broker-level OpenTelemetry counters are incremented on every truncated 
persist (no topic-level attributes, so no cardinality growth):
     - `pulsar.broker.managed_ledger.cursor.persist.overflow.range.count`
     - `pulsar.broker.managed_ledger.cursor.persist.overflow.batch.index.count`
   - `conf/broker.conf` documents both signals next to the corresponding 
settings, so operators discover them while tuning the limits.
   - The overflow signal is derived from iteration state already computed, 
avoiding an `O(N)` `size()` call on the hot path. The exact range count is only 
read on the first crossing (guarded by the edge detector), so precise log 
context costs nothing on subsequent persists.
   
   **Incidental fix (commit 2) — off-by-one in the ranges cap:**
   
   While writing a reload-based regression test for the new warning, 
`buildIndividualDeletedMessageRanges` turned out to persist `maxRanges + 1` 
entries rather than `maxRanges`. The `forEach` callback used to append each 
range and then return `rangeList.size() <= maxRanges`, keeping iteration alive 
until one extra entry had been added. The regression dates back to #3819 
(2019), when the original `stream().limit(N)` was translated to the `forEach` 
callback without preserving the strict cap; #18357 later migrated to 
`forEachRawRange` and carried the bug forward. The sibling 
`buildBatchEntryDeletionIndexInfoList` has always used the correct 
check-before-add pattern.
   
   Beyond the obvious "config says 10 000, runtime persists 10 001" contract 
violation, the off-by-one would cause the new WARN log and counter to fire 
spuriously when `totalRanges == maxRanges + 1`: the list actually contained 
every range (no truncation), but the old `rangeList.size() > maxRanges` check 
reported "overflow". Shipping the new signal without this fix would undermine 
the operator trust we are trying to build.
   
   The fix restructures the callback into a check-before-add shape symmetric 
with `buildBatchEntryDeletionIndexInfoList`, using a `MutableBoolean` as the 
overflow flag. `MutableBoolean` is preferred over `AtomicBoolean` because the 
method runs under the cursor write lock — no concurrent readers — and the 
codebase already uses `MutableBoolean` in this file (e.g. `readEntries` around 
line 2219) for the same purpose. The change preserves the 
zero-`Range<T>`-allocation benefit introduced in #18357.
   
   ### Verifying this change
   
   This change added tests and can be verified as follows:
   
   - `ManagedCursorTest.testPersistOverflowRangesCounter` (parameterized over 
overflow / no-overflow / exact-limit cases): asserts that the OTel counter 
direction matches the scenario, and in the overflow case reopens the ledger 
after close and verifies exactly `maxRanges` entries are recovered — this 
second assertion is the regression guard for the off-by-one fix. It fails on 
the unpatched cursor (`expected 5 but found 6`) and passes with the fix.
   - `ManagedCursorTest.testPersistOverflowBatchIndexesCounter` (parameterized 
similarly): asserts the batch-index counter direction under overflow, 
no-overflow and exact-limit scenarios. No cap assertion needed here because the 
batch variant always used the correct pattern.
   - Counter assertions are bounded to a boolean direction (not exact counts) 
because the number of persist cycles triggered during `ledger.close()` is an 
internal cadence that we should not bind the test to.
   
   ### Does this pull request potentially affect one of the following parts:
   
   - [x] The metrics — adds two broker-level OTel counters (documented in 
`broker.conf` and described above).
   
   Other items: not affected. No API/schema/protocol changes, no default config 
changes, no threading model change (write-lock semantics preserved).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[PR] [improve][ml] Warn and emit metric when cursor ack state exceeds persist limits [pulsar]

Reply via email to