lhotari opened a new pull request, #26026:
URL: https://github.com/apache/pulsar/pull/26026
Follow-up to #26016 (which fixed #24148, "Compaction of `__change_events`
topic is blocking forceful
namespace/topic deletion"). This addresses a residual race that #26016 left
behind.
### Motivation
#26016 fixed two causes of forced topic/namespace deletion hanging while
compaction is in progress
(the poisoned `currentCompaction` cursor deletion, and the `receiveRawAsync`
enqueue-vs-close race),
but `CompactionTest#testForcedNamespaceDeleteWithInflightCompaction` still
failed intermittently
(~50%).
The compaction `RawReader`'s consumer (`RawReaderImpl.RawConsumerImpl`,
created with
`retryOnRecoverableErrors=false`) has an asymmetry between its two
reconnect-failure paths. When a
forced delete fences the topic and disconnects the consumer, the client
reconnects and the failure
surfaces at one of two stages:
- **Subscribe stage** (`ConsumerImpl.connectionOpened().exceptionally`)
consults
`isUnrecoverableError` unconditionally, so a `ServiceNotReadyException`
closes the reader promptly
and fails the in-flight read.
- **Lookup / connection stage** (`ConsumerImpl.connectionFailed`) only
consults `isUnrecoverableError`
inside `if (nonRetriableError || timeout)`. For a *retriable*
`ServiceNotReadyException` within the
lookup deadline it just reconnects — so the in-flight read never fails,
`currentCompaction` never
completes, and the forced deletion blocks past the test's 20s budget.
Which stage wins is a metadata-cache propagation race, hence the flakiness.
(This is independent of
#26009, which only stops redirect-URL pinning across reconnects and does not
change this
`connectionFailed` classification.)
### Modifications
- `RawReaderImpl.RawConsumerImpl` overrides `connectionFailed` to honor
`isUnrecoverableError` before
the retriable / lookup-deadline gate, so a fenced/deleted-topic reconnect
terminates the reader
promptly regardless of which stage surfaces the failure (mirroring the
subscribe-stage behavior).
This only affects the compaction reader
(`retryOnRecoverableErrors=false`); ordinary RawReaders keep
retrying recoverable errors unchanged.
- `PersistentTopic.asyncDeleteCursorWithCleanCompactionLedger` skips waiting
on `currentCompaction`
when the topic is already closing/deleting. A forced delete aborts the
in-flight compaction, so the
cursor deletion must not block on a compaction future that may stay
pending while its reader keeps
reconnecting. This is the deterministic guarantee, independent of client
reconnect timing or error
classification.
### Verifying this change
This change added tests and can be verified as follows:
- `CompactionTest#testForcedDeleteCompletesWhileCompactionStuck`: blocks the
compaction at the
phase-two seek so `currentCompaction` never completes, then asserts
`deleteForcefully()` still
completes promptly (deterministic — exercises the broker-side gate).
-
`RawReaderTest#testConnectionFailureTerminatesReadWhenNotRetryingRecoverableErrors`:
builds a
`retryOnRecoverableErrors=false` consumer with a pending read, invokes
`connectionFailed(ServiceNotReadyException)`, and asserts it terminates
the reader and fails the read
(deterministic — exercises the client-side override).
- `testForcedNamespaceDeleteWithInflightCompaction` passed 10/10 under
repetition with the fix; the
full `CompactionTest` and `RawReaderTest` suites pass.
### Does this pull request potentially affect one of the following parts:
*NONE of these are affected with this change.*
- [ ] Dependencies (add or upgrade a dependency)
- [ ] The public API
- [ ] The schema
- [ ] The default values of configurations
- [ ] The threading model
- [ ] The binary protocol
- [ ] The REST endpoints
- [ ] The admin CLI options
- [ ] The metrics
- [ ] Anything that affects deployment
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]