merlimat opened a new pull request, #25365:
URL: https://github.com/apache/pulsar/pull/25365
## Flaky test failure
```
org.awaitility.core.ConditionTimeoutException in
testBlockByPublishRateLimiting
waiting for rate limit metric assertions to match expected exact counts
```
## Summary
- Fix flaky `MessagePublishBufferThrottleTest.testBlockByPublishRateLimiting`, which was
  timing-sensitive in multiple ways:
  1. Intermediate metric assertions (`PAUSED == 1`, `RESUMED == 0`) were bare calls
     without `Awaitility`, racing with async OpenTelemetry metric collection
  2. The flush timeout (`flushFuture.get(2, TimeUnit.SECONDS)`) relied on specific
     timing relative to the 5s BookKeeper delay
  3. The final assertion expected exactly 10 PAUSED and 10 RESUMED events, but the exact
     count depends on message processing granularity, which varies under load
- Rewrite the test to be timing-independent:
  - Use `Awaitility` for all metric assertions
  - Use `>= 1` instead of exact counts for intermediate checks
  - Replace the flush timeout test with synchronous `producer.flush()`
  - Assert `PAUSED == RESUMED` (balanced) at the end instead of exact counts. This is
    the actual invariant being tested (every pause must be followed by a resume)
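
The assertion pattern described above can be sketched with the JDK alone. This is not the code from the PR. The class `BalancedMetricsSketch`, the helper `awaitUntil` (a rough stand-in for what `Awaitility` provides), and the simulated `publishAsync` are all hypothetical names introduced for illustration. The two counters stand in for the PAUSED and RESUMED metrics.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

public class BalancedMetricsSketch {

    // Hypothetical stand-ins for the PAUSED and RESUMED rate-limit counters.
    private final AtomicLong paused = new AtomicLong();
    private final AtomicLong resumed = new AtomicLong();

    // Poll until the condition holds or the timeout elapses
    // (assumption: a simplified version of what Awaitility does).
    static boolean awaitUntil(BooleanSupplier condition, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            pause(10);
        }
        return true;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Simulated broker: records `cycles` pause/resume pairs on another thread.
    CompletableFuture<Void> publishAsync(int cycles) {
        return CompletableFuture.runAsync(() -> {
            for (int i = 0; i < cycles; i++) {
                paused.incrementAndGet();
                pause(5);
                resumed.incrementAndGet();
            }
        });
    }

    // Returns true when both the intermediate and the final checks pass.
    static boolean runScenario(int cycles) {
        BalancedMetricsSketch s = new BalancedMetricsSketch();
        CompletableFuture<Void> inFlight = s.publishAsync(cycles);
        // Intermediate check: a polled lower bound, not an exact count.
        boolean sawPause = awaitUntil(() -> s.paused.get() >= 1, Duration.ofSeconds(5));
        // Block until all work is done, analogous to a synchronous flush.
        inFlight.join();
        // Final check: the counters are balanced, whatever the count turned out to be.
        boolean balanced = awaitUntil(() -> s.paused.get() == s.resumed.get(), Duration.ofSeconds(5));
        return sawPause && balanced;
    }

    public static void main(String[] args) {
        System.out.println(runScenario(7) ? "balanced" : "unbalanced");
    }
}
```

Neither check depends on how many pause/resume cycles actually occur or on how long each one takes, which is the property that makes the rewritten test robust under load.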
## Documentation
- [x] `doc-not-needed`
(Your PR doesn't need any doc update)