merlimat opened a new pull request, #25365:
URL: https://github.com/apache/pulsar/pull/25365

   ## Flaky test failure
   
   ```
   org.awaitility.core.ConditionTimeoutException in testBlockByPublishRateLimiting
   waiting for rate limit metric assertions to match expected exact counts
   ```
   
   ## Summary
   
   - Fix the flaky `MessagePublishBufferThrottleTest.testBlockByPublishRateLimiting` test, which was timing-sensitive in three ways:
     1. The intermediate metric assertions (`PAUSED == 1`, `RESUMED == 0`) were plain assertions not wrapped in `Awaitility`, so they raced with asynchronous OpenTelemetry metric collection.
     2. The flush timeout (`flushFuture.get(2, TimeUnit.SECONDS)`) depended on specific timing relative to the 5s BookKeeper delay.
     3. The final assertion expected exactly 10 PAUSED and 10 RESUMED events, but the exact count depends on message-processing granularity, which varies under load.
   
   - Rewrite the test to be timing-independent:
     - Use `Awaitility` for all metric assertions
     - Use `>= 1` instead of exact counts for intermediate checks
     - Replace the flush timeout check with a synchronous `producer.flush()`
     - Assert `PAUSED == RESUMED` (balanced) at the end instead of exact counts. This is the actual invariant under test: every pause must be followed by a resume.
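
The timing-independent pattern described above can be sketched roughly as follows. This is not the code from the PR: the class, the `awaitUntil` helper, and the scheduled "broker" tasks are hypothetical stand-ins so the example runs on the JDK alone. In the real test, `Awaitility.await().untilAsserted(...)` would do the polling, and the counters would come from the OpenTelemetry metric reader.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

final class BalancedPauseResumeSketch {

    // Hypothetical stand-in for Awaitility: poll until the condition holds or the timeout expires.
    static boolean awaitUntil(BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        try {
            while (!condition.getAsBoolean()) {
                if (System.nanoTime() >= deadline) {
                    return false;
                }
                Thread.sleep(10);
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while polling", e);
        }
    }

    // Simulates `cycles` pause/resume pairs happening asynchronously, then checks
    // the counters without assuming how many pairs occurred.
    static boolean pauseResumeStaysBalanced(int cycles) {
        AtomicLong paused = new AtomicLong();   // stand-in for the PAUSED metric
        AtomicLong resumed = new AtomicLong();  // stand-in for the RESUMED metric
        ScheduledExecutorService broker = Executors.newSingleThreadScheduledExecutor();
        for (int i = 0; i < cycles; i++) {
            long start = 20L * i;
            broker.schedule(() -> { paused.incrementAndGet(); }, start, TimeUnit.MILLISECONDS);
            broker.schedule(() -> { resumed.incrementAndGet(); }, start + 10, TimeUnit.MILLISECONDS);
        }

        // Intermediate check: at least one pause was observed, not an exact count.
        boolean sawPause = awaitUntil(() -> paused.get() >= 1, 5_000);

        // Stand-in for the synchronous producer.flush(): block until all scheduled work is done.
        // Already-scheduled delayed tasks still run after shutdown() by default.
        broker.shutdown();
        boolean drained;
        try {
            drained = broker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while draining", e);
        }

        // Final check: every pause was matched by a resume, whatever the total was.
        boolean balanced = awaitUntil(() -> paused.get() == resumed.get(), 5_000);
        return sawPause && drained && balanced;
    }

    public static void main(String[] args) {
        System.out.println("balanced=" + pauseResumeStaysBalanced(3));
    }
}
```

Neither check depends on how many pause/resume pairs occur, so the result should not change if load alters how the work is batched.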
   
   ## Documentation
   
   - [x] `doc-not-needed`
   (Your PR doesn't need any doc update)
   
   ## Matching PR in forked repository
   
   _No response_
   
   ### Tip
   
   Add the labels `ready-to-test` and `area/test` to trigger the CI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
