merlimat opened a new pull request, #25638:
URL: https://github.com/apache/pulsar/pull/25638

   ## Summary
   
   Restructure `testLoadBalancerServiceUnitTableViewSyncer` to stop chasing timing bugs.
   
   - **Activate/deactivate the syncer by calling `primaryLoadManager.monitor()` directly** instead of forcing leader transitions via `makeSecondaryAsLeader() + makePrimaryAsLeader()`. The double transition queues `playLeader()` behind a still-running `playFollower()` on the single-threaded `loadManagerExecutor`, which was the root cause of the repeated 30s+ timeouts. The 60s Awaitility bumps in #25596 / #25427 / #25378 only treated that symptom. Calling `monitor()` (the same hook the periodic scheduler uses) makes activation deterministic and synchronous.
   - **Drop pulsar4.** The original test added two extra brokers, but only one of them ever exercised the cross-impl syncer path; the other was redundant in each parametrization. Always use pulsar3 with the other table view impl, so both parametrizations get equivalent coverage for half the cluster work. This also removes the producer-timeout hot spot on `persistent://pulsar/system/loadbalancer-service-unit-state`.
   - **Restore 30s Awaitility timeouts.** With `monitor()` driving syncer state 
synchronously, the longer 60s budgets are no longer needed.
   - **Reorganize into explicit phases** (activate → cross-impl lookup → 
disconnect → re-register → deactivate) with the SLA-monitor-topic durability 
check preserved.
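
The serialization problem described in the first bullet can be reproduced outside Pulsar with a plain single-threaded executor. The sketch below is only an illustration under stated assumptions: `ExecutorSerializationSketch`, `syncerActive`, and the local `monitor()` are hypothetical stand-ins, not the broker's actual classes or methods. It shows that a task queued behind a slow one cannot run until the slow one finishes, while a direct call takes effect on the caller's thread immediately.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class ExecutorSerializationSketch {

    // Hypothetical stand-in for the syncer's "active" state.
    static final AtomicBoolean syncerActive = new AtomicBoolean(false);

    // Hypothetical stand-in for a monitor hook: flips the flag on the calling thread.
    static void monitor() {
        syncerActive.set(true);
    }

    /**
     * Queues a slow "follower" task and then a "leader" task on one thread.
     * Returns the milliseconds elapsed before the leader task got to run.
     */
    static long queuedActivationMillis(long followerMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            long start = System.nanoTime();
            // Slow task that occupies the only worker thread.
            executor.submit(() -> {
                try {
                    Thread.sleep(followerMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // This task cannot start until the slow task above has finished.
            Future<Long> leader = executor.submit(() -> {
                monitor();
                return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            });
            return leader.get(followerMillis + 5_000, TimeUnit.MILLISECONDS);
        } finally {
            executor.shutdownNow();
        }
    }

    /** Calls the hook directly; the flag is set before this method returns. */
    static boolean directActivation() {
        syncerActive.set(false);
        monitor();
        return syncerActive.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("queued activation took ms: " + queuedActivationMillis(200));
        System.out.println("direct activation active: " + directActivation());
    }
}
```

In the queued case the delay grows with however long the preceding task runs, which is why a fixed Awaitility budget can be exceeded; the direct call has no such dependency.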
   
   Net: 113 insertions(+), 235 deletions(-).
   
   ## Test plan
   
   - [x] Six consecutive local runs pass cleanly across both 
`serviceUnitStateTableViewClassName` parametrizations
   - [ ] CI is green


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
