zhaizhibo opened a new pull request, #25626:
URL: https://github.com/apache/pulsar/pull/25626

   
   ### Motivation
   
   When validating namespace bundle ranges via `validateNamespaceBundleRange`, 
the method calls `NamespaceBundleFactory.getBundles(NamespaceName, 
BundlesData)` which constructs a new `NamespaceBundles` object every time. This 
object contains all bundles for the entire namespace, and its construction 
involves expensive string formatting operations (e.g., `toString()` for each 
bundle boundary).
   
   For a namespace with N bundles, operations like `unload` or `clearBacklog` 
that iterate over all bundles call `validateNamespaceBundleRange` once per 
bundle, so the total construction work is O(N²). For example, a single unload 
of a namespace with 4000 bundles performs roughly 4000 × 4000 = 16,000,000 
string operations, because each of the 4000 bundle validations redundantly 
rebuilds all 4000 bundle objects.
   
   The `NamespaceBundleFactory` already maintains a cache (`bundlesCache`) via 
`getBundlesAsync(NamespaceName)` that computes `NamespaceBundles` once and 
reuses it. Using this cache eliminates the repeated O(N) construction per 
validation, reducing the total work from O(N²) to O(N).
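   
The difference in cost can be illustrated with a small, self-contained sketch. This is not Pulsar code: `BundleCostSketch`, `buildBundles`, and the range format are hypothetical stand-ins, assuming only that building the bundle set costs one string-formatting call per bundle.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the cost described above; none of these names are Pulsar APIs.
public class BundleCostSketch {
    // Counts how many bundle range strings have been formatted.
    static final AtomicLong formatCalls = new AtomicLong();

    // Stand-in for constructing the full bundle set of a namespace: O(N) formatting work.
    static List<String> buildBundles(int n) {
        long step = 0x100000000L / n;
        List<String> ranges = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            long lower = i * step;
            long upper = (i == n - 1) ? 0xffffffffL : (i + 1) * step;
            formatCalls.incrementAndGet();
            ranges.add(String.format("0x%08x_0x%08x", lower, upper));
        }
        return ranges;
    }

    // Rebuilds the whole bundle set for every validation: N * N formatting calls.
    static long validateAllRebuilding(int n) {
        List<String> requested = buildBundles(n);
        formatCalls.set(0);
        for (String range : requested) {
            List<String> bundles = buildBundles(n);
            if (!bundles.contains(range)) {
                throw new IllegalArgumentException("Unknown bundle range: " + range);
            }
        }
        return formatCalls.get();
    }

    // Builds the bundle set once and reuses it from a cache: N formatting calls.
    static long validateAllCached(int n) {
        List<String> requested = buildBundles(n);
        formatCalls.set(0);
        Map<String, List<String>> cache = new ConcurrentHashMap<>();
        for (String range : requested) {
            List<String> bundles = cache.computeIfAbsent("tenant/ns", ns -> buildBundles(n));
            if (!bundles.contains(range)) {
                throw new IllegalArgumentException("Unknown bundle range: " + range);
            }
        }
        return formatCalls.get();
    }
}
```

With `n = 100`, `validateAllRebuilding` reports 10,000 formatting calls and `validateAllCached` reports 100, which is the same N² versus N gap described above at a smaller scale.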
   
   ### Modifications
   
   
   - Add `validateNamespaceBundleRangeAsync(NamespaceName, String)` that uses 
`getBundlesAsync()` (cached) instead of `getBundles(NamespaceName, 
BundlesData)` (re-constructed each call). 
   - Change `isBundleOwnedByAnyBroker` to use 
`validateNamespaceBundleRangeAsync`, removing the `BundlesData` parameter since 
bundles are now fetched from cache rather than passed by the caller.
   - Replace existing calls to `validateNamespaceBundleRange` with the 
corresponding asynchronous call `validateNamespaceBundleRangeAsync`.
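   
The shape of the cached, asynchronous validation can be sketched as follows. This is a simplified illustration, not the code in this pull request: it borrows the method names `getBundlesAsync` and `validateNamespaceBundleRangeAsync` from the description, but the class, the `String`-based signatures, the `loadBundles` helper, and the use of `IllegalArgumentException` are all assumptions made for the example.

```java
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; the types and signatures here are simplified assumptions.
public class AsyncBundleValidationSketch {
    // One cached future per namespace, so the bundle set is computed once and reused.
    private final ConcurrentMap<String, CompletableFuture<Set<String>>> bundlesCache =
            new ConcurrentHashMap<>();

    // Counts how often the expensive load actually runs.
    final AtomicInteger loads = new AtomicInteger();

    // Hypothetical loader standing in for the expensive bundle construction.
    private CompletableFuture<Set<String>> loadBundles(String namespace) {
        loads.incrementAndGet();
        return CompletableFuture.completedFuture(
                Set.of("0x00000000_0x80000000", "0x80000000_0xffffffff"));
    }

    // Returns the cached bundle set, loading it only on the first request.
    CompletableFuture<Set<String>> getBundlesAsync(String namespace) {
        return bundlesCache.computeIfAbsent(namespace, this::loadBundles);
    }

    // Validates a range against the cached bundles without rebuilding them.
    CompletableFuture<String> validateNamespaceBundleRangeAsync(String namespace,
                                                                String bundleRange) {
        return getBundlesAsync(namespace).thenCompose(bundles -> {
            if (bundles.contains(bundleRange)) {
                return CompletableFuture.completedFuture(bundleRange);
            }
            return CompletableFuture.<String>failedFuture(
                    new IllegalArgumentException("Invalid bundle range: " + bundleRange));
        });
    }
}
```

Because the caller no longer supplies the bundle data, a method such as `isBundleOwnedByAnyBroker` only needs the namespace and the range string, which is consistent with dropping the `BundlesData` parameter.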
   
   ### Verifying this change
   
   - [ ] Make sure that the change passes the CI checks.
   
   
   - [ ] Dependencies (add or upgrade a dependency)
   - [ ] The public API
   - [ ] The schema
   - [ ] The default values of configurations
   - [ ] The threading model
   - [ ] The binary protocol
   - [ ] The REST endpoints
   - [ ] The admin CLI options
   - [ ] The metrics
   - [ ] Anything that affects deployment


