merlimat opened a new pull request, #25980:
URL: https://github.com/apache/pulsar/pull/25980

   Implements [PIP-483: Scalable Topic Auto 
Split/Merge](https://github.com/apache/pulsar/blob/master/pip/pip-483.md) 
(sub-PIP of PIP-468). The controller now scales a scalable topic's segment 
count automatically, default-on cluster-wide, bounded by hard caps and 
asymmetric cooldowns.
   
   End-to-end flow: each broker reports per-segment load to the metadata store 
→ the controller leader reads load + consumer counts → a pure decision function 
decides → it dispatches to the existing split/merge protocols.
   
   ## Increments (one commit each, each independently tested)
   
   1. **Decision core** — `AutoScalePolicyEvaluator.decide(...)`, a pure 
I/O-free function: split pass (consumer-count then traffic, gated by 
`splitCooldown` + `maxSegments`) then merge pass (cold adjacent pair, gated by 
`mergeCooldown` + `mergeWindow` + `minSegments` + `maxDagDepth`). Plus 
`SegmentLoadStats`, `AutoScaleConfig`, `AutoScaleDecision`, 
`SegmentLoadSample`, and `SegmentLayout.mergeDepth()`.
   2. **Broker config + resolver** — 17 `scalableTopic*` knobs (auto-scale on, 
`maxSegments=64`, `maxDagDepth=10`, `splitCooldown=60s`, 
`mergeCooldown/Window=300s`, four split + four merge rate thresholds, etc.) and 
`AutoScaleConfig.fromBrokerConfig()`.
   3. **Load record + reporter** — `ScalableTopicResources` load get/put/delete 
paths and `SegmentLoadReporter`, which writes a sample only when a rate changed 
materially (default ±25%) or crossed zero, keeping metadata write volume 
bounded.
   4. **Controller wiring** — periodic `AutoScaleTick` (traffic) + event-driven 
evaluation on stream/checkpoint consumer registration (consumer-count scale-up 
within seconds, no polling). Serialized by an in-flight guard; 
`splitSegment`/`mergeSegments` set the cooldowns so manual ops count too.
   5. **Broker sweep** — `BrokerService` periodically sweeps the `segment://` 
topics it hosts, reads their in/out msg+byte rates, and feeds the reporter. 
This populates the records the controller reads.
   
   ## Design highlights
   
   - **Splits are fast, merges are lazy.** Splits fire as soon as conditions 
hold (only `splitCooldown` coalesces bursts); merges require a durable cold 
window and a longer cooldown.
   - **Max DAG depth caps merges, not splits** — bounds split↔merge 
flip-flopping while never blocking a throughput-driven split.
   - **The controller reacts, it doesn't poll** — load is pushed to metadata on 
material change; the controller reads it. No cross-broker RPC fan-out.
   
   ## Test plan
   
   - [x] `AutoScalePolicyEvaluatorTest` (20 cases) — every split/merge rule, 
caps, cooldowns, hysteresis, adjacency, max-depth.
   - [x] `AutoScaleConfigTest`, `SegmentLoadReporterTest`, `SegmentLayoutTest` 
(mergeDepth).
   - [x] `ScalableTopicControllerAutoScaleTest` — load-driven split, 
consumer-driven split (event path), cold-pair merge, disabled no-op, cooldown 
blocks second split (against in-memory store + mocked admin).
   - [x] `V5SegmentLoadReporterTest` — end-to-end: producing across a scalable 
topic + running the sweep writes a load record per segment.
   - [x] Full `org.apache.pulsar.broker.service.scalable.*` suite + checkstyle.
   
   ## Follow-up (separate PR)
   
   Per-namespace and per-topic `AutoScalePolicyOverride` + 
`admin.scalableTopics().set/get/removeAutoScalePolicy(...)`. The feature is 
fully functional via the cluster config without it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to