merlimat opened a new pull request, #25980: URL: https://github.com/apache/pulsar/pull/25980
Implements [PIP-483: Scalable Topic Auto Split/Merge](https://github.com/apache/pulsar/blob/master/pip/pip-483.md) (sub-PIP of PIP-468). The controller now scales a scalable topic's segment count automatically, default-on cluster-wide, bounded by hard caps and asymmetric cooldowns. End-to-end flow: each broker reports per-segment load to the metadata store → the controller leader reads load + consumer counts → a pure decision function decides → it dispatches to the existing split/merge protocols. ## Increments (one commit each, each independently tested) 1. **Decision core** — `AutoScalePolicyEvaluator.decide(...)`, a pure I/O-free function: split pass (consumer-count then traffic, gated by `splitCooldown` + `maxSegments`) then merge pass (cold adjacent pair, gated by `mergeCooldown` + `mergeWindow` + `minSegments` + `maxDagDepth`). Plus `SegmentLoadStats`, `AutoScaleConfig`, `AutoScaleDecision`, `SegmentLoadSample`, and `SegmentLayout.mergeDepth()`. 2. **Broker config + resolver** — 17 `scalableTopic*` knobs (auto-scale on, `maxSegments=64`, `maxDagDepth=10`, `splitCooldown=60s`, `mergeCooldown/Window=300s`, four split + four merge rate thresholds, etc.) and `AutoScaleConfig.fromBrokerConfig()`. 3. **Load record + reporter** — `ScalableTopicResources` load get/put/delete paths and `SegmentLoadReporter`, which writes a sample only when a rate changed materially (default ±25%) or crossed zero, keeping metadata write volume bounded. 4. **Controller wiring** — periodic `AutoScaleTick` (traffic) + event-driven evaluation on stream/checkpoint consumer registration (consumer-count scale-up within seconds, no polling). Serialized by an in-flight guard; `splitSegment`/`mergeSegments` set the cooldowns so manual ops count too. 5. **Broker sweep** — `BrokerService` periodically sweeps the `segment://` topics it hosts, reads their in/out msg+byte rates, and feeds the reporter. This populates the records the controller reads. ## Design highlights - **Splits are fast, merges are lazy.** Splits fire as soon as conditions hold (only `splitCooldown` coalesces bursts); merges require a durable cold window and a longer cooldown. - **Max DAG depth caps merges, not splits** — bounds split↔merge flip-flopping while never blocking a throughput-driven split. - **The controller reacts, it doesn't poll** — load is pushed to metadata on material change; the controller reads it. No cross-broker RPC fan-out. ## Test plan - [x] `AutoScalePolicyEvaluatorTest` (20 cases) — every split/merge rule, caps, cooldowns, hysteresis, adjacency, max-depth. - [x] `AutoScaleConfigTest`, `SegmentLoadReporterTest`, `SegmentLayoutTest` (mergeDepth). - [x] `ScalableTopicControllerAutoScaleTest` — load-driven split, consumer-driven split (event path), cold-pair merge, disabled no-op, cooldown blocks second split (against in-memory store + mocked admin). - [x] `V5SegmentLoadReporterTest` — end-to-end: producing across a scalable topic + running the sweep writes a load record per segment. - [x] Full `org.apache.pulsar.broker.service.scalable.*` suite + checkstyle. ## Follow-up (separate PR) Per-namespace and per-topic `AutoScalePolicyOverride` + `admin.scalableTopics().set/get/removeAutoScalePolicy(...)`. The feature is fully functional via the cluster config without it. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
