J-HowHuang opened a new pull request, #15368: URL: https://github.com/apache/pinot/pull/15368
## Description To include more info for consuming segments during table rebalance. It's important because users will likely to experience unexpected Kafka topic re-consumption. would be better to show relevant information so they can expect that. New info will be added under `rebalanceSummaryResult.segmentInfo`: ``` "consumingSegmentSummary": { "numConsumingSegmentsToBeMoved": 5, "maxBytesConsumingSegmentsToCatchUp": 81, "bytesConsumingSegmentsToCatchUpPerServer": { "Server_<redacted>": 375 } } ``` ## Changes While doing rebalance summary, `TableRebalancer` will get `consumingSegmentInfo` of the table using `ConsumingSegmentInfoReader`, which is the same as the controller API `GET /tables/{tableName}/consumingSegmentsInfo` uses, to get the latest offset of each consuming segment w.r.t. its stream partition. It also gets segment ZK metadata of these consuming segments to acquire their `startOffset`. Then the size of each consuming segment is calculated and turned into the info in the summary. If any consuming segment failed to get `consumingSegmentInfo`, `maxBytesConsumingSegmentsToCatchUp` and `bytesConsumingSegmentsToCatchUpPerServer` will become `null` ## Tests With `pinot-admin.sh QuickStart -type STREAM` and set 2 servers. Start the cluster, and remove `DefaultTenant_REALTIME` tag for one server, then dry-run rebalance the table `airlineStats_REALTIME` (10-partition Kafka stream): ``` { "jobId": "", "status": "DONE", "description": "Dry-run mode", "rebalanceSummaryResult": { "serverInfo": { ... } }, "segmentInfo": { "totalSegmentsToBeMoved": 5, "maxSegmentsAddedToASingleServer": 5, "estimatedAverageSegmentSizeInBytes": 0, "totalEstimatedDataToBeMovedInBytes": 0, "replicationFactor": { "valueBeforeRebalance": 1, "expectedValueAfterRebalance": 1 }, "numSegmentsInSingleReplica": { "valueBeforeRebalance": 10, "expectedValueAfterRebalance": 10 }, "numSegmentsAcrossAllReplicas": { "valueBeforeRebalance": 10, "expectedValueAfterRebalance": 10 }, "consumingSegmentSummary": { "numConsumingSegmentsToBeMoved": 5, "maxBytesConsumingSegmentsToCatchUp": 81, "bytesConsumingSegmentsToCatchUpPerServer": { "Server_<redacted>": 375 } } } }, "instanceAssignment": { "CONSUMING": { ... } }, "segmentAssignment": { ... } } ``` Run `POST http://localhost:9000/tables/airlineStats_REALTIME/forceCommit`, dry-run rebalance again. Notice that total segments doubled (10 consuming segments were committed), and now there's no bytes to catch up during rebalance. ``` { "jobId": "", "status": "DONE", "description": "Dry-run mode", "rebalanceSummaryResult": { "serverInfo": { ... } }, "segmentInfo": { "totalSegmentsToBeMoved": 5, "maxSegmentsAddedToASingleServer": 5, "estimatedAverageSegmentSizeInBytes": 43048, "totalEstimatedDataToBeMovedInBytes": 215240, "replicationFactor": { "valueBeforeRebalance": 1, "expectedValueAfterRebalance": 1 }, "numSegmentsInSingleReplica": { "valueBeforeRebalance": 20, "expectedValueAfterRebalance": 20 }, "numSegmentsAcrossAllReplicas": { "valueBeforeRebalance": 20, "expectedValueAfterRebalance": 20 }, "consumingSegmentSummary": { "numConsumingSegmentsToBeMoved": 0, "maxBytesConsumingSegmentsToCatchUp": 0, "bytesConsumingSegmentsToCatchUpPerServer": { "Server_100.114.242.49_7050": 0 } } } }, "instanceAssignment": { "CONSUMING": { ... } }, "segmentAssignment": { ... } } ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org