shauryachats opened a new pull request, #15843: URL: https://github.com/apache/pinot/pull/15843
This change enhances the robustness of `MultiStageReplicaGroupSelector` by enabling segment-to-instance assignment on a per-partition basis, rather than enforcing selection from a single replica group across all partitions. Fixed as part of https://github.com/apache/pinot/issues/15833.

### Previous Behavior:

The existing implementation attempted to assign all segments for a query from a single replica group. If any instance within that group was unable to serve one or more segments (because the segments were in an ERROR state or otherwise unavailable), the selector would attempt the next replica group. If no single replica group could fully satisfy the query, the request would fail.

This approach lacked resilience in scenarios involving partial unavailability within replica groups. Even when other replica groups could serve portions of the data, the selector would not utilize them unless they could serve the complete set of segments.

### Updated Behavior:

With this refactor, replica group selection is now performed independently for each instance partition. The selector attempts to assign segments for each partition from the preferred replica group, but if that group cannot fully serve the required segments, it falls back to alternate replica groups for that partition only.

The segment-to-instance assignment is therefore allowed to span multiple replica groups (one per partition), provided that each partition is internally consistent and all required segments can be served.

### Implementation Details:

- Introduced `InstancePartitions#getInstanceToPartitionIdMap()` to support partition resolution for instances.
- Refactored `tryAssigning` to build a mapping of instance partitions to the segments they require, and to perform replica group selection per instance partition.
- Added `getSelectedInstancesForPartition()` to encapsulate partition-level fallback logic.
- Extracted `computeOptionalSegments()` to separate the logic for handling segments that are served by instances in a non-online state.

### Testing

Added a comprehensive test case verifying fallback behavior across replica groups when some segments are unavailable due to errors.
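
For readers who want a concrete picture of the per-partition fallback described above, here is a minimal, self-contained sketch. It is not the code from this PR: the class `PerPartitionSelectionSketch`, the methods `assign` and `tryReplicaGroup`, and the plain-map data layout are all hypothetical and chosen for illustration only. It assumes each partition has an ordered list of replica groups and that we already know which segments each instance can serve.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: hypothetical names, not Pinot's actual classes or methods.
final class PerPartitionSelectionSketch {

  /**
   * For each partition, try the preferred replica group first, then fall back
   * to the other replica groups of that partition only.
   */
  static Map<String, String> assign(
      Map<Integer, List<List<String>>> replicaGroupsByPartition, // partition -> replica groups -> instances
      Map<Integer, Set<String>> segmentsByPartition,             // partition -> required segments
      Map<String, Set<String>> servableSegmentsByInstance,       // instance -> segments it can serve
      int preferredReplicaGroup) {
    Map<String, String> segmentToInstance = new HashMap<>();
    for (Map.Entry<Integer, Set<String>> entry : segmentsByPartition.entrySet()) {
      List<List<String>> replicaGroups = replicaGroupsByPartition.get(entry.getKey());
      Map<String, String> partitionAssignment = null;
      for (int i = 0; i < replicaGroups.size() && partitionAssignment == null; i++) {
        int replicaGroupId = (preferredReplicaGroup + i) % replicaGroups.size();
        partitionAssignment = tryReplicaGroup(
            replicaGroups.get(replicaGroupId), entry.getValue(), servableSegmentsByInstance);
      }
      if (partitionAssignment == null) {
        throw new IllegalStateException(
            "No replica group can serve all segments of partition " + entry.getKey());
      }
      segmentToInstance.putAll(partitionAssignment);
    }
    return segmentToInstance;
  }

  /** Returns a segment -> instance map, or null if this replica group misses any segment. */
  static Map<String, String> tryReplicaGroup(
      List<String> instances, Set<String> segments, Map<String, Set<String>> servable) {
    Map<String, String> result = new HashMap<>();
    for (String segment : segments) {
      String chosen = null;
      for (String instance : instances) {
        if (servable.getOrDefault(instance, Set.of()).contains(segment)) {
          chosen = instance;
          break;
        }
      }
      if (chosen == null) {
        return null; // this replica group cannot fully serve the partition
      }
      result.put(segment, chosen);
    }
    return result;
  }

  public static void main(String[] args) {
    // Two partitions, two replica groups each. No single replica group serves both segments.
    Map<Integer, List<List<String>>> replicaGroups = Map.of(
        0, List.of(List.of("s0"), List.of("s2")),
        1, List.of(List.of("s1"), List.of("s3")));
    Map<Integer, Set<String>> segments = Map.of(0, Set.of("segA"), 1, Set.of("segB"));
    Map<String, Set<String>> servable = Map.of(
        "s0", Set.of("segA"), "s1", Set.of(), "s2", Set.of(), "s3", Set.of("segB"));
    System.out.println(assign(replicaGroups, segments, servable, 0));
  }
}
```

In the `main` example, replica group 0 cannot serve `segB` and replica group 1 cannot serve `segA`, so a selector restricted to one replica group for the whole query would fail. The per-partition sketch instead picks `s0` for partition 0 and `s3` for partition 1.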