ankitsultana opened a new pull request, #15743: URL: https://github.com/apache/pinot/pull/15743
# Summary

**Goal:** Prototype the MSE Lite Mode to enable testing in our clusters so we can make informed decisions about its semantics and overall guarantees (e.g. what is the most intuitive way to add a limit to the leaf stage that is least confusing and maximally transparent to users).

Prototypes the Multistage Engine Lite Mode as described in #14640. The feature is hidden behind a flag and can only be enabled if the `usePhysicalOptimizer=true` query option is also set. It builds on top of the existing Physical Optimizer changes and leverages many of the optimizations that optimizer provides.

# Specifics

## Worker Assignment

Workers and Exchanges are assigned using a new rule. The generic Worker/Exchange assignment rule handles the general case, whereas Lite Mode needs some custom logic, such as picking a random server instance out of the ones assigned to the leaf stage.

The assignment is quite simple: the leaf stage assignment is done by the existing `LeafStageWorkerAssignmentRule`, and the new Lite Mode assignment rule samples a server instance from the leaf stage workers and uses it for all plan nodes except the leaf stage.

## Integration with Sort/Aggregate Pushdown

Both of these rules are still run, since there are scenarios where we would like to push down the Sort or Aggregate to the leaf. For example, given a query like `SELECT col1, COUNT(*) FROM tbl GROUP BY col1`, the aggregate should be pushed down to the leaf stage in most cases, because that reduces the amount of data sent out of the leaf stage.

## Sort Insert Rule

If a limit doesn't already exist in the leaf stage, we add one. Currently I am using a hardcoded value, but in the future we'll make it configurable.

# Semantics

None of the semantics in this PR are final; they are subject to broader community review. Based on testing from this PR, I'll file a PEP to describe the semantics in detail.

# Test Plan

Added unit tests. We are also testing this out in our clusters.
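
# Illustrative Sketch

To make the Worker Assignment and Sort Insert Rule sections above easier to follow, here is a minimal, self-contained Java sketch of the two ideas. It is not the code in this PR: `LiteModeSketch`, `PlanNode`, `assignWorkers`, `ensureLeafLimit`, and the `DEFAULT_LEAF_LIMIT` value are all hypothetical names and placeholders invented for illustration, and the real rules operate on the optimizer's own plan node types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch only; none of these types are actual Pinot classes.
final class LiteModeSketch {
  // Placeholder for the hardcoded leaf-stage limit; the real value is an assumption here.
  static final int DEFAULT_LEAF_LIMIT = 1000;

  /** Simplified plan node: an id, a leaf flag, and an optional limit (null when absent). */
  static final class PlanNode {
    final String id;
    final boolean leaf;
    Integer limit;

    PlanNode(String id, boolean leaf, Integer limit) {
      this.id = id;
      this.leaf = leaf;
      this.limit = limit;
    }
  }

  /**
   * Leaf nodes keep the workers already chosen for the leaf stage; every other
   * node runs on a single server sampled at random from those leaf workers.
   */
  static Map<String, List<String>> assignWorkers(
      List<PlanNode> nodes, List<String> leafWorkers, Random random) {
    if (leafWorkers.isEmpty()) {
      throw new IllegalArgumentException("leaf stage has no workers");
    }
    String sampled = leafWorkers.get(random.nextInt(leafWorkers.size()));
    Map<String, List<String>> assignment = new LinkedHashMap<>();
    for (PlanNode node : nodes) {
      assignment.put(
          node.id,
          node.leaf ? new ArrayList<>(leafWorkers) : Collections.singletonList(sampled));
    }
    return assignment;
  }

  /** Adds the default limit to leaf nodes that have none; existing limits are left untouched. */
  static void ensureLeafLimit(List<PlanNode> nodes) {
    for (PlanNode node : nodes) {
      if (node.leaf && node.limit == null) {
        node.limit = DEFAULT_LEAF_LIMIT;
      }
    }
  }
}
```

Passing the `Random` in explicitly is a choice made for this sketch so that the sampling can be seeded in tests; how the actual rule picks its instance may differ.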