ankitsultana opened a new pull request, #15698: URL: https://github.com/apache/pinot/pull/15698
# Summary

Completes the first E2E working version of the new Physical Optimizer. Quite a bit of refactoring and feature porting still needs to be done to make this generally usable, and I am hoping to spend May doing that.

## Maintaining Request Id in Context

I feel the current QueryEnvironment is quite bloated, and I know several folks have raised this as well. For now I have added the requestId to `QueryEnvironment#Config`. This eliminates passing the requestId across method calls like `toDispatchableSubPlan`, so I think it is still a relatively clean approach overall.

## Determining When Physical Optimizer is Enabled

I have added a new, *temporary* query option: `usePhysicalOptimizer=true/false`. It defaults to false. When the physical optimizer is used, we need to skip certain sections of the HepProgram, and I have made those changes accordingly.

## Plan Fragmenter and Mailbox Assignment

This does the following:

1. Converts the PRelNode tree to a PlanNode tree.
2. Creates plan fragments.
3. Assigns mailboxes and sets them in the Dispatchable Plan Metadata.

This is not unit tested right now, and I am working on a follow-up PR to add unit tests for this and some other parts of the code.

## PinotLogicalQueryPlanner changes

I think we really need to clean up some of these classes. For example, PinotLogicalQueryPlanner creates the dispatchable plan, which is misleading. I hope to do this incrementally. First, I'll backport the missing features from the existing MSE Optimizer, then add sufficient testing to make sure the new optimizer can become the main optimizer, and then clean up the rest of the optimizer structure and remove the old optimizer code. My hope is to wrap all of this up in H1.

# Test Plan

## Unit Tests

I have added unit test cases with decent coverage. Check the JSON plan output file.
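For anyone trying this out, here is a minimal sketch of how the temporary `usePhysicalOptimizer` option could be attached to a single query. It assumes the option is accepted through Pinot's usual `SET key=value;` query-option prefix and that the query is already routed to the multi-stage engine. The class name, helper method, and table name are hypothetical and not part of this PR.

```java
public class PhysicalOptimizerOptionExample {
    // Option name taken from the PR description; it is temporary and may change.
    public static final String OPTION_NAME = "usePhysicalOptimizer";

    /**
     * Prepends a SET statement so the option applies to this query only.
     * Assumption: the option is read like any other Pinot query option.
     */
    public static String withPhysicalOptimizer(String sql, boolean enabled) {
        return "SET " + OPTION_NAME + "=" + enabled + "; " + sql;
    }

    public static void main(String[] args) {
        String sql = "SELECT COUNT(*) FROM myTable"; // hypothetical table name
        // prints: SET usePhysicalOptimizer=true; SELECT COUNT(*) FROM myTable
        System.out.println(withPhysicalOptimizer(sql, true));
    }
}
```

Leaving the prefix off should keep the existing optimizer in use, since the option is described as defaulting to false.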
## Cluster Testing

We have tested this in our cluster, and even for some simple query shapes on small amounts of data the perf improvement is around 2x, but that may be solely because we don't yet support semi-join dynamic filters in the Physical Optimizer. We'll be running more benchmarks this month to compare the old and the new optimizer on one of our bigger clusters.
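To make the "Plan Fragmenter and Mailbox Assignment" steps in the description more concrete, here is a toy sketch of steps 2 and 3: cutting a plan tree into fragments at exchange boundaries and recording one mailbox per cut. It is not the code in this PR. `Node`, `Result`, and the `"sender->receiver"` mailbox string are hypothetical stand-ins, and the PRelNode to PlanNode conversion (step 1) is skipped entirely.

```java
import java.util.ArrayList;
import java.util.List;

public class FragmenterSketch {
    /** Hypothetical stand-in for a plan node; not Pinot's PlanNode. */
    public static class Node {
        final String name;
        final boolean exchange; // true if this node marks a shuffle boundary
        final List<Node> inputs = new ArrayList<>();

        public Node(String name, boolean exchange, Node... inputs) {
            this.name = name;
            this.exchange = exchange;
            for (Node n : inputs) {
                this.inputs.add(n);
            }
        }
    }

    /** Fragments (as lists of node names) plus the mailboxes linking them. */
    public static class Result {
        public final List<List<String>> fragments = new ArrayList<>();
        public final List<String> mailboxes = new ArrayList<>();
    }

    public static Result fragment(Node root) {
        Result r = new Result();
        r.fragments.add(new ArrayList<>()); // fragment 0 holds the root
        visit(root, 0, r);
        return r;
    }

    private static void visit(Node node, int fragmentId, Result r) {
        if (node.exchange) {
            // Each input below an exchange starts a new fragment, and a
            // mailbox is recorded from that fragment to the current one.
            for (Node input : node.inputs) {
                int childId = r.fragments.size();
                r.fragments.add(new ArrayList<>());
                r.mailboxes.add(childId + "->" + fragmentId);
                visit(input, childId, r);
            }
            return;
        }
        r.fragments.get(fragmentId).add(node.name);
        for (Node input : node.inputs) {
            visit(input, fragmentId, r);
        }
    }
}
```

For a plan shaped like `Aggregate(Join(Exchange(ScanA), Exchange(ScanB)))`, this yields three fragments (`[Aggregate, Join]`, `[ScanA]`, `[ScanB]`) and two mailboxes (`1->0`, `2->0`). The real implementation also has to assign workers and populate the Dispatchable Plan Metadata, which this sketch does not attempt.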