ankitsultana opened a new pull request, #15698:
URL: https://github.com/apache/pinot/pull/15698

   # Summary
   
   Completes the first E2E working version of the new Physical Optimizer. 
There's quite a bit of refactoring and feature porting that still needs to be 
done in order to make this usable generally, and I am hoping to spend May doing 
that.
   
   ## Maintaining Request Id in Context
   
   I feel the current QueryEnvironment is quite bloated, and I know several 
folks have raised this as well. For now I have added the requestId to 
`QueryEnvironment#Config`. This eliminates passing the requestId across method 
calls like `toDispatchableSubPlan`, so I think it is still a relatively clean 
approach overall.
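   
   To illustrate the idea, here is a minimal sketch of carrying the request id 
in an immutable config object instead of threading it through method 
parameters. `PlannerConfig` and `SketchPlanner` are hypothetical stand-ins, not 
the actual `QueryEnvironment` classes, and the string returned here only marks 
where a real dispatchable plan would be built.

```java
import java.util.Objects;

// Hypothetical stand-in for a planner config; NOT Pinot's QueryEnvironment.Config.
final class PlannerConfig {
  private final long requestId;
  private final String database;

  PlannerConfig(long requestId, String database) {
    this.requestId = requestId;
    this.database = Objects.requireNonNull(database, "database");
  }

  long getRequestId() {
    return requestId;
  }

  String getDatabase() {
    return database;
  }
}

// Hypothetical planner: the request id comes from the config it was built with.
final class SketchPlanner {
  private final PlannerConfig config;

  SketchPlanner(PlannerConfig config) {
    this.config = Objects.requireNonNull(config, "config");
  }

  // No requestId parameter: callers cannot pass a mismatched id by accident.
  String toDispatchableSubPlan(String logicalPlan) {
    return "request=" + config.getRequestId() + " plan=" + logicalPlan;
  }
}
```

   The trade-off in this sketch is that one planner instance is tied to one 
request, which matches a config that is created per query.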
   
   ## Determining When Physical Optimizer is Enabled
   
   I have added a new *temporary* query option: 
`usePhysicalOptimizer=true/false`. It defaults to false. When the physical 
optimizer is used, we need to skip certain sections of the HepProgram, and I 
have made those changes accordingly.
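   
   As a rough sketch of the default-to-false behavior, the check could look 
like the following. `PhysicalOptimizerOption` and `isEnabled` are hypothetical 
names, and the assumption that options arrive as a `Map<String, String>` is 
mine; only the option key comes from this PR.

```java
import java.util.Map;

// Hypothetical helper; only the option key is taken from the PR description.
final class PhysicalOptimizerOption {
  static final String KEY = "usePhysicalOptimizer";

  private PhysicalOptimizerOption() {
  }

  // Returns true only when the option is present and equals "true"
  // (case-insensitive). A missing option yields false, because
  // Boolean.parseBoolean(null) returns false.
  static boolean isEnabled(Map<String, String> queryOptions) {
    return Boolean.parseBoolean(queryOptions.get(KEY));
  }
}
```

   Assuming the option is supplied like other Pinot query options, a query 
would opt in with `SET usePhysicalOptimizer = true;` ahead of the `SELECT`.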
   
   ## Plan Fragmenter and Mailbox Assignment
   
   This does the following:
   
   1. Converts the PRelNode tree to a PlanNode tree.
   2. Creates plan fragments.
   3. Assigns mailboxes and sets them in the Dispatchable Plan Metadata.
   
   This is not unit tested right now and I am working on a follow-up PR to add 
unit tests for this and some other parts of the code.
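   
   For readers unfamiliar with the fragmenting and mailbox steps listed above, 
here is a toy sketch of them (it skips the PRelNode to PlanNode conversion). It 
assumes that every exchange node marks a fragment boundary and that a mailbox 
can be described by a sender and receiver fragment id. `SketchNode`, 
`SketchFragmenter`, and the `"sender->receiver"` string format are all 
hypothetical and are not the classes or ids used in this PR.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical plan node: a name, an "is exchange" flag, and input nodes.
final class SketchNode {
  final String name;
  final boolean exchange;
  final List<SketchNode> inputs;

  SketchNode(String name, boolean exchange, SketchNode... inputs) {
    this.name = name;
    this.exchange = exchange;
    this.inputs = List.of(inputs);
  }
}

// Hypothetical fragmenter: cuts the tree at exchanges and records mailboxes.
final class SketchFragmenter {
  // fragments.get(i) lists the operator names placed in fragment i.
  final List<List<String>> fragments = new ArrayList<>();
  // Each entry is "senderFragmentId->receiverFragmentId".
  final List<String> mailboxes = new ArrayList<>();

  void fragment(SketchNode root) {
    visit(root, newFragment());
  }

  private int newFragment() {
    fragments.add(new ArrayList<>());
    return fragments.size() - 1;
  }

  private void visit(SketchNode node, int fragmentId) {
    if (node.exchange) {
      // Fragment boundary: each input runs in a new fragment that sends
      // its output to the current (receiving) fragment.
      for (SketchNode input : node.inputs) {
        int senderId = newFragment();
        mailboxes.add(senderId + "->" + fragmentId);
        visit(input, senderId);
      }
      return;
    }
    // Non-exchange nodes stay in the current fragment.
    fragments.get(fragmentId).add(node.name);
    for (SketchNode input : node.inputs) {
      visit(input, fragmentId);
    }
  }
}
```

   In this toy model, a join over two exchanged scans yields three fragments: 
the join in fragment 0 and each scan in its own fragment, with one mailbox per 
exchange pointing back to fragment 0.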
   
   ## PinotLogicalQueryPlanner changes
   
   I think we really need to clean up some of these classes. For example, 
PinotLogicalQueryPlanner creates the dispatchable plan, which is misleading.
   
   I hope to do this incrementally. First, I'll backport the missing features 
from the existing MSE Optimizer, then work on adding sufficient testing to make 
sure the new optimizer can be made the main optimizer, and then work on 
cleaning up the rest of the optimizer structure and removing the old optimizer 
code.
   
   My hope is to wrap all of this up in H1.
   
   # Test Plan
   
   ## Unit Tests
   
   I have added unit test cases with decent coverage. See the JSON plan 
output file.
   
   ## Cluster Testing
   
   We have tested this in our cluster, and even for some simple query shapes 
on a small amount of data the perf improvement is around 2x, though that may be 
solely because we don't yet support semi-join dynamic filters in the Physical 
Optimizer. We'll be running more benchmarks this month to compare the old and 
the new optimizer on one of our bigger clusters.

