yx-keith opened a new issue, #34656:
URL: https://github.com/apache/doris/issues/34656

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Description
   
   In a multi-table join, the result of each intermediate join is used as the 
input to the subsequent joins.
   Statistics are collected periodically. If data is updated between two 
collections, the statistics are stale until the next collection runs. A 
multi-table join query executed in that window may get a poor plan, because 
the optimizer has no accurate statistics to work with.
   
   For example:
   select * from example_tbl t1 join example_tbl02 t2 on t1.city=t2.city and 
t1.city="成都" join example_tbl03 t3 on t1.city=t3.city;
   
   **This is the plan:**
   
   
![image](https://github.com/apache/doris/assets/22794058/0b37b58c-3055-46cc-afd7-403d400e5af7)
   
   In this case, example_tbl02 is broadcast to the other nodes for the join. 
In reality, example_tbl02 contains a very large number of rows where city is 
'成都', which may cause an OOM during broadcast distribution.
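
   A possible workaround until the optimizer handles stale statistics, 
sketched under assumptions: the statements below assume a Doris version that 
supports manual `ANALYZE TABLE ... WITH SYNC` and the `[shuffle]` join hint, 
and they reuse the table names from the query above. They have not been run 
against the cluster in this report, so treat them as an illustration rather 
than a verified fix.

   ```sql
   -- Assumption: manual, synchronous statistics collection is available.
   -- Refresh statistics for the table whose row count changed after the load.
   ANALYZE TABLE example_tbl02 WITH SYNC;

   -- Inspect what the optimizer now sees for the join column.
   SHOW COLUMN STATS example_tbl02 (city);

   -- Assumption: the [shuffle] join hint is honored by the planner in use.
   -- Ask for a shuffle (partitioned) join instead of broadcasting example_tbl02.
   EXPLAIN
   SELECT *
   FROM example_tbl t1
   JOIN [shuffle] example_tbl02 t2
     ON t1.city = t2.city AND t1.city = "成都"
   JOIN example_tbl03 t3
     ON t1.city = t3.city;
   ```

   If the hint takes effect, the `EXPLAIN` output should no longer show a 
broadcast distribution for example_tbl02. This only sidesteps the problem for 
one query; the issue itself is about the optimizer's choice when statistics 
lag behind the data.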
   
   
   ### Solution
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


