qzsee opened a new pull request, #28302:
URL: https://github.com/apache/doris/pull/28302

   ## Proposed changes
   
   Issue Number: close #xxx
   
   <!--Describe your changes.-->
   The current colocate group has the following problems:
   
   1. If a BE is unrecoverable or is being decommissioned, or if a tablet is 
   faulty and slow to repair, a very large group stays in an unstable state for 
   a long time, and colocate join is unavailable for that whole period. The 
   underlying problem is that the control granularity of colocate balance is 
   too coarse: it is not reasonable to mark the whole group unstable as soon as 
   a single tablet in the group is unavailable.
   
   2. Colocate balance generates a large number of replica repair tasks, which 
   interfere with other, normal repair tasks.
   
   Based on the above problems, this PR makes the following optimizations:
   1. If any BE is unavailable, immediately replace all the BE nodes in the 
   unavailable buckets. If some BE nodes are being decommissioned, replace them 
   one by one. At query time, take the intersection of the query locations, so 
   that join performance is degraded as little as possible.
   2. Apply rate limiting to colocate tablet repair.
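
The "intersection of query locations" step in item 1 can be illustrated with a minimal sketch. This is one reading of the description, not code from the PR: the class `ColocateLocationSketch` and the method `intersectLocations` are hypothetical names, and backends are represented as plain `Long` ids. For a single bucket, each table in the group contributes the set of backends that hold a replica of its tablet, and only backends present in every set can serve the colocate join for that bucket.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; names are not taken from the Doris code base.
public class ColocateLocationSketch {

    /**
     * Returns the backend ids that appear in every tablet's replica set.
     * Each element of locationsPerTablet is the set of backends holding a
     * replica of one table's tablet for the same bucket.
     */
    static Set<Long> intersectLocations(List<Set<Long>> locationsPerTablet) {
        if (locationsPerTablet.isEmpty()) {
            return new TreeSet<>();
        }
        // Start from the first tablet's backends and narrow down.
        Set<Long> common = new TreeSet<>(locationsPerTablet.get(0));
        for (Set<Long> locations : locationsPerTablet.subList(1, locationsPerTablet.size())) {
            common.retainAll(locations);
        }
        return common;
    }

    public static void main(String[] args) {
        // Table A's tablet still lists backend 1; table B's tablet has
        // already had backend 1 replaced by backend 4.
        Set<Long> tableA = Set.of(1L, 2L, 3L);
        Set<Long> tableB = Set.of(2L, 3L, 4L);
        System.out.println(intersectLocations(List.of(tableA, tableB))); // prints [2, 3]
    }
}
```

Under this reading, an empty result for a bucket would mean that no single backend holds all of the bucket's tablets, so the colocate join could not be used for it; how the PR handles that case is not stated in the description.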
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at 
[d...@doris.apache.org](mailto:d...@doris.apache.org) by explaining why you 
chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
