weizuo93 opened a new issue #4329:
URL: https://github.com/apache/incubator-doris/issues/4329


   When a tablet is created, a disk must be selected from all disks on the BE 
node that meet the requirements for storing the tablet. In Doris, the current 
disk selection strategy picks a disk at random from all disks that meet the 
requirements. After the cluster had been running for a long time, we found 
that the number of tablets on the different disks of a BE node was severely 
unbalanced. 
   
   After analysis, we think the main reason for this problem is that the random 
disk selection strategy used in Doris does not take load balance between the 
disks on a BE node into account. A good disk selection strategy should consider 
both the randomness of selection and the disk load. To solve this problem, we 
introduced the "two random choices" algorithm for disk selection when creating 
a tablet: 
   (1) Randomly select two disks from all disks on the BE node that meet the 
requirements;
   (2) Of the two disks selected in (1), choose the one with fewer tablets for 
tablet creation.
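
   The two steps above can be sketched as follows. This is only an 
illustration in Python, not the BE implementation: the function name 
`pick_disk`, the list-of-counts representation, and the choice to sample two 
distinct disks are assumptions made for the example.

```python
import random


def pick_disk(tablet_counts, rng=random):
    """Return the index of the disk chosen by "two random choices".

    tablet_counts[i] is the number of tablets on eligible disk i
    (a hypothetical representation, not a Doris data structure).
    """
    if not tablet_counts:
        raise ValueError("no eligible disks")
    if len(tablet_counts) == 1:
        return 0
    # Step (1): randomly select two eligible disks (assumed distinct here).
    first, second = rng.sample(range(len(tablet_counts)), 2)
    # Step (2): keep the one that currently holds fewer tablets.
    return first if tablet_counts[first] <= tablet_counts[second] else second


counts = [12, 7, 9, 30]
chosen = pick_disk(counts)
counts[chosen] += 1  # the new tablet is created on the chosen disk
```

   With distinct sampling, a disk that holds strictly more tablets than every 
other disk is never chosen, because it always loses the comparison in step (2).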
   
   When the initial distribution of tablets across the disks is balanced, the 
simulation experiments show that, with the "two random choices" disk selection 
algorithm, the distribution of tablets across the disks of a BE node stays 
relatively balanced as tablets are created. (The top figure shows the trend of 
the range, i.e. the largest per-disk tablet count minus the smallest, as 
tablets are created. The bottom figure shows the trend of the standard 
deviation of the per-disk tablet counts as tablets are created. In both 
figures, the red line is the random disk selection strategy currently used in 
Doris and the blue line is the "two random choices" disk selection strategy we 
introduced.)
   
![figure_1](https://user-images.githubusercontent.com/68884553/89902245-cd4e2d80-dc18-11ea-8fc7-d1b1390bda85.jpeg)
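
   An experiment of this kind can be reproduced roughly with the sketch below. 
It is not the simulation used for the figures: the disk count, tablet count, 
seed, and the names `simulate` and `spread` are all assumptions chosen for 
illustration, and the balanced initial state is modelled as every disk starting 
with zero tablets.

```python
import random
import statistics


def simulate(strategy, num_disks, num_tablets, seed):
    """Create num_tablets tablets and return the per-disk tablet counts."""
    rng = random.Random(seed)
    counts = [0] * num_disks  # balanced initial state
    for _ in range(num_tablets):
        if strategy == "random":
            # Plain random selection among the disks.
            disk = rng.randrange(num_disks)
        else:
            # "Two random choices": sample two disks, keep the emptier one.
            a, b = rng.sample(range(num_disks), 2)
            disk = a if counts[a] <= counts[b] else b
        counts[disk] += 1
    return counts


def spread(counts):
    """Return (range, standard deviation) of the per-disk tablet counts."""
    return max(counts) - min(counts), statistics.pstdev(counts)


random_counts = simulate("random", num_disks=10, num_tablets=10000, seed=1)
two_choice_counts = simulate("two_choices", num_disks=10, num_tablets=10000, seed=1)

print("random      range/stddev:", spread(random_counts))
print("two choices range/stddev:", spread(two_choice_counts))
```

   The two printed tuples correspond to the quantities plotted in the top and 
bottom figures, measured once at the end of the run rather than as a trend.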
   
   When the initial distribution of tablets across the disks is unbalanced, 
the simulation experiments show that the "two random choices" disk selection 
algorithm also gradually balances the load of the disks as tablets are 
created. (The top figure shows the trend of the range of the per-disk tablet 
counts as tablets are created. The bottom figure shows the trend of the 
standard deviation of the per-disk tablet counts as tablets are created. In 
both figures, the red line is the random disk selection strategy currently 
used in Doris and the blue line is the "two random choices" disk selection 
strategy we introduced.)
   
![figure_2](https://user-images.githubusercontent.com/68884553/89902275-d8a15900-dc18-11ea-8344-fa24e4efe803.jpeg)
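
   The rebalancing effect can be illustrated with a small sketch that starts 
from a skewed state and records the range as tablets are created. The initial 
counts, the number of tablets, the sampling interval, the seed, and the name 
`range_trend` are assumptions for the example, not the parameters behind the 
figures.

```python
import random


def range_trend(initial_counts, num_tablets, sample_every, seed):
    """Apply "two random choices" and record the range at regular intervals."""
    rng = random.Random(seed)
    counts = list(initial_counts)
    trend = [max(counts) - min(counts)]
    for created in range(1, num_tablets + 1):
        # Sample two distinct disks and place the tablet on the emptier one.
        a, b = rng.sample(range(len(counts)), 2)
        disk = a if counts[a] <= counts[b] else b
        counts[disk] += 1
        if created % sample_every == 0:
            trend.append(max(counts) - min(counts))
    return counts, trend


# One heavily loaded disk and three empty ones (hypothetical starting point).
initial = [1000, 0, 0, 0]
final_counts, trend = range_trend(initial, num_tablets=600, sample_every=100, seed=7)

print("final counts:", final_counts)
print("range trend: ", trend)
```

   In this sketch the overloaded disk receives no new tablets until the other 
disks catch up, since it loses every comparison, so the recorded range can only 
shrink or stay the same over the run.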
   

