[ 
https://issues.apache.org/jira/browse/HBASE-29308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17950819#comment-17950819
 ] 

Kadir Ozdemir commented on HBASE-29308:
---------------------------------------

[~zhangduo], [~vjasani], after an internal discussion and positive feedback 
from [~apurtell] and [~dmanning], I decided to create this Jira. Please let me 
know your thoughts on this.

> Reducing region unavailability during region movement
> -----------------------------------------------------
>
>                 Key: HBASE-29308
>                 URL: https://issues.apache.org/jira/browse/HBASE-29308
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kadir Ozdemir
>            Priority: Major
>
> Region movement is the process of transferring a region from one RegionServer 
> to another: the region is closed on the source RegionServer and then opened 
> on the target RegionServer. In the current design, the region is unavailable 
> for the whole period of closing it on the source RegionServer and then 
> opening it on the target RegionServer.
> The main operations during region close include flushing MemStore, waiting 
> for in-progress operations to complete (by acquiring the region operation 
> lock exclusively), removing compacting files, and evicting the blocks in the 
> block cache for the stores of the region. The operations for opening a region 
> include reading the region info file, checking if there are any WAL files to 
> replay, opening store files and reading metadata and possibly bloom filters. 
> It is clear that executing these steps sequentially can take some time and 
> prolong the region's unavailability.
> Most of the above operations can be done outside (before or after) the 
> region’s unavailability window. As described below, we actually need to 
> include only flushing MemStore on the source RegionServer, and then loading 
> the store files generated during this MemStore flush on the target 
> RegionServer in the unavailability window. 
> The region unavailability time can be reduced by introducing two new region 
> states, WARMING and MOVING, as follows:
>  # A new copy of the region is opened on the target RegionServer. This copy 
> is not yet visible to HMaster or clients. The region is set to the WARMING 
> state, in which it is not ready to serve reads or writes. The WARMING state 
> is an in-memory state and is not recorded in the meta table. WARMING regions 
> need to be cleaned up if the region move operation fails. If a region remains 
> in the WARMING state longer than a specified timeout period, this cleanup can 
> be executed locally on the target RegionServer after the timeout.
>  # The next step is to put the region on the source RegionServer in the 
> MOVING state. This will trigger MemStore flushing. In the MOVING state, the 
> region will not accept new (read or write) operations but will continue 
> serving in-progress read operations (gets and scans). Please note that these 
> in-progress reads are allowed as part of snapshot isolation. This is 
> essentially the initial part of the region CLOSING state, where the MemStore 
> is flushed.
>  # When the region completes MemStore flushing, the target region is notified 
> that new HFiles have been created for the region. The target region loads 
> these files, meaning that it opens them and reads their metadata. Then the 
> region state (for the region on the target RegionServer) will change to OPEN, 
> its location info will be updated with the target RegionServer in the meta 
> table, and the HMaster node will be notified about this change. Thus, the 
> region on the target RegionServer will become visible to the clients.
>  # Finally, the region on the source RegionServer will be closed.
> With this design, the region will be unavailable for new operations only for 
> the period of flushing MemStore, loading store files generated by MemStore 
> flushes, updating the meta table, and notifying HMaster. 
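
The four steps in the quoted description can be sketched as a small state machine. This is only an illustration of the proposed ordering, not HBase code: RegionMoveSketch, RegionCopy, move, warmingExpired, and the in-memory map standing in for the meta table are all hypothetical names, and the MemStore flush is reduced to appending one file name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types or methods exist in HBase.
final class RegionMoveSketch {

    // WARMING and MOVING are the proposed states; OPEN and CLOSED are
    // simplified stand-ins for the existing region states.
    enum State { OPEN, WARMING, MOVING, CLOSED }

    // One copy of a region as hosted by a single RegionServer.
    static final class RegionCopy {
        final String server;
        State state;
        final List<String> storeFiles = new ArrayList<>();

        RegionCopy(String server, State state) {
            this.server = server;
            this.state = state;
        }

        // Only an OPEN copy accepts new read or write operations.
        boolean acceptsNewOperations() {
            return state == State.OPEN;
        }

        // Stand-in for a MemStore flush: records and returns one new store file.
        String flushMemStore() {
            String hfile = "hfile-" + (storeFiles.size() + 1);
            storeFiles.add(hfile);
            return hfile;
        }
    }

    // Runs the four proposed steps in order; 'meta' stands in for the meta table.
    static RegionCopy move(String region, RegionCopy source, String targetServer,
                           Map<String, String> meta) {
        // Step 1: open a copy on the target in WARMING. It is not written to
        // meta, so clients still see the source. Existing store files are
        // opened here, before the unavailability window starts.
        RegionCopy target = new RegionCopy(targetServer, State.WARMING);
        target.storeFiles.addAll(source.storeFiles);

        // Step 2: the source enters MOVING and flushes its MemStore.
        // New operations are rejected from this point on.
        source.state = State.MOVING;
        String flushed = source.flushMemStore();

        // Step 3: the target loads the newly flushed file, becomes OPEN, and
        // the region location in meta is switched to the target.
        target.storeFiles.add(flushed);
        target.state = State.OPEN;
        meta.put(region, targetServer);

        // Step 4: the source copy is closed outside the unavailability window.
        source.state = State.CLOSED;
        return target;
    }

    // Local cleanup check on the target: true when a WARMING copy has
    // outlived the timeout and can be discarded.
    static boolean warmingExpired(RegionCopy copy, long warmingSinceMillis,
                                  long nowMillis, long timeoutMillis) {
        return copy.state == State.WARMING
            && nowMillis - warmingSinceMillis > timeoutMillis;
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("region-a", "rs1");
        RegionCopy source = new RegionCopy("rs1", State.OPEN);
        source.storeFiles.add("hfile-1");

        RegionCopy target = move("region-a", source, "rs2", meta);
        System.out.println(meta.get("region-a") + " " + target.state
            + " " + target.storeFiles + " source=" + source.state);
    }
}
```

In this sketch only steps 2 and 3 fall between the moment the source stops accepting new operations and the moment the target becomes OPEN, which mirrors the window the description aims for. Failure handling, in-progress scans on the source, and HMaster notification are left out.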



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
