Copilot commented on code in PR #8264:
URL: https://github.com/apache/hbase/pull/8264#discussion_r3290030910


##########
hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/AssignmentTestingUtil.java:
##########
@@ -49,12 +49,17 @@ private AssignmentTestingUtil() {
   }
 
   public static void waitForRegionToBeInTransition(final HBaseTestingUtil util,
-    final RegionInfo hri) throws Exception {
-    while (!getMaster(util).getAssignmentManager().isRegionInTransition(hri)) {
+    final RegionInfo hri) {
+    while (!isRegionInTransition(hri, getMaster(util).getAssignmentManager())) 
{
       Threads.sleep(10);

Review Comment:
   waitForRegionToBeInTransition has an unbounded busy-wait loop. If the region 
never enters RIT (or the master is unavailable), this will hang the test suite 
indefinitely. Please switch to a bounded wait (e.g., 
HBaseTestingUtil.waitFor/Waiter with a timeout and failure message) so tests 
fail fast instead of hanging.
   



##########
hbase-server/src/test/java/org/apache/hadoop/hbase/master/assignment/TestRegionSplit.java:
##########
@@ -155,6 +155,54 @@ public void testSplitTableRegion() throws Exception {
       regionInfoMap.get(tableRegions.get(1).getRegionInfo()));
   }
 
+  @Test
+  public void testRITWithSplitTableRegion() throws Exception {
+    final TableName tableName = TableName.valueOf(testMethodName);
+    final ProcedureExecutor<MasterProcedureEnv> procExec = 
getMasterProcedureExecutor();
+
+    RegionInfo[] regions =
+      MasterProcedureTestingUtility.createTable(procExec, tableName, null, 
columnFamilyName);
+    insertData(UTIL, tableName, rowCount, startRowNum, columnFamilyName);
+    int splitRowNum = startRowNum + rowCount / 2;
+    byte[] splitKey = Bytes.toBytes("" + splitRowNum);
+
+    assertNotNull(regions, "not able to find a splittable region");
+    assertEquals(1, regions.length, "not able to find a splittable region");
+    assertEquals(0,
+      
UTIL.getHBaseCluster().getMaster().getAssignmentManager().getRegionsInTransitionCount());
+
+    ServerName targetRS = 
UTIL.getHBaseCluster().getMaster().getAssignmentManager()
+      .getRegionStates().getRegionServerOfRegion(regions[0]);
+    // Split region of the table
+    long procId = procExec.submitProcedure(
+      new SplitTableRegionProcedure(procExec.getEnvironment(), regions[0], 
splitKey));
+    // Wait the completion
+    ProcedureTestingUtility.waitProcedure(procExec, procId);
+    ProcedureTestingUtility.assertProcNotFailed(procExec, procId);
+
+    assertEquals(2, UTIL.getHBaseCluster().getRegions(tableName).size(), "not 
able to split table");
+    assertEquals(0,
+      
UTIL.getHBaseCluster().getMaster().getAssignmentManager().getRegionsInTransitionCount());
+    // As there are only 3 RS, start one more RS before expiring one
+    UTIL.getHBaseCluster().startRegionServer();
+
+    // stop RS holding split parent
+    
UTIL.getHBaseCluster().getMaster().getServerManager().expireServer(targetRS);
+
+    // stop master
+    UTIL.getHBaseCluster().stopMaster(0);
+    UTIL.getHBaseCluster().waitOnMaster(0);
+    Thread.sleep(500);

Review Comment:
   In testRITWithSplitTableRegion, the master is stopped immediately after 
expireServer(targetRS). expireServer only submits the ServerCrashProcedure; it 
runs asynchronously, so stopping the master right away can race and make this 
test non-deterministically miss the buggy path (or become flaky). Capture the 
returned pid and wait until the SCP is at least running/past INITIALIZING (or 
another deterministic signal that the crash handling has started) before 
stopping the master.



##########
hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionInTransitionTracker.java:
##########
@@ -41,22 +41,20 @@ public class RegionInTransitionTracker {
 
   private final List<RegionState.State> ENABLE_TABLE_REGION_STATE = 
List.of(RegionState.State.OPEN);
 
+  // DO NOT USE containsKey() on regionInTransition map as 
MutableRegionInfo.COMPARATOR considers
+  // offline/split flags.

Review Comment:
   The new comment is misleading: this map is created with 
RegionInfo.COMPARATOR (not MutableRegionInfo.COMPARATOR), and 
RegionInfo.COMPARATOR compares the offline flag (it does not consider the split 
flag). Please update the comment to accurately describe why key lookups with 
containsKey/remove can fail when the RegionInfo offline flag differs between 
instances.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to