virajjasani opened a new pull request, #6462:
URL: https://github.com/apache/hbase/pull/6462
Jira: HBASE-28638
Master-initiated remote procedures are scheduled by RSProcedureDispatcher.
If it encounters specific errors on the first attempt (e.g. CallQueueTooBigException
or SaslException), the remote call is guaranteed not to have reached the
regionserver, so the call is marked as failed, prompting the parent
procedure to select a different target regionserver to resume the operation.
If the first attempt is successful, RSProcedureDispatcher continues with
infinite retries. We can then encounter a valid case (e.g. ConnectionClosedException)
that halts the remote operation. Without manual intervention, this can
delay the region-in-transition by several minutes or even hours.
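
The existing behavior described above can be sketched roughly as follows. This is a simplified illustration, not the actual RSProcedureDispatcher code: the class `CurrentDispatchSketch`, the enums `ErrorKind` and `Action`, and the method `onError` are all hypothetical names introduced here, and `attemptsSoFar == 0` is assumed to mean the first try.

```java
// Hypothetical sketch of the current behavior; none of these names exist in HBase.
final class CurrentDispatchSketch {
    // Stand-ins for the exception types named in the description.
    enum ErrorKind { CALL_QUEUE_TOO_BIG, SASL, CONNECTION_CLOSED, OTHER }

    enum Action { FAIL_REMOTE_CALL, RETRY }

    // attemptsSoFar is the number of attempts made before the one that failed,
    // so 0 means the very first try (assumption of this sketch).
    static Action onError(ErrorKind error, int attemptsSoFar) {
        // For these errors the call is known not to have reached the regionserver.
        boolean neverReachedServer =
            error == ErrorKind.CALL_QUEUE_TOO_BIG || error == ErrorKind.SASL;
        if (attemptsSoFar == 0 && neverReachedServer) {
            // Safe to fail fast: the parent procedure can pick another target server.
            return Action.FAIL_REMOTE_CALL;
        }
        // Otherwise the dispatcher keeps retrying with no upper bound.
        return Action.RETRY;
    }
}
```

In this sketch a ConnectionClosedException always maps to `RETRY`, which is the unbounded loop that can stall the region-in-transition.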
The purpose of this Jira is to impose a retry limit for specific error types
so that, once the retry limit is reached, the master can recover from the
ongoing remote call failure by initiating an SCP (ServerCrashProcedure) on the
target server. The SCP overrides the TRSP
(TransitRegionStateProcedure) if required. This can ensure that the target
server has no region hosted online before we suspend the ongoing TRSP.
Scheduling an SCP for the target server will always leave the regionserver in a
stopped state: either the regionserver is stopped automatically, or, if it is
still able to send a region report to the master, the master rejects the
report, which in turn causes the regionserver to abort.
**Changes proposed:**
- Allow extending RSProcedureDispatcher
- RSProcedureDispatcher can impose retry limit for specific errors:
  - CallQueueTooBigException
  - SaslException
  - ConnectionClosedException
- Default retry limit: 5
- If retry limit is exhausted, schedule recovery through server crash. Let
  SCP override current procedure state.
- Tests for ConnectionClosedException
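
As a rough illustration of the proposed retry limit, the decision could look like the sketch below. It is not the code in this pull request: `RetryLimitSketch`, `ErrorKind`, `Action`, `LIMITED_ERRORS` and `onError` are hypothetical names, and the sketch assumes a single retry counter per remote call rather than one per error type, which the description does not specify.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of the proposed behavior; none of these names exist in HBase.
final class RetryLimitSketch {
    // Stand-ins for the exception types listed in the proposed changes.
    enum ErrorKind { CALL_QUEUE_TOO_BIG, SASL, CONNECTION_CLOSED, OTHER }

    enum Action { RETRY, SCHEDULE_SCP }

    // Default taken from the description above.
    static final int DEFAULT_RETRY_LIMIT = 5;

    // Error types that are subject to the retry limit.
    static final Set<ErrorKind> LIMITED_ERRORS = EnumSet.of(
        ErrorKind.CALL_QUEUE_TOO_BIG, ErrorKind.SASL, ErrorKind.CONNECTION_CLOSED);

    // retriesSoFar is the number of retries already performed for this remote call
    // (assumption: one shared counter, not one per error type).
    static Action onError(ErrorKind error, int retriesSoFar, int retryLimit) {
        if (LIMITED_ERRORS.contains(error) && retriesSoFar >= retryLimit) {
            // Limit exhausted: recover through ServerCrashProcedure on the target server.
            return Action.SCHEDULE_SCP;
        }
        // Below the limit, or an error type that is not limited: keep retrying.
        return Action.RETRY;
    }
}
```

For example, `RetryLimitSketch.onError(ErrorKind.CONNECTION_CLOSED, 5, DEFAULT_RETRY_LIMIT)` yields `SCHEDULE_SCP`, while the same error after 4 retries yields `RETRY`.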