gyz-web opened a new pull request, #7602:
URL: https://github.com/apache/hadoop/pull/7602
When the Router forwards read requests to the Observer and the cluster is
under a heavy write workload, the Observer NameNode may fail to keep pace with
edit log synchronization. This can still happen even when the
dfs.ha.tail-edits.in-progress parameter is configured. The result is a
"RetriableException: Observer Node is too far behind" error. The problem is
worse when the client's ipc.client.ping parameter is set to true: the client
keeps waiting and retrying, so the application cannot obtain the data it needs
in a timely manner. In this situation we should consider having the Active
NameNode handle the request instead.
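
The intended behaviour can be illustrated with a minimal, self-contained
sketch. It is not the code in this patch and does not use the real Router or
NameNode classes: ObserverFallbackSketch, ObserverTooFarBehindException,
readWithFallback and maxObserverAttempts are hypothetical names chosen only for
illustration. The idea is that a read is tried on the Observer a bounded number
of times and, if the Observer keeps reporting that it is too far behind, the
read is sent to the Active instead of retrying indefinitely.

```java
import java.util.function.Supplier;

public class ObserverFallbackSketch {

    /** Hypothetical stand-in for "RetriableException: Observer Node is too far behind". */
    public static class ObserverTooFarBehindException extends RuntimeException {
        public ObserverTooFarBehindException(String message) {
            super(message);
        }
    }

    /**
     * Tries the Observer up to maxObserverAttempts times. If every attempt
     * fails because the Observer is too far behind, the read is served by
     * the Active instead.
     */
    public static <T> T readWithFallback(Supplier<T> observerRead,
                                         Supplier<T> activeRead,
                                         int maxObserverAttempts) {
        for (int attempt = 1; attempt <= maxObserverAttempts; attempt++) {
            try {
                return observerRead.get();
            } catch (ObserverTooFarBehindException e) {
                // Observer is lagging; try again until the attempt budget is spent.
            }
        }
        // Attempt budget exhausted: let the Active handle the read.
        return activeRead.get();
    }

    public static void main(String[] args) {
        // Simulated Observer that always lags behind the Active.
        Supplier<String> laggingObserver = () -> {
            throw new ObserverTooFarBehindException("Observer Node is too far behind");
        };
        Supplier<String> active = () -> "result from Active";

        System.out.println(readWithFallback(laggingObserver, active, 2));
    }
}
```

Bounding the number of Observer attempts is one possible design choice; a
limit based on elapsed time or on the state ID gap would follow the same
pattern.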
Here are some of the errors we observed and the verification of the fix:
1. The state ID of the Observer is too far behind the Active:

2. RetriableException:

3. Verification of the fix:

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]