[
https://issues.apache.org/jira/browse/HBASE-29265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17946446#comment-17946446
]
Hernan Gelaf-Romer commented on HBASE-29265:
--------------------------------------------
I need to amend this: I no longer think that
RetriesExhaustedWithDetailsException can lead to a meta cache clearing
exception. I traced the code path a little deeper and realized the cause is
likely something else. I think a SocketTimeoutException can still surface to
the client, even in cases where we could have thrown an
OperationTimeoutExceededException instead. I think we're hitting the race
condition explained by the comment here:
https://github.com/apache/hbase/blob/a8ff965536fda48bbb6d1f77b53a55e43b8d9461/hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestClientOperationTimeout.java#L193
> RetriesExhaustedWithDetailsException can create a pathological feedback loop
> with multigets
> -------------------------------------------------------------------------------------------
>
> Key: HBASE-29265
> URL: https://issues.apache.org/jira/browse/HBASE-29265
> Project: HBase
> Issue Type: Improvement
> Reporter: Hernan Gelaf-Romer
> Assignee: Hernan Gelaf-Romer
> Priority: Major
>
> Similar to https://issues.apache.org/jira/browse/HBASE-27487
>
> RetriesExhaustedWithDetailsException currently obscures that the underlying
> exception(s) may be OperationTimeoutExceededException. Because of this, we
> can still run into situations where slow requests trigger a flood of meta
> cache clearing exceptions and hotspot the meta table.
>
> We should update our exception handling logic to special-case these
> exceptions and explicitly check whether the underlying root cause of the
> request failures was an operation timeout.
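
The check proposed in the description could look roughly like the sketch
below. This is an illustration only, not the actual HBase change: the two
exception classes are simplified stand-ins defined locally so the example
compiles on its own, and the class name TimeoutCauseCheck and the method names
are hypothetical. It also assumes that a batch failure counts as a timeout
only when every per-action cause is a timeout, which is a policy choice the
issue does not specify.

```java
import java.util.List;

public class TimeoutCauseCheck {
    // Simplified stand-in for HBase's OperationTimeoutExceededException (assumption).
    static class OperationTimeoutExceededException extends RuntimeException {
        OperationTimeoutExceededException(String msg) { super(msg); }
    }

    // Simplified stand-in for RetriesExhaustedWithDetailsException (assumption):
    // it carries one cause per failed action in the batch.
    static class RetriesExhaustedWithDetailsException extends RuntimeException {
        private final List<Throwable> causes;
        RetriesExhaustedWithDetailsException(List<Throwable> causes) {
            super("Failed " + causes.size() + " actions");
            this.causes = causes;
        }
        List<Throwable> getCauses() { return causes; }
    }

    // Walks the cause chain of a single throwable looking for an operation timeout.
    // The depth limit guards against a cyclic cause chain.
    static boolean hasTimeoutInChain(Throwable t) {
        for (int depth = 0; t != null && depth < 32; t = t.getCause(), depth++) {
            if (t instanceof OperationTimeoutExceededException) {
                return true;
            }
        }
        return false;
    }

    // Returns true when the failure was caused by operation timeouts, unwrapping
    // the batch exception so its per-action causes are inspected individually.
    static boolean isCausedByOperationTimeout(Throwable t) {
        if (t instanceof RetriesExhaustedWithDetailsException) {
            List<Throwable> causes = ((RetriesExhaustedWithDetailsException) t).getCauses();
            // Assumed policy: every per-action cause must be a timeout.
            return !causes.isEmpty()
                && causes.stream().allMatch(TimeoutCauseCheck::hasTimeoutInChain);
        }
        return hasTimeoutInChain(t);
    }
}
```

A caller deciding whether to clear the meta cache could skip the clear when
isCausedByOperationTimeout returns true, since a timeout says nothing about
whether the cached region location is stale.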
--
This message was sent by Atlassian Jira
(v8.20.10#820010)