[ https://issues.apache.org/jira/browse/HBASE-28608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Duo Zhang updated HBASE-28608:
------------------------------
    Fix Version/s: 2.7.0
                   3.0.0-beta-2
     Hadoop Flags: Incompatible change,Reviewed
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)

Pushed to branch-2+.

Thanks [~droudy] for contributing!

> More sensible client meta operation timeout default
> ---------------------------------------------------
>
>                 Key: HBASE-28608
>                 URL: https://issues.apache.org/jira/browse/HBASE-28608
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>    Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.8
>            Reporter: Daniel Roudnitsky
>            Assignee: Daniel Roudnitsky
>            Priority: Major
>              Labels: pull-request-available, timeout
>             Fix For: 2.7.0, 3.0.0-beta-2
>
>
> The HBase reference documents that the client meta operation timeout
> {{hbase.client.meta.operation.timeout}} defaults to the configured client
> operation timeout, but the implementation defaults it to the default client
> operation timeout of 20 minutes. I propose to set the default meta operation
> timeout to what is currently specified in the HBase reference.
> From "Timeout settings" in the HBase reference:
> {panel}
> A higher-level timeout is hbase.client.operation.timeout which is valid for
> each client call. When an RPC call fails for instance for a timeout due to
> hbase.rpc.timeout it will be retried until hbase.client.operation.timeout is
> reached. Client operation timeout for system tables can be fine tuned by
> setting hbase.client.meta.operation.timeout configuration value. When this is
> not set its value will use hbase.client.operation.timeout
> {panel}
> There seem to be two very different dependencies on the meta operation timeout:
> # End-to-end operation timeout for system table operations (2.x and 3)
> # Timeout to acquire {{userRegionLock}} to initiate a meta scan in
> {{locateRegionInMeta}} (2.x blocking client only, see HBASE-28730 for more
> detail/related work)
> For case 1, I believe it makes sense from a user perspective that the meta
> operation timeout, which is meant to apply to a specific subset of
> operations, respects the 'general' operation timeout that is configured
> if the more specific meta operation timeout is not explicitly set.
> For case 2 (blocking client), the default meta timeout value defeats the
> purpose of the {{userRegionLock}} timeout in a typical setup where
> {{hbase.client.operation.timeout}} << 20 minutes and
> {{hbase.client.meta.operation.timeout}} is not explicitly set. This can lead
> to operations taking much longer than the configured operation timeout to
> actually time out if there is, e.g., meta slowness and/or contention around
> {{userRegionLock}} on 2.x; see HBASE-28730 for more detail/related work.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
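
The difference between the default described as the current implementation and the default proposed in the issue can be sketched as follows. This is only an illustration under stated assumptions, not the committed patch: the class and method names are hypothetical, and a plain `Map` stands in for the HBase client configuration. The two property names and the 20 minute default come from the issue text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; not part of HBase.
public class MetaTimeoutDefaults {
    static final String OPERATION_TIMEOUT_KEY = "hbase.client.operation.timeout";
    static final String META_OPERATION_TIMEOUT_KEY = "hbase.client.meta.operation.timeout";
    // 20 minutes in milliseconds, the default client operation timeout cited in the issue.
    static final int DEFAULT_OPERATION_TIMEOUT_MS = 20 * 60 * 1000;

    // Reads an int property, returning the fallback when the key is unset.
    static int getInt(Map<String, String> conf, String key, int fallback) {
        String value = conf.get(key);
        return value == null ? fallback : Integer.parseInt(value);
    }

    // Behavior the issue describes as current: an unset meta timeout falls back
    // to the hard-coded 20 minute default, ignoring the configured operation timeout.
    static int metaTimeoutBefore(Map<String, String> conf) {
        return getInt(conf, META_OPERATION_TIMEOUT_KEY, DEFAULT_OPERATION_TIMEOUT_MS);
    }

    // Behavior the issue proposes: an unset meta timeout falls back to the
    // configured operation timeout, which itself defaults to 20 minutes.
    static int metaTimeoutAfter(Map<String, String> conf) {
        int operationTimeout = getInt(conf, OPERATION_TIMEOUT_KEY, DEFAULT_OPERATION_TIMEOUT_MS);
        return getInt(conf, META_OPERATION_TIMEOUT_KEY, operationTimeout);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(OPERATION_TIMEOUT_KEY, "30000"); // 30 s operation timeout, meta timeout unset
        System.out.println("before: " + metaTimeoutBefore(conf) + " ms");
        System.out.println("after:  " + metaTimeoutAfter(conf) + " ms");
    }
}
```

With only the operation timeout configured, the first method still yields the 20 minute default while the second follows the configured value; an explicitly set meta timeout wins in both.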