[ https://issues.apache.org/jira/browse/HBASE-28608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Roudnitsky updated HBASE-28608:
--------------------------------------
    Environment:     (was: Client meta operation timeout hbase.client.meta.operation.timeout will now default to the value of the end to end operation timeout hbase.client.operation.timeout. Previously, the meta operation timeout would default to 20 minutes (which is the default value of hbase.client.operation.timeout).
)

> More sensible client meta operation timeout default
> ---------------------------------------------------
>
>                 Key: HBASE-28608
>                 URL: https://issues.apache.org/jira/browse/HBASE-28608
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>    Affects Versions: 2.6.0, 2.4.17, 3.0.0-beta-1, 2.5.8
>            Reporter: Daniel Roudnitsky
>            Assignee: Daniel Roudnitsky
>            Priority: Major
>              Labels: pull-request-available, timeout
>
> The HBase reference documents that the client meta operation timeout
> {{hbase.client.meta.operation.timeout}} defaults to the configured client
> operation timeout, but the implementation defaults it to the default client
> operation timeout of 20 minutes. I propose to set the default meta operation
> timeout to what is currently specified in the HBase reference.
>
> From "Timeout settings" in the HBase reference:
> {panel}
> A higher-level timeout is hbase.client.operation.timeout which is valid for
> each client call. When an RPC call fails for instance for a timeout due to
> hbase.rpc.timeout it will be retried until hbase.client.operation.timeout is
> reached. Client operation timeout for system tables can be fine tuned by
> setting hbase.client.meta.operation.timeout configuration value. When this is
> not set its value will use hbase.client.operation.timeout
> {panel}
>
> There seem to be two very different dependencies on the meta operation timeout:
> # End to end operation timeout for system table operations (2.x and 3)
> # Timeout to acquire {{userRegionLock}} to initiate a meta scan in
> {{locateRegionInMeta}} (2.x blocking client only, see HBASE-28730 for more
> detail/related work)
>
> For case 1, I believe it makes sense from a user perspective that the meta
> operation timeout, which is meant to apply to a specific subset of
> operations, respects the 'general' operation timeout that is configured
> if the more specific meta operation timeout is not explicitly set.
>
> For case 2 (blocking client), the default meta timeout value defeats the
> purpose of the {{userRegionLock}} timeout in a typical setup where
> {{hbase.client.operation.timeout}} << 20 minutes and
> {{hbase.client.meta.operation.timeout}} is not explicitly set. Operations can
> then take much longer than the configured operation timeout to actually time
> out if there is, e.g., meta slowness and/or contention around
> {{userRegionLock}} on 2.x; see HBASE-28730 for more detail/related work.
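
The difference between the current and the proposed default can be sketched in a few lines. This is only an illustration under stated assumptions, not the HBase client code: {{java.util.Properties}} stands in for the HBase configuration object, and the class and method names ({{MetaTimeoutDefaults}}, {{metaTimeoutCurrent}}, {{metaTimeoutProposed}}) are hypothetical. Only the two configuration keys and the 20 minute default come from the issue text.

```java
import java.util.Properties;

// Hypothetical sketch; Properties is a stand-in for the real client configuration.
public class MetaTimeoutDefaults {
    public static final String OPERATION_TIMEOUT_KEY = "hbase.client.operation.timeout";
    public static final String META_OPERATION_TIMEOUT_KEY = "hbase.client.meta.operation.timeout";
    // 20 minutes in milliseconds, the default operation timeout named in the issue.
    public static final int DEFAULT_OPERATION_TIMEOUT_MS = 20 * 60 * 1000;

    // Read an int setting, falling back to defaultValue when the key is unset.
    public static int getInt(Properties conf, String key, int defaultValue) {
        String value = conf.getProperty(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // End to end operation timeout: the configured value, else 20 minutes.
    public static int operationTimeout(Properties conf) {
        return getInt(conf, OPERATION_TIMEOUT_KEY, DEFAULT_OPERATION_TIMEOUT_MS);
    }

    // Behavior described as current: an unset meta timeout falls back to the
    // hard-coded 20 minute default and ignores the configured operation timeout.
    public static int metaTimeoutCurrent(Properties conf) {
        return getInt(conf, META_OPERATION_TIMEOUT_KEY, DEFAULT_OPERATION_TIMEOUT_MS);
    }

    // Proposed (and documented) behavior: an unset meta timeout falls back to
    // whatever the operation timeout resolves to.
    public static int metaTimeoutProposed(Properties conf) {
        return getInt(conf, META_OPERATION_TIMEOUT_KEY, operationTimeout(conf));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(OPERATION_TIMEOUT_KEY, "30000"); // 30 s, meta timeout unset

        System.out.println("current:  " + metaTimeoutCurrent(conf));  // prints "current:  1200000"
        System.out.println("proposed: " + metaTimeoutProposed(conf)); // prints "proposed: 30000"
    }
}
```

With a 30 second operation timeout and no explicit meta timeout, the sketch shows the gap the issue describes: 20 minutes under the current default versus 30 seconds under the proposed one. Once {{hbase.client.meta.operation.timeout}} is set explicitly, both variants return the same value.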