[ https://issues.apache.org/jira/browse/HBASE-29933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dev Hingu updated HBASE-29933:
------------------------------
Description:
The update_all_config command hangs indefinitely if an HMaster.balance() run is
in progress.
While HMaster.balance() runs, it holds the lock on the CacheAwareLoadBalancer
instance, and the HMaster thread sleeps while still holding that lock because of
throttling in CacheAwareLoadBalancer.
Meanwhile, onConfigurationChange() on the same CacheAwareLoadBalancer instance
(BaseLoadBalancer.onConfigurationChange in the trace) blocks waiting to acquire
the same lock.
Stack traces for both threads are attached:
1. HMaster thread:
{code:java}
#355 daemon prio=5 os_prio=0 cpu=2014.35ms elapsed=19554.89s tid=0x00007f3fd018b310 nid=0x5fda waiting on condition [0x00007f3f6a3f1000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep([email protected]/Native Method)
    at org.apache.hadoop.hbase.master.balancer.CacheAwareLoadBalancer.throttle(CacheAwareLoadBalancer.java:197)
    at org.apache.hadoop.hbase.master.HMaster.executeRegionPlansWithThrottling(HMaster.java:2164)
    at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:2122)
    - locked <0x000000070b7d1438> (a org.apache.hadoop.hbase.master.balancer.CacheAwareLoadBalancer)
    at org.apache.hadoop.hbase.master.HMaster.balance(HMaster.java:1998)
    at org.apache.hadoop.hbase.master.HMaster.balanceOrUpdateMetrics(HMaster.java:2010)
    - locked <0x000000070b7d1438> (a org.apache.hadoop.hbase.master.balancer.CacheAwareLoadBalancer)
    at org.apache.hadoop.hbase.master.balancer.BalancerChore.chore(BalancerChore.java:47)
    at org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:161)
{code}
2. Update configuration RPC thread:
{code:java}
#96 daemon prio=5 os_prio=0 cpu=523.03ms elapsed=19854.17s tid=0x00007f3ffbb47ed0 nid=0x48a4 waiting for monitor entry [0x00007f3f785fe000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.hbase.master.balancer.BaseLoadBalancer.onConfigurationChange(BaseLoadBalancer.java:785)
    - waiting to lock <0x000000070b7d1438> (a org.apache.hadoop.hbase.master.balancer.CacheAwareLoadBalancer)
    at org.apache.hadoop.hbase.conf.ConfigurationManager.notifyAllObservers(ConfigurationManager.java:110)
    - locked <0x000000070c9e2440> (a java.util.Collections$SetFromMap)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.updateConfiguration(HRegionServer.java:3927)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.updateConfiguration(RSRpcServices.java:3902)
{code}
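The contention pattern described above can be reproduced in isolation. The following is a minimal sketch, not HBase code: MonitorSleepDemo, ToyBalancer, balanceWithThrottle and measureConfigUpdateDelay are hypothetical names chosen for illustration. It assumes only what the traces show, namely that one thread sleeps inside a synchronized method while a second thread calls another synchronized method on the same object. Because Thread.sleep() does not release a held monitor, the second call waits for roughly the full sleep.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class MonitorSleepDemo {

    /** Hypothetical stand-in for a balancer whose monitor guards balancing and config updates. */
    static class ToyBalancer {
        // Mimics a balance run: sleeps for throttling while holding this object's monitor.
        synchronized void balanceWithThrottle(long throttleMillis, CountDownLatch entered)
                throws InterruptedException {
            entered.countDown();          // signal that the monitor is now held
            Thread.sleep(throttleMillis); // sleep() keeps the monitor locked
        }

        // Mimics a configuration callback that needs the same monitor.
        synchronized void onConfigurationChange() {
            // nothing to do; acquiring the monitor is the point
        }
    }

    /** Returns how many milliseconds the config update waited to enter the monitor. */
    static long measureConfigUpdateDelay(long throttleMillis) {
        ToyBalancer balancer = new ToyBalancer();
        CountDownLatch entered = new CountDownLatch(1);
        Thread chore = new Thread(() -> {
            try {
                balancer.balanceWithThrottle(throttleMillis, entered);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "toy-balancer-chore");
        chore.setDaemon(true);
        chore.start();
        try {
            entered.await();                  // the chore thread now owns the monitor
            long start = System.nanoTime();
            balancer.onConfigurationChange(); // BLOCKED until the chore leaves its method
            long waited = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            chore.join();
            return waited;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while measuring", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("config update waited ~" + measureConfigUpdateDelay(500) + " ms");
    }
}
```

In this toy version the wait is bounded by a single sleep. In the reported case the balancer thread holds the lock across the throttled execution of the region plans, so the configuration update appears to hang for as long as the balance run lasts.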
> update_all_config hangs indefinitely when balancing event is in progress
> ------------------------------------------------------------------------
>
> Key: HBASE-29933
> URL: https://issues.apache.org/jira/browse/HBASE-29933
> Project: HBase
> Issue Type: Bug
> Reporter: Dev Hingu
> Assignee: Dev Hingu
> Priority: Major