[ https://issues.apache.org/jira/browse/HBASE-28963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17896040#comment-17896040 ]
Ray Mattingly commented on HBASE-28963:
---------------------------------------

I'm attaching a profile of the HMaster showing that we're spending virtually all cycles on fetching cluster metrics. We have some internal auditing that pointed to the callers as our RegionServers, and some debug logging added to the RegionServers revealed these requests as originating from the following call stack:

!image-2024-11-06-12-06-44-317.png|width=961,height=598!

{noformat}
2024-11-05T21:22:21,024 [regionserver/na1-academic-nutty-snail:60020.Chore.1 {}] INFO org.apache.hadoop.hbase.client.HBaseAdmin: getClusterMetrics call stack:
java.base/java.lang.Thread.getStackTrace(Thread.java:2450)
org.apache.hadoop.hbase.client.HBaseAdmin.getClusterMetrics(HBaseAdmin.java:2307)
org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.updateQuotaFactors(QuotaCache.java:402)
org.apache.hadoop.hbase.quotas.QuotaCache$QuotaRefresherChore.chore(QuotaCache.java:267)
org.apache.hadoop.hbase.ScheduledChore.run(ScheduledChore.java:161)
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572)
java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:358)
java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
org.apache.hadoop.hbase.JitterScheduledThreadPoolExecutorImpl$JitteredRunnableScheduledFuture.run(JitterScheduledThreadPoolExecutorImpl.java:107)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base/java.lang.Thread.run(Thread.java:1583)
{noformat}

> Updating "Table Machine Quota Factors" is too expensive
> -------------------------------------------------------
>
> Key: HBASE-28963
> URL: https://issues.apache.org/jira/browse/HBASE-28963
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.6.1
> Reporter: Ray Mattingly
> Assignee: Ray Mattingly
>
> Priority: Major
> Attachments: image-2024-11-06-12-06-44-317.png, quota-refresh-hmaster.png
>
> My company is running Quotas across a few hundred clusters of varied size. One cluster has hundreds of servers and tens of thousands of regions. We noticed that the HMaster was quite busy on this cluster, and after some investigation we realized that the RegionServers were hammering the HMaster's ClusterMetrics endpoint to refresh their table machine quota factors.
>
> There are a few things that we could do here. In a perfect world, I think the RegionServers would have better peer-to-peer communication of the region states, and whatever else is necessary, to derive new quota factors. Relying solely on the HMaster for this coordination creates a tricky bottleneck for the horizontal scalability of clusters.
>
> That said, I think a simpler and preferable initial step would be to make our code a bit more cost conscious. At my company, for example, we don't define any table-scoped quotas at all. Without any table-scoped quotas in the cache, our cache could be much more thoughtful about the work it chooses to do on each refresh. So I'm proposing that we check [the size of the tableQuotaCache keyset|https://github.com/apache/hbase/blob/db3ba44a4c692d26e70b6030fc519e92fd79f638/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/QuotaCache.java#L418] earlier, and use this inference to determine which ClusterMetrics we bother to fetch.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
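To make the proposed gating concrete, here is a minimal, self-contained sketch of the decision the refresher chore could make. It models the idea only: `MetricsOption` is a stand-in enum (the real type would be `ClusterMetrics.Option`), and `optionsToFetch` is a hypothetical helper, not existing HBase code.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical sketch of the proposed cost-conscious quota refresh.
// MetricsOption is a stand-in for org.apache.hadoop.hbase.ClusterMetrics.Option
// so this example compiles on its own.
public class QuotaRefreshSketch {

  enum MetricsOption { SERVERS_NAME, LIVE_SERVERS }

  /**
   * Decide which cluster-metrics options to request based on whether any
   * table-scoped quotas are cached. With no table quotas, the machine quota
   * factor only needs the live server list, so the expensive per-region
   * metrics can be skipped entirely.
   */
  static EnumSet<MetricsOption> optionsToFetch(Set<String> tableQuotaCacheKeys) {
    if (tableQuotaCacheKeys.isEmpty()) {
      // Cheap path: server names alone are enough to compute the machine factor.
      return EnumSet.of(MetricsOption.SERVERS_NAME);
    }
    // Table quotas exist: fetch the richer per-server metrics needed to
    // derive per-table machine factors from region distribution.
    return EnumSet.of(MetricsOption.SERVERS_NAME, MetricsOption.LIVE_SERVERS);
  }

  public static void main(String[] args) {
    System.out.println(optionsToFetch(Set.of()));         // prints [SERVERS_NAME]
    System.out.println(optionsToFetch(Set.of("t1")).size()); // prints 2
  }
}
```

The key point is that the keyset check happens before the `getClusterMetrics` call rather than after it, so clusters with no table-scoped quotas never pay for the heavyweight metrics at all.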