[jira] [Updated] (HBASE-28837) Add row statistics coprocessor

Evelyn Boland (Jira) Tue, 15 Oct 2024 06:27:05 -0700


     [ https://issues.apache.org/jira/browse/HBASE-28837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Evelyn Boland updated HBASE-28837:
----------------------------------
    Description: 
Goal:

Add a coprocessor to HBase that allows administrators to track high-level 
statistics on the rows and cells in their HBase tables. Administrators can load 
this coprocessor into their RegionServers if they wish to gain more visibility 
into the shape of their data in HBase.

At my day job, we've leveraged the statistics from this coprocessor to 
automatically configure more optimal block sizes and smarter compaction 
schedules for our fleet of nearly 200 HBase clusters.

Context:

Since HBase tables can store terabytes or even petabytes of data, HBase 
administrators often have incomplete information about the data stored in their 
HBase tables. Without a comprehensive understanding of the shape of their data, 
it can be difficult for administrators to configure clusters for a desired 
level of performance and/or reliability. Row statistics have the potential to 
supercharge HBase management.

[Design 
doc|https://docs.google.com/document/d/1oaNAZUER5zO8yivmzRBVAMmL6r2cYiJn9YCbDe14LMw/edit#heading=h.nch5d72p27ex]

  was:
Goal:

Add a coprocessor to HBase that allows administrators to track high-level 
statistics on the rows and cells in their HBase tables. Administrators can load 
this coprocessor into their RegionServers if they wish to gain more visibility 
into the shape of their data in HBase.

At my day job, we've leveraged the statistics from this coprocessor to 
automatically configure more optimal block sizes and smarter compaction 
schedules for our fleet of nearly 200 HBase clusters.

Context:

Since HBase tables can store terabytes or even petabytes of data, HBase 
administrators often have incomplete information about the data stored in their 
HBase tables. Without a comprehensive understanding of the shape of their data, 
it can be difficult for administrators to configure clusters for a desired 
level of performance and/or reliability. Row statistics have the potential to 
supercharge HBase management.


> Add row statistics coprocessor
> ------------------------------
>
>                 Key: HBASE-28837
>                 URL: https://issues.apache.org/jira/browse/HBASE-28837
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0.0, 3.0.0-beta-1
>            Reporter: Evelyn Boland
>            Assignee: Evelyn Boland
>            Priority: Major
>              Labels: pull-request-available
>
> Goal:
>
> Add a coprocessor to HBase that allows administrators to track high-level 
> statistics on the rows and cells in their HBase tables. Administrators can 
> load this coprocessor into their RegionServers if they wish to gain more 
> visibility into the shape of their data in HBase.
>
> At my day job, we've leveraged the statistics from this coprocessor to 
> automatically configure more optimal block sizes and smarter compaction 
> schedules for our fleet of nearly 200 HBase clusters.
>
> Context:
>
> Since HBase tables can store terabytes or even petabytes of data, HBase 
> administrators often have incomplete information about the data stored in 
> their HBase tables. Without a comprehensive understanding of the shape of 
> their data, it can be difficult for administrators to configure clusters for 
> a desired level of performance and/or reliability. Row statistics have the 
> potential to supercharge HBase management.
>
> [Design 
> doc|https://docs.google.com/document/d/1oaNAZUER5zO8yivmzRBVAMmL6r2cYiJn9YCbDe14LMw/edit#heading=h.nch5d72p27ex]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
