[ https://issues.apache.org/jira/browse/HBASE-28837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Evelyn Boland updated HBASE-28837:
----------------------------------
    Description:
Goal:

Add a coprocessor to HBase that allows administrators to track high level statistics on the rows and cells in their HBase tables. Administrators can load this coprocessor into their RegionServers if they wish to gain more visibility into the shape of their data in HBase.

At my day job, we've leveraged the statistics from this coprocessor to automatically configure more optimal block sizes and smarter compaction schedules for our fleet of nearly 200 HBase clusters.

Context:

Since HBase tables can store terabytes or even petabytes of data, HBase administrators often have incomplete information about the data stored in their HBase tables. Without a comprehensive understanding of the shape of their data, it can be difficult for administrators to configure clusters for a desired level of performance and/or reliability. Row statistics have the potential to supercharge HBase management.

[Design doc|https://docs.google.com/document/d/1oaNAZUER5zO8yivmzRBVAMmL6r2cYiJn9YCbDe14LMw/edit#heading=h.nch5d72p27ex]

  was:
Goal:

Add a coprocessor to HBase that allows administrators to track high level statistics on the rows and cells in their HBase tables. Administrators can load this coprocessor into their RegionServers if they wish to gain more visibility into the shape of their data in HBase.

At my day job, we've leveraged the statistics from this coprocessor to automatically configure more optimal block sizes and smarter compaction schedules for our fleet of nearly 200 HBase clusters.

Context:

Since HBase tables can store terabytes or even petabytes of data, HBase administrators often have incomplete information about the data stored in their HBase tables. Without a comprehensive understanding of the shape of their data, it can be difficult for administrators to configure clusters for a desired level of performance and/or reliability.
Row statistics have the potential to supercharge HBase management.


> Add row statistics coprocessor
> ------------------------------
>
>                 Key: HBASE-28837
>                 URL: https://issues.apache.org/jira/browse/HBASE-28837
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 2.0.0, 3.0.0-beta-1
>            Reporter: Evelyn Boland
>            Assignee: Evelyn Boland
>            Priority: Major
>              Labels: pull-request-available
>
> Goal:
> Add a coprocessor to HBase that allows administrators to track high level
> statistics on the rows and cells in their HBase tables. Administrators can
> load this coprocessor into their RegionServers if they wish to gain more
> visibility into the shape of their data in HBase.
> At my day job, we've leveraged the statistics from this coprocessor to
> automatically configure more optimal block sizes and smarter compaction
> schedules for our fleet of nearly 200 HBase clusters.
> Context:
> Since HBase tables can store terabytes or even petabytes of data, HBase
> administrators often have incomplete information about the data stored in
> their HBase tables. Without a comprehensive understanding of the shape of
> their data, it can be difficult for administrators to configure clusters for
> a desired level of performance and/or reliability. Row statistics have the
> potential to supercharge HBase management.
> [Design doc|https://docs.google.com/document/d/1oaNAZUER5zO8yivmzRBVAMmL6r2cYiJn9YCbDe14LMw/edit#heading=h.nch5d72p27ex]