[ https://issues.apache.org/jira/browse/HBASE-28328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shubham Roy updated HBASE-28328:
--------------------------------
    Release Note: The JIRA adds a feature in RowCounter tool to count the various types of delete markers (DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION) and the number of rows containing at least one delete marker. The feature can be enabled by passing the flag --countDeleteMarkers as a CLI option. When the feature is enabled, raw scan is performed without FirstKeyOnlyFilter.

> Add an option to count different types of Delete Markers in RowCounter
> ----------------------------------------------------------------------
>
>                 Key: HBASE-28328
>                 URL: https://issues.apache.org/jira/browse/HBASE-28328
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapreduce
>            Reporter: Himanshu Gwalani
>            Assignee: Shubham Roy
>            Priority: Minor
>              Labels: pull-request-available
>
> Add an option (count-delete-markers) to the
> [RowCounter|https://github.com/apache/hbase/blob/8a9ad0736621fa1b00b5ae90529ca6065f88c67f/hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java#L240C62-L240C75]
> tool to count the number of Delete Markers of all types, i.e. (DELETE,
> DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION)
> We already have such a feature within our internal implementation of
> RowCounter and it's very useful.
> Implementation Ideas:
> 1. If the option is passed, initialize the empty job counters for all 4 types
> of deletes.
> 2. Within mapper, increase the respective delete counts while processing each
> row.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
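
For illustration only, the two implementation ideas in the issue description could be sketched roughly as below. This is not the RowCounter code and does not use the HBase or MapReduce APIs: CellKind, Counters and countRow are hypothetical stand-ins for the cell types returned by a raw scan, the job counters, and the per-row work done in the mapper. The "rows with at least one delete marker" counter follows the release note.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class DeleteMarkerCountSketch {

    // Hypothetical stand-in for the kinds of cells a raw scan can return.
    enum CellKind { PUT, DELETE, DELETE_COLUMN, DELETE_FAMILY, DELETE_FAMILY_VERSION }

    // Hypothetical stand-in for the job counters.
    static final class Counters {
        final Map<CellKind, Long> deleteMarkers = new EnumMap<>(CellKind.class);
        long rows = 0;
        long rowsWithDeleteMarker = 0;

        Counters() {
            // Idea 1: create every delete counter up front, starting at zero,
            // so that each type is reported even if it never occurs.
            for (CellKind kind : CellKind.values()) {
                if (kind != CellKind.PUT) {
                    deleteMarkers.put(kind, 0L);
                }
            }
        }
    }

    // Idea 2: invoked once per row, in the way a mapper is invoked per row.
    static void countRow(List<CellKind> cellsInRow, Counters counters) {
        counters.rows++;
        boolean sawDeleteMarker = false;
        for (CellKind kind : cellsInRow) {
            if (kind != CellKind.PUT) {
                counters.deleteMarkers.merge(kind, 1L, Long::sum);
                sawDeleteMarker = true;
            }
        }
        if (sawDeleteMarker) {
            counters.rowsWithDeleteMarker++;
        }
    }

    public static void main(String[] args) {
        Counters counters = new Counters();
        countRow(List.of(CellKind.PUT, CellKind.DELETE_COLUMN), counters);
        countRow(List.of(CellKind.PUT), counters);
        countRow(List.of(CellKind.DELETE_FAMILY, CellKind.DELETE_FAMILY, CellKind.DELETE), counters);

        System.out.println("rows=" + counters.rows);
        System.out.println("rowsWithDeleteMarker=" + counters.rowsWithDeleteMarker);
        System.out.println(counters.deleteMarkers);
    }
}
```

A row holding only puts increments the row count and nothing else, which is why the release note's raw scan without FirstKeyOnlyFilter matters: every cell of the row has to be visible for its markers to be counted.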