[
https://issues.apache.org/jira/browse/HBASE-30115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
JinHyuk Kim updated HBASE-30115:
--------------------------------
Description:
h1. Background
Currently, {{TableRecordReaderImpl.getProgress()}} always returns {*}0{*},
providing no progress feedback to the MapReduce framework. This makes it
impossible for users to monitor scan progress during long-running jobs.
!mapreduce-progress-0.png|width=1095,height=236!
h1. Suggestion
This patch estimates progress by converting row keys to numeric values and
computing the fraction of the key space covered so far:
{{(current - start) / (stop - start)}}.
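As an illustration of the formula above, here is a minimal sketch that treats row keys as unsigned big-endian integers. It is not the code from the patch: the class name {{ByteProgressSketch}}, the zero-padding of shorter keys, and the clamping to [0, 1] are all assumptions made for the example.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.MathContext;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical sketch of byte-based progress; not the class from the patch. */
final class ByteProgressSketch {

  /** Right-pads the key with 0x00 to width bytes and reads it as an unsigned integer. */
  static BigInteger toNumber(byte[] key, int width) {
    return new BigInteger(1, Arrays.copyOf(key, width));
  }

  /** Returns (current - start) / (stop - start), clamped to [0, 1]. */
  static float progress(byte[] start, byte[] stop, byte[] current) {
    // Assumption: compare all keys at the width of the longest one.
    int width = Math.max(start.length, Math.max(stop.length, current.length));
    BigInteger s = toNumber(start, width);
    BigInteger range = toNumber(stop, width).subtract(s);
    if (range.signum() <= 0) {
      return 0f; // empty or inverted range: no meaningful estimate
    }
    BigInteger done = toNumber(current, width).subtract(s);
    double fraction = new BigDecimal(done)
        .divide(new BigDecimal(range), MathContext.DECIMAL64)
        .doubleValue();
    return (float) Math.max(0.0, Math.min(1.0, fraction));
  }

  /** Convenience overload for ASCII keys, used only for the demo below. */
  static float progress(String start, String stop, String current) {
    return progress(
        start.getBytes(StandardCharsets.US_ASCII),
        stop.getBytes(StandardCharsets.US_ASCII),
        current.getBytes(StandardCharsets.US_ASCII));
  }

  public static void main(String[] args) {
    // Keys "a".."c" with the scan currently at "b": halfway through the range.
    System.out.println(progress("a", "c", "b")); // prints 0.5
  }
}
```

{{BigInteger}} is used here so that keys of any length can be compared without overflow; a real implementation might prefer a fixed number of leading bytes for speed.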
Since the {{TableInputFormat}} splitter sets start/stop row keys from region
boundaries, they are only empty for the table's very first region (empty start)
or last region (empty stop). In those cases, we *probe* the table with a
forward or reverse scan (limit 1) to discover the actual boundary row key.
The implementation is pluggable via {{hbase.mapreduce.rowkey.progress.class}}
configuration:
* {{ByteBasedRowKeyProgress}} (default): treats row keys as raw bytes. Works
well for most key designs.
* {{HexPrefixRowKeyProgress}}: interprets leading bytes as hex characters
([0-9a-f]). Gives accurate linear progress for tables using hex-encoded hash
prefixes (e.g. MD5). The raw byte approach is inaccurate for hex keys because
there are large byte gaps between '9'→'a' (0x39→0x61) and between "0f"→"10"
(0x3066→0x3130) that don't correspond to actual key distance. The prefix length
is configurable via {{hbase.mapreduce.rowkey.progress.hex.prefix.length}}
(default 4). Bytes beyond the prefix are ignored, so non-hex suffixes do not
affect progress.
* Users can implement the {{RowKeyProgress}} interface for custom key encoding
strategies.
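To make the hex-prefix idea concrete, the following sketch parses only the leading hex characters of each key and interpolates on those values. The interface shape ({{RowKeyProgressSketch}} with a single {{progress}} method) is a guess for illustration and may not match the actual {{RowKeyProgress}} interface; treating a non-hex byte inside the prefix as 0 and capping the prefix at 15 characters are also assumptions of this sketch.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical shape of the pluggable estimator; the real interface may differ. */
interface RowKeyProgressSketch {
  float progress(byte[] start, byte[] stop, byte[] current);
}

/** Hypothetical hex-prefix estimator; not the class from the patch. */
final class HexPrefixProgressSketch implements RowKeyProgressSketch {
  private final int prefixLength;

  HexPrefixProgressSketch(int prefixLength) {
    if (prefixLength < 1 || prefixLength > 15) {
      throw new IllegalArgumentException("prefixLength must be in [1, 15] to fit a long");
    }
    this.prefixLength = prefixLength;
  }

  /** Reads the first prefixLength bytes as base-16 digits; missing positions count as 0. */
  long toNumber(byte[] key) {
    long value = 0;
    for (int i = 0; i < prefixLength; i++) {
      int digit = i < key.length ? Character.digit((char) (key[i] & 0xFF), 16) : 0;
      if (digit < 0) {
        digit = 0; // assumption: a non-hex byte inside the prefix counts as 0
      }
      value = value * 16 + digit;
    }
    return value;
  }

  @Override
  public float progress(byte[] start, byte[] stop, byte[] current) {
    long s = toNumber(start);
    long e = toNumber(stop);
    if (e <= s) {
      return 0f; // empty or inverted range: no meaningful estimate
    }
    double fraction = (double) (toNumber(current) - s) / (e - s);
    return (float) Math.max(0.0, Math.min(1.0, fraction));
  }

  /** Helper for the demo: encodes an ASCII string as bytes. */
  static byte[] ascii(String s) {
    return s.getBytes(StandardCharsets.US_ASCII);
  }

  public static void main(String[] args) {
    RowKeyProgressSketch estimator = new HexPrefixProgressSketch(2);
    // The suffix "-user42" lies beyond the 2-character prefix and is ignored.
    System.out.println(
        estimator.progress(ascii("00"), ascii("80"), ascii("40-user42"))); // prints 0.5
  }
}
```

Because each hex character contributes exactly one base-16 digit, the '9'→'a' byte gap disappears and adjacent prefixes are always one unit apart.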
After this change, the job reports scan progress as shown in the screenshot below:
!mapreduce-progress-after.png|width=1792,height=119!
h2. Why a pluggable estimator (and the hex variant) is needed
The default {{ByteBasedRowKeyProgress}} assumes row keys span the full
0x00–0xFF byte range. But hex-encoded hash prefixes (MD5/SHA, the most common
salting scheme) only use the characters {{[0-9a-f]}}. The gap between
{{'9' (0x39)}} and {{'a' (0x61)}} contains 39 byte values (0x3A–0x60) that no
such key ever occupies, so byte-level interpolation is strongly non-linear.
Concrete example: scan from {{09}} to {{a1}} (see attached graph):
|| Real progress || {{ByteBased}} || {{HexPrefix}} ||
| 50% (key {{50}}) | ~10% | ~47% |
| 88% (key {{90}}) | ~18% | ~89% |
| 99% (key {{a0}}) | ~100% | ~99% |
{{ByteBased}} stays under 20% for nearly the whole job, then jumps to almost
100% the instant the scan crosses into {{a*}}. This makes YARN progress bars
and ETA estimates unreliable.
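The {{ByteBased}} and {{HexPrefix}} columns can be reproduced with a few lines of arithmetic. The sketch below is only a check of the numbers, not the estimators from the patch; it assumes all keys are two ASCII hex characters, and the class and method names are made up for the example.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

/** Hypothetical arithmetic check for the 09..a1 example; assumes equal-length ASCII hex keys. */
final class ProgressComparisonSketch {

  /** Reads the raw ASCII bytes of the key as an unsigned big-endian integer. */
  static double raw(String key) {
    return new BigInteger(1, key.getBytes(StandardCharsets.US_ASCII)).doubleValue();
  }

  /** Byte-based estimate: interpolate on the raw byte values. */
  static double byteBased(String start, String stop, String current) {
    double s = raw(start);
    return (raw(current) - s) / (raw(stop) - s);
  }

  /** Hex-prefix estimate: interpolate on the keys parsed as base-16 numbers. */
  static double hexPrefix(String start, String stop, String current) {
    double s = Integer.parseInt(start, 16);
    return (Integer.parseInt(current, 16) - s) / (Integer.parseInt(stop, 16) - s);
  }

  public static void main(String[] args) {
    for (String key : new String[] {"50", "90", "a0"}) {
      System.out.printf(Locale.ROOT, "key %s: byte-based %.1f%%, hex-prefix %.1f%%%n",
          key, 100 * byteBased("09", "a1", key), 100 * hexPrefix("09", "a1", key));
    }
  }
}
```

For the byte-based case the range is 0x3039 to 0x6131 (12345 to 24881), so key {{50}} (0x3530 = 13616) gives 1271 / 12536, about 10.1%, and key {{90}} gives about 18.3%. For the hex case the range is 0x09 to 0xa1 (9 to 161), so key {{50}} gives 71 / 152, about 46.7%, and key {{90}} gives about 88.8%.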
was:
h1. Background
Currently, {{TableRecordReaderImpl.getProgress()}} always returns {*}0{*},
providing no progress feedback to the MapReduce framework. This makes it
impossible for users to monitor scan progress during long-running jobs.
!mapreduce-progress-0.png|width=1095,height=236!
h1. Suggestion
This patch estimates progress by converting row keys to numeric values and
computing the fraction of the key space covered so far: {{{}(current - start) /
(stop - start){}}}.
Since the {{TableInputFormat}} splitter sets start/stop row keys from region
boundaries, they are only empty for the table's very first region (empty start)
or last region (empty stop). In those cases, we *probe* the table with a
forward or reverse scan (limit 1) to discover the actual boundary row key.
The implementation is pluggable via {{hbase.mapreduce.rowkey.progress.class}}
configuration:
* {{ByteBasedRowKeyProgress}} (default) : treats row keys as raw bytes. Works
well for most key designs.
* {{HexPrefixRowKeyProgress}} : interprets leading bytes as hex characters
([0-9a-f]). Gives accurate linear progress for tables using hex-encoded hash
prefixes (e.g. MD5). The raw byte approach is inaccurate for hex keys because
there are large byte gaps between '9'→'a' (0x39→0x61) and between "0f"→"10"
(0x3066→0x3130) that don't correspond to actual key distance. The prefix length
is configurable via {{hbase.mapreduce.rowkey.progress.hex.prefix.length}}
(default 4). Bytes beyond the prefix are ignored, so non-hex suffixes do not
affect progress.
* Users can implement the {{RowKeyProgress}} interface for custom key encoding
strategies.
After this change, you can monitor the progress in this way.
!mapreduce-progress-after.png|width=1792,height=119!
> Introduce approximate progress estimation for TableRecordReader based on row
> key position
> -----------------------------------------------------------------------------------------
>
> Key: HBASE-30115
> URL: https://issues.apache.org/jira/browse/HBASE-30115
> Project: HBase
> Issue Type: Task
> Components: mapreduce
> Reporter: JinHyuk Kim
> Assignee: JinHyuk Kim
> Priority: Minor
> Labels: pull-request-available
> Attachments: byte-based-vs-hex.png, mapreduce-progress-0.png,
> mapreduce-progress-after.png
>