[
https://issues.apache.org/jira/browse/HBASE-29103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17923912#comment-17923912
]
Becker Ewing commented on HBASE-29103:
--------------------------------------
I've gotten around to benchmarking master vs. this patch and I wanted to post
the results here.
I was primarily interested in how two paths that are reverse scan heavy would
behave:
# A long reverse scan on rows with moderately sized values (90 bytes)
# Random meta scans
_Note: these are the same paths that were heavily tested when the reverse scan path was overhauled in HBASE-28043 to be faster over storefiles with a non-RIV1 DBE._
*1. Reverse Scans Over Moderately Sized Values*
I hypothesize that this patch will help the most for reverse scans over rows
with moderately sized values. To set this test environment up, I did the
following:
1. Start a local HBase cluster:
{code:java}
hbase master start --localRegionServers=1 {code}
I used the default settings/properties as configured on the master branch.
2. Prepare the table state for benchmarking:
{code:java}
hbase pe --nomapred=true --valueSize=90 --blockEncoding=PREFIX --compress=GZ
randomWrite 1 {code}
3. Flush and compact the table under test to ensure that subsequent tests are
operating over an equivalent single storefile (& the memstore doesn't influence
results):
{code:java}
$ hbase shell
> flush 'TestTable'
> major_compact 'TestTable' {code}
4. For actual benchmarking, we'll be running the following command and
recording the results:
{code:java}
hbase pe --nomapred=true reverseScan 1 {code}
Typically the first run is a bit slower, since blocks have to be decompressed in
addition to the actual scanning work. I ran 3 tests: 1 cold and 2 directly
afterwards against a hot cache.
I got the following results:
||Benchmark||Revision||Time (s)||Throughput (rows/sec)||Throughput (MB/s)||
|Reverse Scan Run #1|master|15.435|67935|66.86|
|Reverse Scan Run #2|master|14.213|73776|72.61|
|Reverse Scan Run #3|master|13.650|76819|75.60|
|Reverse Scan Run #1|patch|14.536|72136|71.00|
|Reverse Scan Run #2|patch|13.395|78281|77.04|
|Reverse Scan Run #3|patch|13.586|77181|75.96|
I think it's clear that this patch gives a small but real improvement in this
case: averaging the two hot-cache runs, throughput goes from roughly 75,300
rows/sec on master to roughly 77,700 rows/sec with the patch, an improvement of
about 3%.
*2. Random Meta Scans*
Since we touch the reverse scan path here, I think it's only appropriate to
benchmark the system-critical hbase:meta read path. To set this test
environment up, I did the following:
1. Start a local HBase cluster:
{code:java}
hbase master start --localRegionServers=1 {code}
I used the default settings/properties as configured on the master branch.
2. Prepare the meta table for benchmarking:
{code:java}
hbase pe --nomapred=true metaWrite 1 {code}
3. Flush and compact the hbase:meta table to ensure that subsequent tests are
operating over an equivalent single storefile (& the memstore doesn't influence
results):
{noformat}
$ hbase shell
> flush 'hbase:meta'
> major_compact 'hbase:meta' {noformat}
4. For actual benchmarking, we'll be running the following command and
recording the results:
{code:java}
hbase pe --nomapred=true metaRandomRead 10 {code}
This test takes quite a long time, so I didn't have the patience for multiple
iterations. I only ran 1 test and assumed that the long runtime would smooth
out most of the run-to-run variability. That assumption could be wrong, so I'd
encourage follow-up verification.
I got the following results:
||Benchmark||Revision||Time (s)||Throughput (rows/sec)||Avg Latency (us)||
|Meta Random Read #1|master|643.407|16311|613|
|Meta Random Read #1|patch|633.051|16592|601|
This shows a modest improvement from master to the patch: average latency drops
from 613 us to 601 us, which is roughly the projected 2-3%.
The above test writes a ton of junk to the hbase:meta table. To clean up the
meta table afterwards, you can run:
{code:java}
hbase pe --nomapred=true cleanMeta 1 {code}
*Conclusion*
This isn't a groundbreaking performance improvement, but the patch does deliver
the incremental 2-3% improvement over master that memory profiling suggested
was theoretically possible. RegionServers on this patch will allocate less on
the heap, which will _always_ help a Java application. Mileage will vary, and
the improvement will be less noticeable on a RegionServer whose memory, heap,
and garbage collector settings are tuned correctly; however, on a RegionServer
running closer to the edge this will give a small boost. Additionally, since
this change touches the meta read path, it will help clusters that are running
regular workloads as well as reverse-scan-heavy workloads.
One result I can't explain is why the Meta Random Read benchmarks taken here on
master and the patch are so much better than those taken just a year and a half
ago in HBASE-28043. Obviously, the reverse scan benchmarks can't be compared
apples-to-apples given that we're using different value sizes; however, the
meta random read tests should be directly comparable. These tests show roughly
25% better average read latency, which is great, but I find it hard to explain,
especially given that I haven't upgraded hardware in that time. I think the
biggest difference is probably the underlying JDK: I'm running 21 now vs. 11
then.
> Avoid excessive allocations during reverse scanning when seeking to next row
> ----------------------------------------------------------------------------
>
> Key: HBASE-29103
> URL: https://issues.apache.org/jira/browse/HBASE-29103
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 3.0.0-beta-1, 2.6.1
> Reporter: Becker Ewing
> Assignee: Becker Ewing
> Priority: Major
> Attachments: high-block-cache-key-to-string-alloc-profile.html
>
>
> Currently, when we're reverse scanning in a storefile, the general path is to:
> # Seek to before the current row to find the prior row
> # Seek to the beginning of the prior row
> (this can get a bit more complex depending on how fast a single "seek"
> operation is; see HBASE-28043 for additional details).
>
> At step 1, we call HFileScanner#getCell and then always call
> PrivateCellUtil.createFirstOnRow() on this Cell instance
> ([code|https://github.com/apache/hbase/blob/b89c8259c5726395c9ae3a14919bd192252ca517/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileScanner.java#L611-L614]).
> PrivateCellUtil.createFirstOnRow() creates a [copy of only the row portion
> of this
> Cell|https://github.com/apache/hbase/blob/b89c8259c5726395c9ae3a14919bd192252ca517/hbase-common/src/main/java/org/apache/hadoop/hbase/PrivateCellUtil.java#L2768-L2775].
>
>
> I propose that, since we're only using the key portion of the cell returned by
> HFileScanner#getCell, we should instead call
> [HFileScanner#getKey|https://github.com/apache/hbase/blob/b89c8259c5726395c9ae3a14919bd192252ca517/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileScanner.java#L91-L96]
> in this scenario so we avoid deep-copying extra components of the Cell such
> as the value, tags, etc. This should be a safe change, as this Cell instance
> never escapes StoreFileScanner and we only call HFileScanner#getCell when the
> scanner is already seeked.
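>
> To make the shape of the proposed change concrete, here is a minimal sketch (not
> the actual patch: the class, method, and field names below are made up for
> illustration, and the exact Cell vs. ExtendedCell signatures differ between
> branches):
> {code:java}
> import org.apache.hadoop.hbase.Cell;
> import org.apache.hadoop.hbase.PrivateCellUtil;
> import org.apache.hadoop.hbase.io.hfile.HFileScanner;
>
> final class PreviousRowSketch {
>   private Cell previousRow;
>
>   // Current path (simplified): getCell() on an encoded seeker may materialize the
>   // whole cell (key + value + tags) just so createFirstOnRow can copy the row bytes.
>   void saveKeyToPreviousRowCurrent(HFileScanner hfs) {
>     previousRow = PrivateCellUtil.createFirstOnRow(hfs.getCell());
>   }
>
>   // Proposed path (simplified): ask the scanner for only the key portion, so the
>   // value/tags are never copied and the only copy left should be the row-only copy
>   // that createFirstOnRow already makes.
>   void saveKeyToPreviousRowProposed(HFileScanner hfs) {
>     previousRow = PrivateCellUtil.createFirstOnRow(hfs.getKey());
>   }
> }
> {code}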
>
> Attached is the same allocation profile taken to guide the optimizations in
> HBASE-29099, which shows that about 3% of allocations are spent in
> [BufferedEncodedSeeker.getCell in the body of
> seekBeforeAndSaveKeyToPreviousRow|https://github.com/apache/hbase/blob/b89c8259c5726395c9ae3a14919bd192252ca517/hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java#L284-L348].
> The region server in question was pinned at 100% CPU utilization for a
> while and was running a reverse-scan-heavy workload.