[
https://issues.apache.org/jira/browse/HADOOP-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375234#comment-17375234
]
Anoop Sam John edited comment on HADOOP-17770 at 7/6/21, 5:38 AM:
------------------------------------------------------------------
Ran the perf test below with an HBase workload.
The cluster has 3 Region Server (RS) nodes. The Region Server node SKU is DS13 v2.
Storage is a premium Block blob storage account.
Region Server Xmx = 32 GB
Created one table with 15 presplits (so that each RS has 5 regions).
Loaded 100 GB of data using the HBase PE tool. Single column per row, with a
1 KB row size.
The table is NOT major compacted. Every region has 2 or 3 HFiles under it.
Ran random read PE tests (every read request is a multi get fetching 300 rows).
Ran with a varying number of tests.
Case 1 ensures a 0% cache hit ratio at all times (data caching is skipped).
Case 2 is where the RS uses a file mode cache with a cache size of ~50% of the
data size. The cache hit ratio is ~50%.
Case #1
!Throughput_0 cache hit.png!
!AvgLatency_0 cache hit.png!
Case #2
!Throughput_50% cache hit.png!
!AvgLatency_50% cache hit.png!
> WASB : Support disabling buffered reads in positional reads
> -----------------------------------------------------------
>
> Key: HADOOP-17770
> URL: https://issues.apache.org/jira/browse/HADOOP-17770
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: AvgLatency_0 cache hit.png, AvgLatency_50% cache
> hit.png, Throughput_0 cache hit.png, Throughput_50% cache hit.png
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> This is just like HADOOP-17038.
> Right now a positional read does a seek to the position, reads, and then
> seeks back to the old position (as per the impl in the super class).
> HBase-style workloads rely mostly on short preads (64 KB by default). So it
> would be ideal to support a pure positional read API which does not keep the
> data in a buffer and reads only the data the caller asked for (not reading
> ahead more data as per the read size config).
> Allow an optional boolean config to be specified while opening a file for
> read, using which buffered pread can be disabled.
> FutureDataInputStreamBuilder openFile(Path path)
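
To make the difference described above concrete, here is a minimal sketch that contrasts the two read styles on a local file. It is not the WASB code from the patch: the class name PreadSketch and both method names are hypothetical, and java.io.RandomAccessFile and java.nio.channels.FileChannel only stand in for the blob input stream. The first method emulates a positional read by seeking, reading, and seeking back; the second reads only the requested range and never moves the current position.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class PreadSketch {

    // Emulated positional read: seek to the target, read, then seek back.
    // The file pointer is shared state, so concurrent callers must take a lock.
    public static int seekReadSeekBack(RandomAccessFile in, long pos,
                                       byte[] buf, int off, int len) throws IOException {
        synchronized (in) {
            long oldPos = in.getFilePointer();
            try {
                in.seek(pos);
                return in.read(buf, off, len);
            } finally {
                in.seek(oldPos); // restore the caller-visible position
            }
        }
    }

    // Pure positional read: fetches only the requested range and does not
    // touch the channel's current position, so no lock is needed here.
    public static int purePread(FileChannel ch, long pos,
                                byte[] buf, int off, int len) throws IOException {
        ByteBuffer dst = ByteBuffer.wrap(buf, off, len);
        int total = 0;
        while (dst.hasRemaining()) {
            int n = ch.read(dst, pos + total);
            if (n < 0) {
                break; // reached end of file
            }
            total += n;
        }
        return (total == 0 && len > 0) ? -1 : total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pread-sketch", ".bin");
        try {
            byte[] data = new byte[256];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(tmp, data);

            byte[] a = new byte[16];
            byte[] b = new byte[16];
            try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r");
                 FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                raf.seek(10); // pretend a sequential reader is parked here
                seekReadSeekBack(raf, 100, a, 0, a.length);
                purePread(ch, 100, b, 0, b.length);
                System.out.println("position after emulated pread: " + raf.getFilePointer());
                System.out.println("position after pure pread: " + ch.position());
                System.out.println("same bytes read: " + Arrays.equals(a, b));
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both calls return the same bytes. The difference is that the emulated variant has to move and restore a shared position, which is why it needs a lock, while the pure variant leaves that state alone.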