[jira] [Comment Edited] (HADOOP-17770) WASB : Support disabling buffered reads in positional reads

Anoop Sam John (Jira) Mon, 05 Jul 2021 22:39:15 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-17770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375234#comment-17375234
 ]


Anoop Sam John edited comment on HADOOP-17770 at 7/6/21, 5:38 AM:
------------------------------------------------------------------

Ran the perf test below with an HBase workload.
The cluster has 3 Region Server (RS) nodes.  Region Server node SKU is DS13 v2.  
Storage is a premium block blob storage account.
Region Server Xmx = 32 GB
Created one table with 15 presplits (so that there are 5 regions/RS).
Loaded 100 GB of data using the HBase PE tool.  Single column per row, with 
1 KB row size.
Table is NOT major compacted.  Every region has 2 or 3 HFiles under it.
Ran random read PE tests (every read request is a multi-get fetching 300 rows).  
Ran with different numbers of tests.
The 1st case makes sure there is always a 0% cache hit (data caching is skipped).
The 2nd case is where the RS has a file mode cache sized at ~50% of the data 
size, giving a cache hit ratio of ~50%.

Case #1
 !Throughput_0 cache hit.png! 
 !AvgLatency_0 cache hit.png! 

Case #2
 !Throughput_50% cache hit.png! 
 !AvgLatency_50% cache hit.png! 



was (Author: anoop.hbase):
Ran the perf test below with an HBase workload.
The cluster has 3 Region Server (RS) nodes.  Region Server node SKU is DS13 v2.  
Storage is a premium block blob storage account.
Region Server Xmx = 32 GB
Created one table with 15 presplits (so that there are 5 regions/RS).
Loaded 100 GB of data using the HBase PE tool.  Single column per row, with 
1 KB row size.
Ran random read PE tests (every read request is a multi-get fetching 300 rows).  
Ran with different numbers of tests.
The 1st case makes sure there is always a 0% cache hit (data caching is skipped).
The 2nd case is where the RS has a file mode cache sized at ~50% of the data 
size, giving a cache hit ratio of ~50%.

Case #1
 !Throughput_0 cache hit.png! 
 !AvgLatency_0 cache hit.png! 

Case #2
 !Throughput_50% cache hit.png! 
 !AvgLatency_50% cache hit.png! 


> WASB : Support disabling buffered reads in positional reads
> -----------------------------------------------------------
>
>                 Key: HADOOP-17770
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17770
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>         Attachments: AvgLatency_0 cache hit.png, AvgLatency_50% cache 
> hit.png, Throughput_0 cache hit.png, Throughput_50% cache hit.png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> This is just like HADOOP-17038.
> Right now a positional read seeks to the requested position, reads, and then 
> seeks back to the old position (as per the implementation in the super class).
> HBase-style workloads rely mostly on short preads (64 KB by default), so it 
> would be ideal to support a pure positional read API that does not keep data 
> in a buffer at all and reads only what the caller asks for (no read-ahead of 
> extra data as per the read size config).
> Allow an optional boolean config to be specified while opening a file for 
> read, using which buffered pread can be disabled:
> FutureDataInputStreamBuilder openFile(Path path)
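
The difference between the two positional read styles described in the issue can be sketched on a local file with plain JDK classes. This is only an analogy, not the WASB code: the class name PreadSketch and both method names are hypothetical, and java.io.RandomAccessFile / java.nio.channels.FileChannel stand in for the Hadoop stream. The first method follows the seek, read, seek-back pattern; the second reads at an absolute offset and never touches the stream position.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration class; not part of Hadoop or HBase.
public class PreadSketch {

    /** Seek-based positional read: seek to the target, read, then seek back. */
    public static int seekReadSeekBack(RandomAccessFile in, long position,
                                       byte[] buf, int off, int len) throws IOException {
        long oldPos = in.getFilePointer();   // remember the current stream position
        try {
            in.seek(position);
            return in.read(buf, off, len);
        } finally {
            in.seek(oldPos);                 // restore it for sequential readers
        }
    }

    /** Pure positional read: reads at an absolute offset, position is untouched. */
    public static int purePositionalRead(FileChannel ch, long position,
                                         byte[] buf, int off, int len) throws IOException {
        return ch.read(ByteBuffer.wrap(buf, off, len), position);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("pread-sketch", ".bin");
        try {
            Files.write(tmp, "0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
            byte[] buf = new byte[4];

            try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "r")) {
                int n = seekReadSeekBack(raf, 10, buf, 0, buf.length);
                System.out.println("seek-based: "
                        + new String(buf, 0, n, StandardCharsets.US_ASCII)
                        + " pos=" + raf.getFilePointer());
            }

            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                int n = purePositionalRead(ch, 10, buf, 0, buf.length);
                System.out.println("pure pread: "
                        + new String(buf, 0, n, StandardCharsets.US_ASCII)
                        + " pos=" + ch.position());
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Both calls return the same four bytes and leave the position where it was. The seek-based variant changes the shared position while the read is in progress and then restores it, whereas the pure variant reads only the requested range without moving it, which is the behaviour the issue asks for on short preads.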



