[ https://issues.apache.org/jira/browse/HBASE-29135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17938339#comment-17938339 ]

Charles Connell edited comment on HBASE-29135 at 3/25/25 8:15 PM:
------------------------------------------------------------------

Running this in production at my company, we've seen a dramatic decrease in GC 
activity. The attached charts describe the garbage collection behavior of a 
single affected RegionServer, which uses ZGC. This patch was deployed on 3/19. 

 !Screenshot 2025-03-25 at 3.59.38 PM.png! 


was (Author: charlesconnell):
Running this in production at my company, we've seen a dramatic decrease in GC 
activity:

 !Screenshot 2025-03-25 at 3.59.38 PM.png! 

> ZStandard decompression can operate directly on ByteBuffs
> ---------------------------------------------------------
>
>                 Key: HBASE-29135
>                 URL: https://issues.apache.org/jira/browse/HBASE-29135
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Charles Connell
>            Assignee: Charles Connell
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 3.0.0-beta-2, 2.6.3, 2.5.12
>
>         Attachments: Screenshot 2025-03-25 at 3.59.38 PM.png, 
> create-decompression-stream-zstd.html
>
>
> I've been thinking about ways to improve HBase's performance when reading 
> HFiles, and I believe there is significant opportunity. I look at many 
> RegionServer profile flamegraphs of my company's servers. A pattern that I've 
> discovered is that object allocation in a very hot code path is a performance 
> killer. The HFile decoding code makes some effort to avoid this, but it isn't 
> totally successful.
>
> Each time a block is decoded in {{HFileBlockDefaultDecodingContext}}, a new 
> {{DecompressorStream}} is allocated and used. This is a lot of allocation, 
> and the use of the streaming pattern requires copying every byte to be 
> decompressed more times than necessary. Each byte is copied from a 
> {{ByteBuff}} into a {{byte[]}}, then decompressed, then copied back to a 
> {{ByteBuff}}. For decompressors like 
> {{org.apache.hadoop.hbase.io.compress.zstd.ZstdDecompressor}} that only 
> operate on direct memory, two additional copies are introduced to move from a 
> {{byte[]}} to a direct NIO {{ByteBuffer}}, then back to a {{byte[]}}.
>
> Aside from the copy inherent in the decompression algorithm itself, from a 
> compressed buffer into an uncompressed buffer, all of these other copies can 
> be avoided without sacrificing functionality. Along the way, we'll also avoid 
> allocating objects.
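> As a sketch of the idea (not the actual patch), the snippet below decompresses 
> one direct NIO {{ByteBuffer}} straight into another using zstd-jni's 
> {{ZstdDecompressCtx}}, the library underlying HBase's zstd codec. The class 
> name, sample data, and buffer sizes here are made up for illustration:
> {code:java}
> import java.nio.ByteBuffer;
> import java.nio.charset.StandardCharsets;
> import com.github.luben.zstd.Zstd;
> import com.github.luben.zstd.ZstdDecompressCtx;
>
> public class DirectZstdSketch {
>   public static void main(String[] args) {
>     // Setup only: compress some sample bytes and stage them in a direct buffer.
>     byte[] plain = "hello hello hello hello".getBytes(StandardCharsets.UTF_8);
>     byte[] compressed = Zstd.compress(plain);
>     ByteBuffer src = ByteBuffer.allocateDirect(compressed.length);
>     src.put(compressed).flip();
>
>     ByteBuffer dst = ByteBuffer.allocateDirect(plain.length);
>
>     // The interesting part: decompress direct-to-direct. No byte[] staging
>     // copies and no per-block DecompressorStream allocation are needed.
>     try (ZstdDecompressCtx ctx = new ZstdDecompressCtx()) {
>       int written = ctx.decompress(dst, src);
>       dst.flip();
>       assert written == plain.length;
>     }
>   }
> }
> {code}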



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
