[ https://issues.apache.org/jira/browse/HBASE-28890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wellington Chevreuil updated HBASE-28890:
-----------------------------------------
    Affects Version/s: 2.5.10
                       2.6.1
                           (was: 2.6.0)

> RefCnt Leak error when caching index blocks at write time
> ---------------------------------------------------------
>
>                 Key: HBASE-28890
>                 URL: https://issues.apache.org/jira/browse/HBASE-28890
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0-beta-1, 2.7.0, 2.6.1, 2.5.10
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.0.0, 2.7.0, 2.6.2
>
>
> Following [~bbeaudreault]'s work in HBASE-27170, which added the (very useful) 
> refcount leak detector, we sometimes see these reports on some branch-2 based 
> deployments:
> {noformat}
> 2024-09-25 10:06:42,413 ERROR org.apache.hbase.thirdparty.io.netty.util.ResourceLeakDetector: LEAK: RefCnt.release() was not called before it's garbage-collected. See https://netty.io/wiki/reference-counted-objects.html for more information.
> Recent access records:
> Created at:
>         org.apache.hadoop.hbase.nio.RefCnt.<init>(RefCnt.java:59)
>         org.apache.hadoop.hbase.nio.RefCnt.create(RefCnt.java:54)
>         org.apache.hadoop.hbase.nio.ByteBuff.wrap(ByteBuff.java:550)
>         org.apache.hadoop.hbase.io.ByteBuffAllocator.allocate(ByteBuffAllocator.java:357)
>         org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.cloneUncompressedBufferWithHeader(HFileBlock.java:1153)
>         org.apache.hadoop.hbase.io.hfile.HFileBlock$Writer.getBlockForCaching(HFileBlock.java:1215)
>         org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.lambda$writeIndexBlocks$0(HFileBlockIndex.java:997)
>         java.base/java.util.Optional.ifPresent(Optional.java:178)
>         org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexWriter.writeIndexBlocks(HFileBlockIndex.java:996)
>         org.apache.hadoop.hbase.io.hfile.HFileWriterImpl.close(HFileWriterImpl.java:635)
>         org.apache.hadoop.hbase.regionserver.StoreFileWriter.close(StoreFileWriter.java:378)
>         org.apache.hadoop.hbase.regionserver.StoreFlusher.finalizeWriter(StoreFlusher.java:69)
>         org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher.flushSnapshot(DefaultStoreFlusher.java:74)
>         org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:831)
>         org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.flushCache(HStore.java:2033)
>         org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2878)
>         org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2620)
>         org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2592)
>         org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2462)
>         org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:602)
>         org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:572)
>         org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$1000(MemStoreFlusher.java:65)
>         org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:344)
> {noformat}
> It turns out that we always convert the block to an "on-heap" one inside 
> LruBlockCache.cacheBlock. So when the index block is a SharedMemHFileBlock, 
> the blockForCaching instance in the code 
> [here|https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlockIndex.java#L1076]
> becomes eligible for GC without releasing its buffers or decrementing its 
> refcount (a leak), right after we return from the 
> BlockIndexWriter.writeIndexBlocks call.
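
The pattern described above can be illustrated with a small, self-contained sketch. It does not use HBase classes: PooledBlock, OnHeapCache and both helper methods are hypothetical stand-ins for the ref-counted block, the cache that keeps its own on-heap copy, and the write-time caching call. It only shows why the caller's reference must be released once the cache has made its copy, and one plausible way to do that (a try/finally around the cache call); it is not necessarily the change made in the HBASE-28890 patch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class IndexBlockCachingSketch {

  /** Hypothetical stand-in for a ref-counted, pooled block (not the HBase class). */
  static final class PooledBlock {
    private final AtomicInteger refCnt = new AtomicInteger(1); // creator holds one reference
    private final byte[] data;

    PooledBlock(byte[] data) {
      this.data = data;
    }

    byte[] data() {
      return data;
    }

    int refCnt() {
      return refCnt.get();
    }

    /** Drops one reference; returns true when the count reaches zero. */
    boolean release() {
      int remaining = refCnt.decrementAndGet();
      if (remaining < 0) {
        throw new IllegalStateException("released more times than retained");
      }
      return remaining == 0;
    }
  }

  /** Hypothetical cache that always stores an independent on-heap copy. */
  static final class OnHeapCache {
    private final Map<String, byte[]> blocks = new HashMap<>();

    void cacheBlock(String key, PooledBlock block) {
      // The cache copies the bytes and never touches the caller's refcount.
      blocks.put(key, block.data().clone());
    }

    byte[] get(String key) {
      return blocks.get(key);
    }
  }

  /** Leaky variant: the block goes out of scope while its count is still 1. */
  static PooledBlock cacheWithoutRelease(OnHeapCache cache, String key, byte[] payload) {
    PooledBlock block = new PooledBlock(payload);
    cache.cacheBlock(key, block);
    return block; // returned only so the demo can inspect the count
  }

  /** Safe variant: release the caller's reference once the cache has its copy. */
  static PooledBlock cacheAndRelease(OnHeapCache cache, String key, byte[] payload) {
    PooledBlock block = new PooledBlock(payload);
    try {
      cache.cacheBlock(key, block);
    } finally {
      block.release();
    }
    return block; // returned only so the demo can inspect the count
  }

  public static void main(String[] args) {
    OnHeapCache cache = new OnHeapCache();
    PooledBlock leaked = cacheWithoutRelease(cache, "index-0", new byte[] {1, 2, 3});
    PooledBlock released = cacheAndRelease(cache, "index-1", new byte[] {4, 5, 6});
    System.out.println("refCnt without release: " + leaked.refCnt());
    System.out.println("refCnt with release: " + released.refCnt());
  }
}
```

In the leaky variant the count is still 1 when the last reference to the block is dropped, which is the condition a leak detector reports at garbage-collection time. In the safe variant the count is 0 and the cached copy remains readable.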



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
