[
https://issues.apache.org/jira/browse/HBASE-29862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055386#comment-18055386
]
guluo commented on HBASE-29862:
-------------------------------
I may have found the reason for the test failure.
The BucketCache relies on BucketCache.WriterThread to write blocks into the
cache asynchronously.
According to the code, a possible scenario: the blockNumber has already been
incremented, but the actual block has not yet been written into the BucketCache
(since the WriterThread still needs to consume the queue to perform the actual
write).
  BlockingQueue<RAMQueueEntry> bq = writerQueues.get(queueNum);
  successfulAddition = bq.offer(re);
  if (!successfulAddition) {
    LOG.debug("Failed to insert block {} into the cache writers queue", cacheKey);
    ramCache.remove(cacheKey);
    cacheStats.failInsert();
  } else {
    // Incremented here, but not yet written to BucketCache
    this.blockNumber.increment();
    this.heapSize.add(cachedItem.heapSize());
  }
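The ordering above can be reproduced in isolation. The following is a minimal sketch, not the HBase implementation: the class AsyncCacheRaceSketch and its methods are hypothetical stand-ins, with a plain queue playing the role of writerQueues and a map playing the role of the structure that the key lookup reads. It only illustrates that the counter can report a block before the lookup can find it.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, simplified model of an asynchronously written cache.
public class AsyncCacheRaceSketch {
  // Stand-in for the writer queue: blocks wait here until a writer drains them.
  private final BlockingQueue<String> writerQueue = new LinkedBlockingQueue<>();
  // Stand-in for the structure that key lookups read.
  private final Map<String, byte[]> written = new ConcurrentHashMap<>();
  private final LongAdder blockNumber = new LongAdder();

  // Mirrors the ordering in the snippet: the counter is bumped on enqueue.
  public void cacheBlock(String key) {
    if (writerQueue.offer(key)) {
      blockNumber.increment();
    }
  }

  // What a writer thread would do later: move queued keys into the map.
  public void drain() {
    String key;
    while ((key = writerQueue.poll()) != null) {
      written.put(key, new byte[0]);
    }
  }

  public long getBlockCount() {
    return blockNumber.sum();
  }

  // Counts written keys that belong to the given file (keys are "file_offset").
  public int countKeysForFile(String hfileName) {
    int count = 0;
    for (String key : written.keySet()) {
      if (key.startsWith(hfileName + "_")) {
        count++;
      }
    }
    return count;
  }

  public static void main(String[] args) {
    AsyncCacheRaceSketch cache = new AsyncCacheRaceSketch();
    cache.cacheBlock("fileA_0");
    // Before the writer runs: counted, but not findable.
    System.out.println("count=" + cache.getBlockCount()
        + " found=" + cache.countKeysForFile("fileA"));
    cache.drain();
    // After the writer runs: counted and findable.
    System.out.println("count=" + cache.getBlockCount()
        + " found=" + cache.countKeysForFile("fileA"));
  }
}
```

In this model the writer step is invoked explicitly so the window between enqueue and write is deterministic; in a real cache that window depends on thread scheduling.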
So if clearRegionBlockCache(rs1) is executed immediately after a scan, the
block may still be sitting in writerQueues, not yet processed by the
WriterThread.
In this case, getAllCacheKeysForFile will fail to find the block, causing the
clear cache operation to terminate early. The log is shown below:
{code:java}
2026-01-30T17:03:43,454 DEBUG [Time-limited test] bucket.BucketCache(1937): found 0 blocks for file 983b4b768b57459ea3804e9b3d0353b2, starting offset: 0, end offset: 9223372036854775807
{code}
  public int evictBlocksRangeByHfileName(String hfileName, long initOffset, long endOffset) {
    Set<BlockCacheKey> keySet = getAllCacheKeysForFile(hfileName, initOffset, endOffset);
    LOG.debug("found {} blocks for file {}, starting offset: {}, end offset: {}",
      keySet.size(), hfileName, initOffset, endOffset);
    int numEvicted = 0;
    for (BlockCacheKey key : keySet) {
      if (evictBlock(key)) {
        ++numEvicted;
      }
    }
    return numEvicted;
  }
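Since the eviction loop only sees keys that have already been written, one possible way for a test to avoid the window is to wait until the pending writes are done before clearing the cache. The following is only a sketch under that assumption, not a proposed patch: WaitForDrainSketch and waitFor are hypothetical names, and the condition passed in (here, a queue becoming empty) is a stand-in for whatever signal the real cache exposes.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.BooleanSupplier;

// Hypothetical polling helper: wait for a condition with a timeout.
public class WaitForDrainSketch {

  // Returns true once the condition holds, false on timeout or interrupt.
  public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!condition.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt flag and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    BlockingQueue<String> pending = new LinkedBlockingQueue<>();
    pending.offer("fileA_0");

    // Simulated writer that consumes the queue a little later.
    Thread writer = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      pending.clear();
    });
    writer.setDaemon(true);
    writer.start();

    // Only proceed to the "clear cache" step once nothing is pending.
    boolean drained = waitFor(5000, 10, pending::isEmpty);
    System.out.println("drained=" + drained);
  }
}
```

Note that an empty queue alone may not mean the write has completed, because a writer can remove an entry before it finishes writing it; a real check would need a condition that reflects the completed write.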
> Test case TestClearRegionBlockCache#testClearBlockCache failed
> --------------------------------------------------------------
>
> Key: HBASE-29862
> URL: https://issues.apache.org/jira/browse/HBASE-29862
> Project: HBase
> Issue Type: Bug
> Components: BlockCache, test
> Reporter: guluo
> Priority: Major
> Attachments: image-2026-01-30-00-14-28-763.png
>
>
> When reviewing the PR [https://github.com/apache/hbase/pull/7684], I noticed
> that test case TestClearRegionBlockCache#testClearBlockCache failed. (For
> details, see:
> https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-7684/1/testReport/org.apache.hadoop.hbase.regionserver/TestClearRegionBlockCache/precommit_checks___yetus_jdk17_hadoop3_checks___testClearBlockCache_1__bucket_/)
> I tested it on both the branch-2.5 and master branches, and it fails there as
> well.
>
> mvn clean test
> -Dtest=org.apache.hadoop.hbase.regionserver.TestClearRegionBlockCache#testClearBlockCache
> -pl hbase-server -am
>
> !image-2026-01-30-00-14-28-763.png!
>
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)