Wellington Chevreuil created HBASE-28923:
--------------------------------------------

             Summary: Prioritize "orphan" blocks for eviction inside 
BucketCache.freeSpace
                 Key: HBASE-28923
                 URL: https://issues.apache.org/jira/browse/HBASE-28923
             Project: HBase
          Issue Type: Improvement
            Reporter: Wellington Chevreuil
            Assignee: Wellington Chevreuil


The Persistent Cache feature brought the ability to recover the cache in the 
event of a restart or crash. Under certain conditions, the cache recovery can 
leave _orphan_ blocks lingering in the cache, causing unnecessary extra cache 
usage.

For example, when a region server crashes or restarts, the regions it was 
hosting are immediately reassigned to the remaining servers. Once the 
crashed/restarted server comes back online, the persistent cache recovers the 
cache state from before the crash/restart, which contains the blocks from the 
regions the server was holding prior to the incident. This can lead to the 
_orphan_ blocks situation described above under the following conditions:
 * If the balancer is off, or the cache-aware balancer fails to move back the 
very same regions the server held before the crash;
 * If compaction completes for the regions on the temporary servers, so that 
the recovered blocks belong to files that have since been replaced.

This issue proposes additional logic inside the BucketCache.freeSpace method 
to identify blocks that do not belong to any file of the currently online 
regions.

This would require modifying both BlockCacheFactory and BucketCache to pass 
along the map of online regions kept by HRegionServer.

Inside the BucketCache.freeSpace method, when iterating through the backingMap 
and before separating the entries into the different eviction priority groups, 
we can check whether the given entry belongs to a file of any online region, 
and use the BucketCache.evictBucketEntryIfNoRpcReferenced method to remove it 
if its file is not found in any of the online regions.
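
The check described above could look roughly like the following sketch. It is 
not the actual BucketCache code: OrphanBlockSketch, CacheKey and findOrphans 
are hypothetical, simplified stand-ins, and the online regions are assumed to 
be available as a map from region name to the names of its store files. The 
sketch only collects the orphan keys, where the real change would evict them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-ins; none of these are the real HBase classes.
final class OrphanBlockSketch {

  /** Minimal stand-in for a cache key: HFile name plus block offset. */
  static final class CacheKey {
    final String hfileName;
    final long offset;

    CacheKey(String hfileName, long offset) {
      this.hfileName = hfileName;
      this.offset = offset;
    }
  }

  /**
   * Returns the cached keys whose HFile is not a store file of any online region.
   * onlineRegionFiles maps a region name to the names of its current store files
   * (an assumed shape, chosen for illustration).
   */
  static List<CacheKey> findOrphans(Iterable<CacheKey> cachedKeys,
      Map<String, Set<String>> onlineRegionFiles) {
    // Flatten the per-region file sets once, so each key costs a single lookup.
    Set<String> liveFiles = new HashSet<>();
    for (Set<String> files : onlineRegionFiles.values()) {
      liveFiles.addAll(files);
    }
    List<CacheKey> orphans = new ArrayList<>();
    for (CacheKey key : cachedKeys) {
      if (!liveFiles.contains(key.hfileName)) {
        // The proposed change would evict the entry at this point.
        orphans.add(key);
      }
    }
    return orphans;
  }
}
```

Flattening the file names into a single set up front keeps the per-entry cost 
to one hash lookup, which matters because freeSpace walks the whole backingMap.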

An additional configurable grace period should be added, so that only blocks 
cached before this grace period are considered potential orphans. This avoids 
evicting blocks of files currently being written by flushes/compactions when 
“caching on write/caching on compaction” is enabled.
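
As a minimal sketch of the grace period rule, assuming each cache entry records 
the time it was cached in milliseconds: the class, method and parameter names 
below are hypothetical and are not taken from the HBase code base.

```java
// Hypothetical helper; the names and the millisecond clock are assumptions.
final class OrphanGracePeriodSketch {

  /**
   * A block is treated as an evictable orphan only if its file does not belong
   * to any online region AND it was cached more than gracePeriodMs before nowMs.
   */
  static boolean isEvictableOrphan(boolean fileIsOnline, long cachedTimeMs,
      long nowMs, long gracePeriodMs) {
    if (fileIsOnline) {
      return false; // the file is still served by an online region
    }
    // Recently cached blocks may belong to a file that a flush or compaction
    // is still writing, so they are skipped until the grace period has passed.
    return nowMs - cachedTimeMs > gracePeriodMs;
  }
}
```

With a grace period of 5000 ms and the current time at 10000 ms, a block of an 
unknown file cached at time 0 would be evictable, while one cached at 8000 ms 
would be kept for now.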



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
