[ https://issues.apache.org/jira/browse/HBASE-28923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HBASE-28923:
-----------------------------------
    Labels: pull-request-available  (was: )

> Prioritize "orphan" blocks for eviction inside BucketCache.freespace
> --------------------------------------------------------------------
>
>                 Key: HBASE-28923
>                 URL: https://issues.apache.org/jira/browse/HBASE-28923
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major
>              Labels: pull-request-available
>
> The Persistent Cache feature brought the ability to recover the cache in the
> event of a restart or crash. Under certain conditions, the cache recovery can
> leave _orphan_ blocks hanging in the cache, causing unnecessary extra cache
> usage.
> For example, when a region server crashes or restarts, the regions it was
> hosting are immediately reassigned to the remaining servers. Once the
> crashed/restarted server comes back online, the persistent cache recovers
> the cache state from before the crash/restart, which contains the blocks
> from the regions the server was holding prior to the incident.
> This can lead to the _orphan_ blocks situation described above under the
> following conditions:
> * If the balancer is off, or the cache-aware balancer fails to move back the
> very same regions the server held before the crash;
> * If compaction completes for the regions on the temporary servers.
> Also, since evictsOnClose defaults to false, any region move would leave
> "orphan" blocks behind.
> This issue proposes additional logic inside the BucketCache.freeSpace method
> for identifying blocks that do not belong to any file of the currently
> online regions.
> This would require modifying both BlockCacheFactory and BucketCache to pass
> along the map of online regions kept by HRegionServer.
> Inside the BucketCache.freeSpace method, when iterating through the
> backingMap and before separating the entries into the different eviction
> priority groups, we can check whether the given entry belongs to a block
> from any of the online regions' files, using the
> BucketCache.evictBucketEntryIfNoRpcReferenced method to remove it if its
> file is not found in any of the online regions.
> An additional configurable grace period should be added, so that only
> blocks cached before this grace period are considered potential orphans.
> This is to avoid evicting blocks of files that are currently being written
> by flushes/compactions when “caching on write/caching on compaction” is
> enabled.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
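
The orphan check described in the issue could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual patch: OrphanBlockSketch, CachedBlock, isOrphan and selectOrphans are hypothetical names, the set of online file names stands in for the map of online regions kept by HRegionServer, and the real code would call BucketCache.evictBucketEntryIfNoRpcReferenced instead of collecting candidates into a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-in for the proposed logic; none of these
// types are real HBase classes.
final class OrphanBlockSketch {

  /** Hypothetical view of a backingMap entry: owning file plus cache time. */
  static final class CachedBlock {
    final String hfileName;
    final long offset;
    final long cachedTimeMs;

    CachedBlock(String hfileName, long offset, long cachedTimeMs) {
      this.hfileName = hfileName;
      this.offset = offset;
      this.cachedTimeMs = cachedTimeMs;
    }
  }

  /**
   * A block is treated as an orphan only if it was cached longer ago than the
   * grace period AND its file is not among the files of any online region.
   * The grace period protects blocks of files still being written by
   * flushes/compactions (cache-on-write / cache-on-compaction).
   */
  static boolean isOrphan(CachedBlock block, Set<String> onlineRegionFiles,
      long nowMs, long gracePeriodMs) {
    boolean olderThanGracePeriod = nowMs - block.cachedTimeMs > gracePeriodMs;
    return olderThanGracePeriod && !onlineRegionFiles.contains(block.hfileName);
  }

  /**
   * Walks the cached entries (as freeSpace would walk the backingMap) and
   * returns the orphan candidates, to be evicted before the remaining entries
   * are split into eviction priority groups.
   */
  static List<CachedBlock> selectOrphans(Iterable<CachedBlock> entries,
      Set<String> onlineRegionFiles, long nowMs, long gracePeriodMs) {
    List<CachedBlock> orphans = new ArrayList<>();
    for (CachedBlock block : entries) {
      if (isOrphan(block, onlineRegionFiles, nowMs, gracePeriodMs)) {
        orphans.add(block);
      }
    }
    return orphans;
  }
}
```

With a grace period of 10 seconds, a block cached 99 seconds ago whose file is no longer served by any online region would be selected, while a block cached 5 seconds ago for the same file would be left alone until the grace period has elapsed.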