[ https://issues.apache.org/jira/browse/HBASE-28923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wellington Chevreuil updated HBASE-28923:
-----------------------------------------
    Description: 
The Persistent Cache feature brought the ability to recover the cache in the event of a restart or crash. Under certain conditions, the cache recovery can lead to _orphan_ blocks lingering in the cache, causing unnecessary extra cache usage.

For example, when a region server crashes or restarts, the regions it hosted are immediately reassigned to the remaining servers. Once the crashed/restarted server comes back online, persistent cache will recover the cache state prior to the crash/restart, which would contain the blocks from the regions it was holding prior to the incident.

This can lead to the _orphan_ blocks situation described above in the following conditions:
 * If balancer is off, or cache aware balancer fails to move back the very same regions the server held prior to the crash;
 * If compaction completes for the regions on the temporary servers;

Also, since evictsOnClose is set to false by default, any region move would leave "orphan" blocks behind.

This proposes additional logic inside the BucketCache.freeSpace method for identifying blocks that do not belong to any file of the currently online regions.

This requires modifying both BlockCacheFactory and BucketCache to pass along the map of online regions kept by HRegionServer.

Inside the BucketCache.freeSpace method, when iterating through the backingMap and before separating the entries between the different eviction priority groups, we can check whether the given entry belongs to a block from any of the online regions' files, using the BucketCache.evictBucketEntryIfNoRpcReferenced method to remove it if its file is not found in any of the online regions.

An additional configurable grace period should be added, to consider only blocks cached before this grace period as potential orphans. This is to avoid evicting blocks of files currently being written by flushes/compactions, when “caching on write/caching on compaction” is enabled.

  was:
The Persistent Cache feature brought the ability to recover the cache in the event of a restart or crash. Under certain conditions, the cache recovery can lead to _orphan_ blocks lingering in the cache, causing unnecessary extra cache usage.

For example, when a region server crashes or restarts, the regions it hosted are immediately reassigned to the remaining servers. Once the crashed/restarted server comes back online, persistent cache will recover the cache state prior to the crash/restart, which would contain the blocks from the regions it was holding prior to the incident.

This can lead to the _orphan_ blocks situation described above in the following conditions:
 * If balancer is off, or cache aware balancer fails to move back the very same regions the server held prior to the crash;
 * If compaction completes for the regions on the temporary servers;

This proposes additional logic inside the BucketCache.freeSpace method for identifying blocks that do not belong to any file of the currently online regions.

This requires modifying both BlockCacheFactory and BucketCache to pass along the map of online regions kept by HRegionServer.

Inside the BucketCache.freeSpace method, when iterating through the backingMap and before separating the entries between the different eviction priority groups, we can check whether the given entry belongs to a block from any of the online regions' files, using the BucketCache.evictBucketEntryIfNoRpcReferenced method to remove it if its file is not found in any of the online regions.

An additional configurable grace period should be added, to consider only blocks cached before this grace period as potential orphans. This is to avoid evicting blocks of files currently being written by flushes/compactions, when “caching on write/caching on compaction” is enabled.
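
The orphan check and grace period described above could look roughly like the following. This is only a minimal, self-contained sketch and does not use the actual HBase classes: OrphanBlockSketch, CachedBlock and findOrphans are hypothetical names, the set of file names stands in for the map of online regions kept by HRegionServer, and a real change would operate on the backingMap entries and call BucketCache.evictBucketEntryIfNoRpcReferenced instead of returning a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of the proposed orphan check (not HBase code). */
public class OrphanBlockSketch {

  /** Stand-in for a backingMap entry: owning file, offset, and time of caching. */
  public static final class CachedBlock {
    public final String hfileName;
    public final long offset;
    public final long cachedTimeMillis;

    public CachedBlock(String hfileName, long offset, long cachedTimeMillis) {
      this.hfileName = hfileName;
      this.offset = offset;
      this.cachedTimeMillis = cachedTimeMillis;
    }
  }

  /**
   * Returns the blocks whose file belongs to no online region and that were
   * cached longer ago than the grace period. Blocks cached within the grace
   * period are skipped, since their file may still be being written by a
   * flush or compaction.
   */
  public static List<CachedBlock> findOrphans(List<CachedBlock> blocks,
      Set<String> onlineRegionFiles, long nowMillis, long gracePeriodMillis) {
    List<CachedBlock> orphans = new ArrayList<>();
    for (CachedBlock block : blocks) {
      boolean fileIsOnline = onlineRegionFiles.contains(block.hfileName);
      boolean olderThanGrace = nowMillis - block.cachedTimeMillis > gracePeriodMillis;
      if (!fileIsOnline && olderThanGrace) {
        orphans.add(block);
      }
    }
    return orphans;
  }

  public static void main(String[] args) {
    List<CachedBlock> blocks = Arrays.asList(
        new CachedBlock("hfile-a", 0L, 1_000L),   // file of an online region
        new CachedBlock("hfile-b", 0L, 1_000L),   // no online region, cached long ago
        new CachedBlock("hfile-c", 0L, 9_500L));  // no online region yet, cached recently
    Set<String> onlineRegionFiles = new HashSet<>(Arrays.asList("hfile-a"));

    // now = 10s, grace period = 1s: only hfile-b qualifies as an orphan.
    for (CachedBlock b : findOrphans(blocks, onlineRegionFiles, 10_000L, 1_000L)) {
      System.out.println("Orphan candidate: " + b.hfileName + "@" + b.offset);
    }
  }
}
```

In this sketch the grace period is compared against the time the block was cached, so hfile-c is left alone even though no online region references it yet.
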
> Prioritize "orphan" blocks for eviction inside BucketCache.freespace
> --------------------------------------------------------------------
>
>                 Key: HBASE-28923
>                 URL: https://issues.apache.org/jira/browse/HBASE-28923
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Major