guluo2016 commented on code in PR #6439: URL: https://github.com/apache/hbase/pull/6439#discussion_r1835355418
########## hbase-server/src/main/java/org/apache/hadoop/hbase/mob/RSMobFileCleanerChore.java: ########## @@ -116,14 +120,32 @@ protected void chore() { Set<String> regionMobs = new HashSet<String>(); Path currentPath = null; try { - // collectinng referenced MOBs + // collecting referenced MOBs for (HStoreFile sf : sfs) { currentPath = sf.getPath(); - sf.initReader(); - byte[] mobRefData = sf.getMetadataValue(HStoreFile.MOB_FILE_REFS); - byte[] bulkloadMarkerData = sf.getMetadataValue(HStoreFile.BULKLOAD_TASK_KEY); - // close store file to avoid memory leaks - sf.closeStoreFile(true); + byte[] mobRefData = null; + byte[] bulkloadMarkerData = null; + boolean needCreateReader = false; + if (sf.getReader() == null) { + synchronized (sf) { + if (sf.getReader() == null) { + needCreateReader = true; + sf.initReader(); + mobRefData = sf.getMetadataValue(HStoreFile.MOB_FILE_REFS); + bulkloadMarkerData = sf.getMetadataValue(HStoreFile.BULKLOAD_TASK_KEY); + // close store file to avoid memory leaks + sf.closeStoreFile(true); + } + } + } + // If the StoreFileReader object was created by another thread, even if the reader + // has been closed now, we can still obtain the data by HStoreFile.metataMap, + // because the map will not be cleared when the reader is closed. + if (!needCreateReader) { + mobRefData = sf.getMetadataValue(HStoreFile.MOB_FILE_REFS); + bulkloadMarkerData = sf.getMetadataValue(HStoreFile.BULKLOAD_TASK_KEY); + } Review Comment: Indeed, writing code this way will be much more concise. But at the same time, I think performance also needs to be considered. When the reader does not need to be created by itself, We do not need to perform read operations in the synchronization code block because locking may cause other threads to wait. In this case, exiting the synchronization code block early may reduce the waiting time for other threads. (If there are other threads waiting to obtain this lock) Therefore, when executing the following code, I think it needs to be divided into two cases: ```java if (sf.getReader() == null) { synchronized (sf) { boolean needCreateReader = sf.getReader() == null; // Case 1: needCreateReader = true, create reader, and do something // Case 2: needCreateReader = false, exit the synchronization code block to reduce performance overhead } } ``` What do you think? I look forward to your response. Thanks a lot! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org