Lupinus commented on issue #41280: URL: https://github.com/apache/doris/issues/41280#issuecomment-2449893542
The main idea is to scan the disk once and then scan _files.

In FileCacheStorage, add a virtual function unordered_set checkConsistency(BlockFileCache* _mgr, lambda handler); where handler is a lambda that records each inconsistent AccessKeyAndOffset entry. An entry is inconsistent if its AccessKeyAndOffset exists in only one of BlockFileCache and FSFileCacheStorage.

The FSFileCacheStorage implementation of checkConsistency iterates over the file block directory entries under _cache_base_path and checks whether each one exists in _files of BlockFileCache and whether the sizes match. If an entry does not exist in _files, the handler is called; if it exists, it is recorded in an unordered_set, which becomes the return value.

In BlockFileCache, add a function checkConsistency with two parts. The first part calls _storage's checkConsistency and takes its return value (an unordered_set of the AccessKeyAndOffset entries already found during the disk scan). The second part iterates over _files and calls the handler for any item that is not in the unordered_set. The function then returns all inconsistent items recorded by the handler.

For the API, add two operations to FileCacheAction. The first takes a path and checks the consistency of that path, which essentially calls BlockFileCache's checkConsistency. The second returns all paths, to make the first operation easier to use.

Any suggestions on function naming? I find it a frustrating problem.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
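A minimal sketch of the two-pass check described in the comment above, not the actual Doris code. Everything here is a hypothetical stand-in: KeyAndOffset plays the role of AccessKeyAndOffset, the on_disk map stands in for the directory scan under _cache_base_path, and in_memory stands in for _files. How a size mismatch should be reported is not spelled out in the proposal; this sketch assumes it is passed to the handler and still marked as seen, so the second pass does not report it again.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for AccessKeyAndOffset: cache key plus block offset.
struct KeyAndOffset {
    std::string key;
    std::size_t offset;
    bool operator==(const KeyAndOffset& o) const {
        return key == o.key && offset == o.offset;
    }
};

struct KeyAndOffsetHash {
    std::size_t operator()(const KeyAndOffset& k) const {
        return std::hash<std::string>()(k.key) ^
               (std::hash<std::size_t>()(k.offset) << 1);
    }
};

// Maps a block to its size in bytes.
using BlockMap = std::unordered_map<KeyAndOffset, std::size_t, KeyAndOffsetHash>;
using BlockSet = std::unordered_set<KeyAndOffset, KeyAndOffsetHash>;
using Handler = std::function<void(const KeyAndOffset&)>;

// Pass 1 (storage side): walk the blocks found on disk and look each one up
// in the in-memory index. Returns the blocks that exist on both sides.
BlockSet check_storage(const BlockMap& on_disk, const BlockMap& in_memory,
                       const Handler& handler) {
    assert(handler);  // an empty std::function would throw when called
    BlockSet seen;
    for (const auto& entry : on_disk) {
        auto it = in_memory.find(entry.first);
        if (it == in_memory.end()) {
            handler(entry.first);  // on disk only
            continue;
        }
        if (it->second != entry.second) {
            handler(entry.first);  // assumption: size mismatch is reported here
        }
        seen.insert(entry.first);
    }
    return seen;
}

// Pass 2 (cache side): anything in memory that pass 1 never saw is missing
// from disk. Returns every inconsistent block recorded by the handler.
std::vector<KeyAndOffset> check_cache(const BlockMap& on_disk,
                                      const BlockMap& in_memory) {
    std::vector<KeyAndOffset> inconsistent;
    Handler handler = [&inconsistent](const KeyAndOffset& block) {
        inconsistent.push_back(block);
    };
    BlockSet seen = check_storage(on_disk, in_memory, handler);
    for (const auto& entry : in_memory) {
        if (seen.count(entry.first) == 0) {
            handler(entry.first);  // in memory only
        }
    }
    return inconsistent;
}
```

Passing the handler into the storage-side scan keeps the reporting policy in one place, so both passes record inconsistencies the same way and the caller decides what to do with them.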
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org