anmolnar commented on PR #7149:
URL: https://github.com/apache/hbase/pull/7149#issuecomment-3255498722

   Let's go through the cases for the different store file tracker 
implementations.
   
   **1. DefaultStoreFileTracker**
   
   ```java
   /**
    * The default implementation for store file tracker, where we do not persist the store file list,
    * and use listing when loading store files.
    */
   @InterfaceAudience.Private
   class DefaultStoreFileTracker extends StoreFileTrackerBase {
   ```
   
   So, in this case the refresh command should always list all HFiles in the 
CF directory and should therefore detect new HFiles automatically. Is that 
correct?
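
   To make the listing case concrete, here is a minimal sketch of a tracker 
that rebuilds its view from a directory listing on every load. It is only an 
illustration: `ListingTrackerSketch` and `load` are hypothetical names, it 
uses plain `java.nio.file` instead of the Hadoop `FileSystem` API, and it is 
not the actual `DefaultStoreFileTracker` code.

   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.List;
   import java.util.stream.Collectors;
   import java.util.stream.Stream;

   // Hypothetical illustration only; not the HBase DefaultStoreFileTracker code.
   class ListingTrackerSketch {
       // Returns the sorted names of all regular files currently in the CF directory.
       // Nothing is persisted, so every call reflects whatever is on disk right now.
       static List<String> load(Path cfDir) throws IOException {
           try (Stream<Path> entries = Files.list(cfDir)) {
               return entries.filter(Files::isRegularFile)
                             .map(p -> p.getFileName().toString())
                             .sorted()
                             .collect(Collectors.toList());
           }
       }
   }
   ```

   Under this model, a file that appears in the directory between two calls 
to `load` shows up in the second result without any extra bookkeeping.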
   
   **2. File based tracker**
   
   ```java
   /**
    * A file based store file tracker.
    * <p/>
    * For this tracking way, the store file list will be persistent into a file, so we can write the
    * new store files directly to the final data directory, as we will not load the broken files. This
    * will greatly reduce the time for flush and compaction on some object storages as a rename is
    * actual a copy on them. And it also avoid listing when loading store file list, which could also
    * speed up the loading of store files as listing is also not a fast operation on most object
    * storages.
    */
   @InterfaceAudience.Private
   class FileBasedStoreFileTracker extends StoreFileTrackerBase {
   ```
   
   I think this is the case that you're talking about. Here the SFT may or 
may not detect new HFiles, depending on whether the SFT file has been updated. 
So if I just copy a new file into the CF directory, it won't be detected, for 
the reasons you mentioned. But if the new HFile was properly added by another 
cluster that uses the same SFT implementation, the SFT file must have been 
updated as well, so our cluster will pick it up.
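
   The difference can be sketched with a tracker that only trusts a persisted 
list. This is an assumption-laden illustration: `ManifestTrackerSketch`, 
`add`, `load` and the `tracked-files.txt` name are all hypothetical, and the 
real `FileBasedStoreFileTracker` on-disk format is not modeled here.

   ```java
   import java.io.IOException;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical illustration only; not the HBase FileBasedStoreFileTracker code.
   class ManifestTrackerSketch {
       // Assumed manifest name, chosen for this sketch.
       static final String MANIFEST = "tracked-files.txt";

       // Returns only the names recorded in the manifest; the directory is never listed.
       static List<String> load(Path cfDir) throws IOException {
           Path manifest = cfDir.resolve(MANIFEST);
           List<String> names = new ArrayList<>();
           if (!Files.exists(manifest)) {
               return names;
           }
           for (String line : Files.readAllLines(manifest)) {
               if (!line.trim().isEmpty()) {
                   names.add(line.trim());
               }
           }
           return names;
       }

       // A "proper" add: write the data file, then record it in the manifest.
       static void add(Path cfDir, String name) throws IOException {
           Files.write(cfDir.resolve(name), new byte[0]);
           List<String> names = load(cfDir);
           names.add(name);
           Files.write(cfDir.resolve(MANIFEST), names);
       }
   }
   ```

   In this sketch a file dropped straight into the directory stays invisible 
to `load`, while a file added through `add` is visible to any reader of the 
same manifest on its next load.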
   
   If all of the above is true, do we need to add any additional logic to the 
command?

