[ https://issues.apache.org/jira/browse/SOLR-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16977555#comment-16977555 ]
Ilan Ginzburg commented on SOLR-13932:
--------------------------------------

After some more discussion and digging into the code, the problem is clearer.

There are two steps in deciding which files should be pulled from Blob and how they should be pulled (into the current index directory, or into a new directory followed by a switch of the index directory):

In *step 1*, the local commit point file list is compared with the Blob file list (from the most recent core.metadata.<suffix> file). If files with identical names but different content are present on both sides (segments.gen is not part of the commit point files), then the pull from Blob should include all files in Blob (optimizations are possible) and be done into a new empty directory. If all files present in both the current commit point and Blob are identical, the missing files can be pulled from Blob into the current index directory.

If step 1 decided to pull into the current index directory, there is another check (*step 2*): if ANY files in the local index directory have the same names as Blob files but different content, then the pull from Blob has to happen into a new directory.

For example, assume the local index is at generation 4 and the local commit point has only generation 4 files, but the local index directory still holds generation 3 files (from a previous commit point). Assume Blob is at generation 6 and the Blob commit point has files at generations 3 and 6. Comparing the commit points shows no conflicts, but comparing the existing local generation 3 files with the Blob generation 3 files shows a difference. A pull from Blob into the current directory would fail (a pull can't overwrite local files, or ongoing access to the index would see a corruption).

Therefore, it is not previous local index commit points that need to be checked against the Blob content, but the current commit point plus the set of files in the local index directory.
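The two checks described above can be sketched roughly as follows. This is only an illustration of the decision logic, not the actual CorePushPull or ServerSideMetadata code: the class name, the method names, the file names, and the choice to represent each file as a name mapped to a content fingerprint (for example size plus checksum) are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of the pull decision; none of these names exist in Solr. */
final class PullPlanSketch {

    enum PullTarget { CURRENT_DIRECTORY, NEW_DIRECTORY }

    /**
     * Returns true if some file name is present in both maps with a different
     * fingerprint. Each map goes from file name to a content fingerprint
     * (an assumed representation, e.g. size plus checksum).
     */
    static boolean hasConflict(Map<String, String> local, Map<String, String> blob) {
        for (Map.Entry<String, String> entry : local.entrySet()) {
            String blobFingerprint = blob.get(entry.getKey());
            if (blobFingerprint != null && !blobFingerprint.equals(entry.getValue())) {
                return true;
            }
        }
        return false;
    }

    static PullTarget decide(Map<String, String> localCommitPoint,
                             Map<String, String> localDirectory,
                             Map<String, String> blobFiles) {
        // Step 1: compare the current local commit point with the Blob file list.
        if (hasConflict(localCommitPoint, blobFiles)) {
            return PullTarget.NEW_DIRECTORY;
        }
        // Step 2: compare every file in the local index directory with the Blob
        // file list, including leftovers from previous commit points.
        if (hasConflict(localDirectory, blobFiles)) {
            return PullTarget.NEW_DIRECTORY;
        }
        return PullTarget.CURRENT_DIRECTORY;
    }

    public static void main(String[] args) {
        // The scenario from the comment, with made-up file names and fingerprints.
        Map<String, String> commitPoint = new HashMap<>();
        commitPoint.put("gen4.dat", "local-gen4");

        Map<String, String> directory = new HashMap<>(commitPoint);
        directory.put("gen3.dat", "local-gen3"); // leftover from a previous commit point

        Map<String, String> blob = new HashMap<>();
        blob.put("gen3.dat", "blob-gen3");
        blob.put("gen6.dat", "blob-gen6");

        System.out.println(decide(commitPoint, directory, blob));
    }
}
```

In this scenario step 1 finds no conflict, because the commit point and Blob share no file names, and only step 2 detects the differing generation 3 file, which is why the check has to cover the whole local index directory.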
> Review directory locking for Blob interactions
> ----------------------------------------------
>
> Key: SOLR-13932
> URL: https://issues.apache.org/jira/browse/SOLR-13932
> Project: Solr
> Issue Type: Sub-task
> Reporter: Ilan Ginzburg
> Priority: Major
>
> Review resolution of local index directory content vs Blob copy.
> The following line has been wrongly understood as acquiring a lock on the index directory.
> {{solrCore.getDirectoryFactory().get(indexDirPath, DirectoryFactory.DirContext.DEFAULT, solrCore.getSolrConfig().indexConfig.lockType);}}
> From Yonik:
> _A couple things about Directory locking.... the locks were only ever to prevent more than one IndexWriter from trying to modify the same index. The IndexWriter grabs a write lock once when it is created and does not release it until it is closed._
> _Directories are not locked on acquisition of the Directory from the DirectoryFactory. See the IndexWriter constructor, where the lock is explicitly grabbed._
> Review CorePushPull#pullUpdateFromBlob, ServerSideMetadata and other classes as relevant.

--
This message was sent by Atlassian Jira (v8.3.4#803005)