[ https://issues.apache.org/jira/browse/SOLR-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16981210#comment-16981210 ]
Ilan Ginzburg commented on SOLR-13932: -------------------------------------- The Blob pull code does not depend on directory locking. Locks are only indirectly manipulated in CorePushPull.pullUpdateFromBlob() when an IndexWriter is closed before adding files to the index directory then reopening it, and when the index is switching directories a new index writer is created (by the aptly named UpdateHandler.newIndexWriter()). The Blob protects pulls by taking a hash of the local directory files and making sure nothing has changed from the time the set of files to pull is computed until the pull has completed and the new files are made to be used locally (IndexWriter reopened). This is most likely not even needed anymore (it was needed in a previous use of the Solr/Blob interaction code in a different context) because pulls block ongoing indexing batches (and associated pushes) by an explicit read/write locking strategy (pulls acquire the WRITE lock, indexing acquires the READ lock for the replica). Directory hash is taken in the {{ServerSideMetadata}} constructor (during update processor finish). The hash summarizes the presence (file name and size - likely should add timestamp since it does not change on a local directory unless file is modified) of all files in the directory except {{*.lock}} files. In case of pulls, the {{ServerSideMetadata}} constructor is called by the async pull threads doing their work (in {{CorePullTask.pullCoreFromBlob()}}) and for sync pulls in {{DistributedZkUpdateProcessor.readFromSharedStoreIfNecessary()}}. That computed hash is later used ({{CorePushPull.pullUpdateFromBlob()}}) to check the directory hasn't changed between the time the set of files to download has been computed and downloaded to a temp directory, and before they are incorporated into the local index. There is a slight theoretical race condition here: if all index files are downloaded from the Blob store and the index is then changing to a new local directory, the {{IndexWriter}} is not closed before checking no local files have changed, but is simply opened again into the new directory ({{UpdateHandler.newIndexWriter()}}). There could in theory be a change of the local directory between the check nothing has changed and the immediately following call to {{newIndexWriter()}}. But because pulls prevent concurrent indexing (pulls acquire a write lock and indexing acquire a read lock), we should be safe here. Note there is some duplication in Blob pulling code between {{BlobStoreUtils.syncLocalCoreWithSharedStore()}} and {{CorePullTask.pullCoreFromBlob()}}. Likely some refactoring needed. Pushes do not use the local directory hash check. This is in order to support concurrent indexing on a replica (as Lucene/Solr does by default) and allow the local directory to change due to a concurrent indexing batch even when a previous batch is pushing data out to the Blob store (otherwise we'd have to serialize the whole section indexing + push to blob, slowing indexing further than is currently done due to forcing segment writes before pushing to Blob). After a local update, the push to Blob store will use the latest commit point and push the corresponding files ({{CorePusher.pushCoreToBlob()}} building {{ServerSideMetadata}}). > Review directory locking and Blob interactions > ---------------------------------------------- > > Key: SOLR-13932 > URL: https://issues.apache.org/jira/browse/SOLR-13932 > Project: Solr > Issue Type: Sub-task > Reporter: Ilan Ginzburg > Priority: Major > > Review resolution of local index directory content vs Blob copy. > There has been wrong understanding of following line acquiring a lock on > index directory. > {{solrCore.getDirectoryFactory().get(indexDirPath, > DirectoryFactory.DirContext.DEFAULT, > solrCore.getSolrConfig().indexConfig.lockType);}} > From Yonik: > _A couple things about Directory locking.... the locks were only ever to > prevent more than one IndexWriter from trying to modify the same index. The > IndexWriter grabs a write lock once when it is created and does not release > it until it is closed._ > _Directories are not locked on acquisition of the Directory from the > DirectoryFactory. See the IndexWriter constructor, where the lock is > explicitly grabbed._ > Review CorePushPull#pullUpdateFromBlob, ServerSideMetadata and other classes > as relevant. > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org