[ 
https://issues.apache.org/jira/browse/SOLR-13932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16981210#comment-16981210
 ] 

Ilan Ginzburg commented on SOLR-13932:
--------------------------------------

The Blob pull code does not depend on directory locking. Locks are only 
indirectly manipulated in CorePushPull.pullUpdateFromBlob() when an IndexWriter 
is closed before adding files to the index directory then reopening it, and 
when the index is switching directories a new index writer is created (by the 
aptly named UpdateHandler.newIndexWriter()).

The Blob protects pulls by taking a hash of the local directory files and 
making sure nothing has changed from the time the set of files to pull is 
computed until the pull has completed and the new files are made to be used 
locally (IndexWriter reopened). This is most likely not even needed anymore (it 
was needed in a previous use of the Solr/Blob interaction code in a different 
context) because pulls block ongoing indexing batches (and associated pushes) 
by an explicit read/write locking strategy (pulls acquire the WRITE lock, 
indexing acquires the READ lock for the replica).

Directory hash is taken in the {{ServerSideMetadata}} constructor (during 
update processor finish). The hash summarizes the presence (file name and size 
- likely should add timestamp since it does not change on a local directory 
unless file is modified) of all files in the directory except {{*.lock}} files.

In case of pulls, the {{ServerSideMetadata}} constructor is called by the async 
pull threads doing their work (in {{CorePullTask.pullCoreFromBlob()}}) and for 
sync pulls in {{DistributedZkUpdateProcessor.readFromSharedStoreIfNecessary()}}.

That computed hash is later used ({{CorePushPull.pullUpdateFromBlob()}}) to 
check the directory hasn't changed between the time the set of files to 
download has been computed and downloaded to a temp directory, and before they 
are incorporated into the local index.
There is a slight theoretical race condition here: if all index files are 
downloaded from the Blob store and the index is then changing to a new local 
directory, the {{IndexWriter}} is not closed before checking no local files 
have changed, but is simply opened again into the new directory 
({{UpdateHandler.newIndexWriter()}}). There could in theory be a change of the 
local directory between the check nothing has changed and the immediately 
following call to {{newIndexWriter()}}. But because pulls prevent concurrent 
indexing (pulls acquire a write lock and indexing acquire a read lock), we 
should be safe here.

Note there is some duplication in Blob pulling code between 
{{BlobStoreUtils.syncLocalCoreWithSharedStore()}} and 
{{CorePullTask.pullCoreFromBlob()}}. Likely some refactoring needed.


Pushes do not use the local directory hash check. This is in order to support 
concurrent indexing on a replica (as Lucene/Solr does by default) and allow the 
local directory to change due to a concurrent indexing batch even when a 
previous batch is pushing data out to the Blob store (otherwise we'd have to 
serialize the whole section indexing + push to blob, slowing indexing further 
than is currently done due to forcing segment writes before pushing to Blob). 
After a local update, the push to Blob store will use the latest commit point 
and push the corresponding files ({{CorePusher.pushCoreToBlob()}} building 
{{ServerSideMetadata}}). 

> Review directory locking and Blob interactions
> ----------------------------------------------
>
>                 Key: SOLR-13932
>                 URL: https://issues.apache.org/jira/browse/SOLR-13932
>             Project: Solr
>          Issue Type: Sub-task
>            Reporter: Ilan Ginzburg
>            Priority: Major
>
> Review resolution of local index directory content vs Blob copy.
> There has been wrong understanding of following line acquiring a lock on 
> index directory.
>  {{solrCore.getDirectoryFactory().get(indexDirPath, 
> DirectoryFactory.DirContext.DEFAULT, 
> solrCore.getSolrConfig().indexConfig.lockType);}}
> From Yonik:
> _A couple things about Directory locking.... the locks were only ever to 
> prevent more than one IndexWriter from trying to modify the same index. The 
> IndexWriter grabs a write lock once when it is created and does not release 
> it until it is closed._ 
> _Directories are not locked on acquisition of the Directory from the 
> DirectoryFactory. See the IndexWriter constructor, where the lock is 
> explicitly grabbed._
> Review CorePushPull#pullUpdateFromBlob, ServerSideMetadata and other classes 
> as relevant.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to