[ 
https://issues.apache.org/jira/browse/LUCENE-10292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated LUCENE-10292:
----------------------------------------
    Attachment: LUCENE-10292.patch
      Assignee: Chris M. Hostetter
        Status: Open  (was: Open)

{quote}seems like it should be possible/better to change 
{{AnalyzingInfixSuggester.build()}} so that the {{searcherMgr}} is only 
repalced *after* we build the new index (and/or stop using a new 
{{IndexWriter}} on every {{AnalyzingInfixSuggester.build()}} call and just do a 
{{writer.deleteAll()}} instead?)
{quote}

Digging into the code a bit more i realized:
* Using {{writer.deleteAll()}} isn't really a viable "fix" because (by default) 
the writer doesn't exist (or is closed after {{build()}} 
* Another aspect of the existing impl is that if thread T1 enters the 
{{lookup()}} method and makes it past the {{if (searcherMgr == null)}} check, 
but then another thread T2 calls {{build()}} before the T1 reaches the 
synchronized block, then T1 will be blocked for the entire durration of the 
{{build()}} ... which is certainly less then ideal.

I worked up a patch with a test case to try and  demonstrate the problem, as 
well as a fix -- note that as written the test case won't always  "fail" 
cleaning with the existing impl, it can also deadlock because of how I'm using 
a Semaphore to "slow" down the {{build()}} -- triggering the blocked lookup 
thread situation described above.

The basic idea of the fix s to decouple the synchronization/locking needed for 
changing the writer from the synchronization/locking needed to change the 
SearchManager.  and replace the SearchManager only once the {{build()}} is 
complete. 

 I originally tried to replace the "R/W" locking of SearcherManager with an 
AtomicReference so we wouldn't need to have any synchronization blocks in 
{{lookup()}} at all; but I couldn't figure out a "safe" way to do that w/o ref 
counting the SearcherManager itself (to ensure that no build/add calls close 
the SearcherManager that {{lookup()}} gets a ref to before it has a change to 
{{acquire()}} on it (thank you {{testRandomNRT}} for catching that for me)

Feedback on this approach (or suggestions for better ways to solve this) welcome

> AnalyzingInfixSuggester thread safety: lookup() fails during (re)build()
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-10292
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10292
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Chris M. Hostetter
>            Assignee: Chris M. Hostetter
>            Priority: Major
>         Attachments: LUCENE-10292.patch
>
>
> I'm filing this based on anecdotal information from a Solr user w/o 
> experiencing it first hand (and I don't have a test case to demonstrate it) 
> but based on a reading of the code the underlying problem seems self 
> evident...
> With all other Lookup implementations I've examined, it is possible to call 
> {{lookup()}} regardless of whether another thread is concurrently calling 
> {{build()}} – in all cases I've seen, it is even possible to call 
> {{lookup()}} even if {{build()}} has never been called: the result is just an 
> "empty" {{List<LookupResult>}} 
> Typically this is works because the {{build()}} method uses temporary 
> datastructures until it's "build logic" is complete, at which point it 
> atomically replaces the datastructures used by the {{lookup()}} method.   In 
> the case of {{AnalyzingInfixSuggester}} however, the {{build()}} method 
> starts by closing & null'ing out the {{protected SearcherManager 
> searcherMgr}} (which it only populates again once it's completed building up 
> it's index) and then the lookup method starts with...
> {code:java}
>     if (searcherMgr == null) {
>       throw new IllegalStateException("suggester was not built");
>     }
> {code}
> ... meaning it is unsafe to call {{AnalyzingInfixSuggester.lookup()}} in any 
> situation where another thread may be calling 
> {{AnalyzingInfixSuggester.build()}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to