Sorry, I clicked the wrong button.

A little more about "At certain timing, this method also throws":
SnapPuller  - java.lang.InterruptedException
at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
at java.util.concurrent.FutureTask.get(FutureTask.java:191)
at ....SnapPuller.openNewSearcherAndUpdateCommitPoint(SnapPuller.java:680)
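
For what it's worth, the exception itself is just the standard behavior of FutureTask.get() when the waiting thread is interrupted; here is a minimal sketch with nothing Solr-specific in it (the class name InterruptDemo is made up):

```java
import java.util.concurrent.FutureTask;

class InterruptDemo {
    // Block in FutureTask.get() (which parks in awaitDone) and interrupt the
    // waiting thread from another thread, the way the core close interrupts
    // the SnapPuller thread.
    static boolean demo() throws Exception {
        // The task is never submitted to an executor, so get() can only wait.
        FutureTask<Integer> task = new FutureTask<>(() -> 1);
        Thread waiter = Thread.currentThread();
        new Thread(() -> {
            try { Thread.sleep(200); } catch (InterruptedException ignored) {}
            waiter.interrupt();
        }).start();
        try {
            task.get();      // waits in FutureTask.awaitDone until interrupted
            return false;
        } catch (InterruptedException expected) {
            return true;     // the same InterruptedException SnapPuller logs
        }
    }
}
```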

This is a scenario (I am less confident about this one).
During reloadCore, the old core didn't complete its close() method because
of the refCount.  SnapPuller then executed the
openNewSearcherAndUpdateCommitPoint method.  Now an HTTP request, for
example, finished processing and called the SolrCore close() method.  The
refCount reached 0, and the rest of the close() logic in SolrCore ran.
In this case, the InterruptedException can be thrown in
openNewSearcherAndUpdateCommitPoint.  After that, I noticed that one thread
executing a newSearcher process hangs and CPU usage remains high.
We are also using a large external field file.
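
To make the timing concrete, here is a minimal model of just the refCount handling (a sketch with made-up names; this is not the actual SolrCore code, only its decrement-and-check pattern):

```java
import java.util.concurrent.atomic.AtomicInteger;

// Minimal model of the suspected race: the real close work only runs on
// whichever thread decrements refCount to exactly 0.
class CoreModel {
    final AtomicInteger refCount = new AtomicInteger(1);
    volatile boolean fullyClosed = false;

    void open()  { refCount.incrementAndGet(); }  // e.g. an HTTP request takes a reference

    void close() {
        int count = refCount.decrementAndGet();
        if (count > 0) return;    // someone still holds the core: this close is a no-op
        fullyClosed = true;       // last holder: the real shutdown work happens here
    }

    // Replays the scenario described above.
    static boolean race() {
        CoreModel old = new CoreModel();   // old core, one reference from its container
        old.open();                        // an in-flight HTTP request holds a second reference
        old.close();                       // reloadCore closes the old core: count == 1, early return
        boolean stillOpenAfterReload = !old.fullyClosed;  // SnapPuller keeps using the old core
        old.close();                       // the request finishes later: count == 0, the full
                                           // close runs now, concurrent with
                                           // openNewSearcherAndUpdateCommitPoint
        return stillOpenAfterReload && old.fullyClosed;
    }
}
```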



On Thu, Oct 20, 2016 at 9:11 AM, Jihwan Kim <jihwa...@gmail.com> wrote:

> A little more about "At certain timing, this method also throws":
> SnapPuller  - java.lang.InterruptedException
> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
> at java.util.concurrent.FutureTask.get(FutureTask.java:191)
> at ....SnapPuller.openNewSearcherAndUpdateCommitPoint(SnapPuller.java:680)
>
> This is the scenario I am less confident about.
> The old core didn't complete the close() method during the reloadCore.
> Then, it executes the openNewSearcherAndUpdateCommitPoint method.  Now, an
> HTTP request, for example, finishes processing and calls the SolrCore
> close() method.  The refCount reaches 0, and the rest of the close()
> method of the SolrCore runs.
>
>
> On Thu, Oct 20, 2016 at 8:44 AM, Jihwan Kim <jihwa...@gmail.com> wrote:
>
>> Hi,
>> We are using Solr 4.10.4 and experiencing out of memory exception.  It
>> seems the problem is cause by the following code & scenario.
>>
>> This is the last part of the fetchLastIndex method in SnapPuller.java:
>>
>>         // we must reload the core after we open the IW back up
>>         if (reloadCore) {
>>           reloadCore();
>>         }
>>
>>         if (successfulInstall) {
>>           if (isFullCopyNeeded) {
>>             // let the system know we are changing dir's and the old one
>>             // may be closed
>>             if (indexDir != null) {
>>               LOG.info("removing old index directory " + indexDir);
>>               core.getDirectoryFactory().doneWithDirectory(indexDir);
>>               core.getDirectoryFactory().remove(indexDir);
>>             }
>>           }
>>           if (isFullCopyNeeded) {
>>             solrCore.getUpdateHandler().newIndexWriter(isFullCopyNeeded);
>>           }
>>
>>           openNewSearcherAndUpdateCommitPoint(isFullCopyNeeded);
>>         }
>>
>> Inside reloadCore, Solr creates a new core, registers it, and tries to
>> close the current/old core.  Even when the closing of the old core
>> proceeds normally, it throws an exception: "SnapPull failed
>> :org.apache.solr.common.SolrException: Index fetch failed Caused by
>> java.lang.RuntimeException: Interrupted while waiting for core reload to
>> finish Caused by: java.lang.InterruptedException."
>>
>> Despite this exception, the process seems OK, because it only terminates
>> the SnapPuller thread, while all the other threads handling the close
>> complete normally.
>>
>> *Now, the problem arises when the close() method called during the
>> reloadCore doesn't actually close the core.*
>> This is the beginning of the close() method.
>>     public void close() {
>>         int count = refCount.decrementAndGet();
>>         if (count > 0) return; // close is called often, and only
>> actually closes if nothing is using it.
>>         if (count < 0) {
>>            log.error("Too many close [count:{}] on {}. Please report this
>> exception to solr-user@lucene.apache.org", count, this );
>>            assert false : "Too many closes on SolrCore";
>>            return;
>>         }
>>         log.info(logid+" CLOSING SolrCore " + this);
>>
>> When an HTTP request is executing, the refCount is greater than 1.  So,
>> when the old core is being closed during the core reload, the
>> if (count > 0) condition simply returns from the method.
>>
>> Then, the fetchLastIndex method in SnapPuller continues to the next code
>> and executes openNewSearcherAndUpdateCommitPoint.  If you look at this
>> method, it tries to open a new searcher on the solrCore that was captured
>> in the SnapPuller constructor, and I believe this one points to the old
>> core.  At certain timing, this method also throws:
>> SnapPuller  - java.lang.InterruptedException
>> at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
>> at java.util.concurrent.FutureTask.get(FutureTask.java:191)
>> at ....SnapPuller.openNewSearcherAndUpdateCommitPoint(
>> SnapPuller.java:680)
>>
>> After this exception, things start to go bad.
>>
>> *In summary, I have two questions.*
>> 1. Can you confirm this memory / thread issue?
>> 2. When the core reload completes successfully (whether or not it throws
>> the exception), does Solr still need to call the
>> openNewSearcherAndUpdateCommitPoint method?
>>
>> Thanks.
>>
>
>
