[ https://issues.apache.org/jira/browse/LUCENE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17540544#comment-17540544 ]
Thomas Hoffmann commented on LUCENE-3373:
-----------------------------------------

Hello Vigya!

Thank you for investigating this issue! As this thread was already quite old and I am not sure whether it is the same problem I encountered, I filed a new one. But we can focus on this thread, of course.

I also wrote some load tests with multiple threads and some loops to put pressure on the IndexWriter object. Unfortunately, I couldn't reproduce the behaviour. As the same code has worked for several years in our application, there must be some rare condition that has to be met to get into this deadlock. You have already found a possible situation, as far as I understood.

As a temporary workaround, we could disable the IO throttling of the ConcurrentMergeScheduler, right? The IO throttling is activated by default, as far as I can see in the sources.

Maybe it is possible to reproduce the deadlock via a breakpoint in the IDE and simulate/trigger the IO throttling?

> waitForMerges deadlocks if background merge fails
> -------------------------------------------------
>
>                 Key: LUCENE-3373
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3373
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 3.0.3
>            Reporter: Tim Smith
>            Priority: Major
>
> waitForMerges can deadlock if a merge fails for ConcurrentMergeScheduler
> this is because the merge thread will die, but pending merges are still
> available
> normally, the merge thread will pick up the next merge once it finishes the
> previous merge, but in the event of a merge exception, the pending work is
> not resumed, but waitForMerges won't complete until all pending work is
> complete
> i worked around this by overriding doMerge() like so:
> {code}
>   protected final void doMerge(MergePolicy.OneMerge merge) throws IOException {
>     try {
>       super.doMerge(merge);
>     } catch (Throwable exc) {
>       // Just logging the exception and not rethrowing
>       // insert logging code here
>     }
>   }
> {code}
> Here's the rough
> steps i used to reproduce this issue:
> override doMerge like so
> {code}
>   protected final void doMerge(MergePolicy.OneMerge merge) throws IOException {
>     try {Thread.sleep(500L);} catch (InterruptedException e) { }
>     super.doMerge(merge);
>     throw new IOException("fail");
>   }
> {code}
> then, if you do the following:
> loop 50 times:
>   addDocument // any doc
>   commit
> waitForMerges // This will deadlock sometimes
> SOLR-2017 may be related to this (stack trace for deadlock looked related)
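
The mechanism described in the issue (a worker thread dies on a failure while queued work remains, so a waiter that needs all work finished never returns) can be illustrated without Lucene. The following is only a toy model built on plain java.util.concurrent. The class name MergeQueueModel, the method drains, and the rule that task 0 fails are all hypothetical and do not correspond to Lucene's actual classes or to the real ConcurrentMergeScheduler logic. A timeout stands in for the indefinite wait so that the sketch terminates.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Toy model only: hypothetical names, not Lucene's classes. */
public class MergeQueueModel {

    /**
     * Runs a single worker over {@code total} queued tasks, where task 0 fails.
     * Returns true if every task was processed before the timeout expired.
     */
    public static boolean drains(boolean swallowFailures, int total, long timeoutMillis) {
        Queue<Integer> pending = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < total; i++) {
            pending.add(i);
        }
        // The waiter needs one count-down per task, like waiting for all pending work.
        CountDownLatch allDone = new CountDownLatch(total);

        Thread worker = new Thread(() -> {
            Integer task;
            while ((task = pending.poll()) != null) {
                try {
                    if (task == 0) {
                        throw new IllegalStateException("fail"); // simulated failure
                    }
                } catch (IllegalStateException e) {
                    if (!swallowFailures) {
                        throw e; // worker dies here; the other tasks stay queued
                    }
                    // Workaround path: a real application would log here and carry on.
                }
                allDone.countDown();
            }
        });
        worker.setUncaughtExceptionHandler((t, e) -> { }); // keep the demo output quiet
        worker.setDaemon(true);
        worker.start();

        try {
            // An untimed wait would block forever once the worker has died.
            return allDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("swallowing failures: " + drains(true, 5, 2000));
        System.out.println("rethrowing failures: " + drains(false, 5, 500));
    }
}
```

In this model, swallowing the failure lets the worker continue to the remaining tasks, which mirrors the idea behind the doMerge() override quoted above. Rethrowing leaves the latch above zero, which is the analogue of the hang reported for waitForMerges.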