I'm also hitting some threading issues with autocommit -- JConsole does not show deadlock, but it shows some threads 'BLOCKED' on scheduleCommitWithin

Perhaps this has something to do with the changes we made for: SOLR-793

I am able to fix this (at least I don't see the blocking with the data I am using) by not calling the synchronized function from within CommitTracker#addedDocument

Mike, can you look over the changes in:
 http://svn.apache.org/viewvc?view=rev&revision=719347

and make sure it makes sense to you?

Essentially it is not calling a synchronized scheduleCommitWithin() from within the (already blocked) addedDocument callback

ryan



On Nov 18, 2008, at 3:10 PM, Mike Klaas wrote:


On 18-Nov-08, at 8:54 AM, Mark Miller wrote:

Mark Miller wrote:
Toby Cole wrote:
Has anyone else experienced a deadlock when the DirectUpdateHandler2 does an autocommit? I'm using a recent snapshot from hudson (apache- solr-2008-11-12_08-06-21), and quite often when I'm loading data the server (tomcat 6) gets stuck at line 469 of DirectUpdateHandler2:

// Check if there is a commit already scheduled for longer then this time
    if( pending != null &&
        pending.getDelay(TimeUnit.MILLISECONDS) >= commitMaxTime )

Anyone got any enlightening tips?


There is some inconsistent synchronization I think. Especially involving pending. Yuck <g>
I would say there are problems with pending, autoCommitCount, and lastAddedTime. That alone could probably cause a deadlock (who knows), but it also seems somewhat possible that there is an issue with the heavy intermingling of locks (there a bunch of locks to be had in that class). I havn't looked for evidence of that though - prob makes sense to fix those 3 guys and see if you get reports from there.


autoCommitCount is written in a CommitTracker.synchronized block only. It is read to print stats in an unsynchronized fashion, which perhaps could be fixed, though I can't see how it could cause a problem

lastAddedTime is only written in a call path within a DirectUpdateHandler2.synchronized block. It is only read in a CommitTracker.synchronized block. It could read the wrong value, but I also don't see this causing a problem (a commit might fail to be scheduled). This could probably also be improved, but doesn't seem important.

pending seems to be the issue. As long as commit are only triggered by autocommit, there is no issue as manipulation of pending is always performed inside CommitTracker.synchronized. But didCommit()/ didRollback() could be called via manual commit, and pending is directly manipulated during DUH2.close(). I'm having trouble coming up with a plausible deadlock scenario, but this needs to be fixed. It isn't as easy as synchronizing didCommit/didRollback, though-- this would introduce definite deadlock scenarios.

Mark, is there any chance you could post the thread dump for the deadlocked process? Do you issue manual commits during insertion?

-Mike

Reply via email to