https://issues.apache.org/bugzilla/show_bug.cgi?id=53513

          Priority: P2
            Bug ID: 53513
          Assignee: dev@tomcat.apache.org
           Summary: Race condition in session replication at node startup
          Severity: major
    Classification: Unclassified
                OS: Linux
          Reporter: djo...@industrialinfo.com
          Hardware: PC
            Status: NEW
           Version: 7.0.26
         Component: Cluster
           Product: Tomcat 7

My configuration:

2 nodes running Tomcat 7.0.26
Using a custom session manager, which extends the DeltaManager

My startInternal() method first calls super.startInternal(), then performs a
few additional initializations.

I reviewed the code of DeltaManager.startInternal(), and it calls
getAllClusterSessions() which in turn calls waitForSendAllSessions(), which
requires either getStateTransfered() to return true, or a timeout.

Given this, I should be able to trust that, as the second node starts, the
initial sync of all session data from the first node has completed before
startInternal() exits (and thus before my initializations run).
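
The expectation above can be sketched as a wait-until-flag-or-timeout loop.
This is a simplified model and not the DeltaManager source: the class and
method names below are hypothetical and only illustrate the pattern. Note that
the wait observes nothing but the flag, so it says nothing about whether the
session data has actually been applied.

```java
import java.util.concurrent.TimeUnit;

// Simplified model of "wait for the state-transferred flag or time out".
// All names here are hypothetical; this is NOT the Tomcat implementation.
class StateTransferWaitSketch {
    // Set by the (simulated) "transfer complete" callback thread.
    private volatile boolean stateTransferred = false;

    // Stand-in for the callback that handles the "transfer complete" message.
    void markTransferComplete() {
        stateTransferred = true;
    }

    // Polls the flag until it becomes true or the timeout elapses.
    // Returns the flag's value at the moment the wait ends.
    boolean waitForTransfer(long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!stateTransferred && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and stop waiting early.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return stateTransferred;
    }
}
```

In this model, the startup code is released as soon as the flag flips,
regardless of what other messages are still in flight.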

This is, however, not the case!  I can confirm this by repeatedly logging the
value of findSessions().length during my initializations and seeing that number
go up!

There appears to be a race condition between the processing of the message
containing the actual session data and the "transfer complete" message.  After
tracing this a little further, I see that stateTransfered is set to true in the
handleALL_SESSION_TRANSFERCOMPLETE() callback method, and that callback is being
called BEFORE the session data itself has even been received!
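
The effect of the ordering can be shown with a small deterministic sketch. It
is an illustration under assumptions, not Tomcat code: the class, the enum and
the single placeholder session are hypothetical stand-ins for the two message
handlers. It reports how many sessions are visible at the moment the
"transfer complete" flag is first set.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a manager that receives two kinds of messages.
// This is NOT the DeltaManager implementation.
class OutOfOrderSketch {
    enum MessageType { ALL_SESSION_DATA, TRANSFER_COMPLETE }

    private final List<String> sessions = new ArrayList<>();
    private boolean stateTransferred = false;

    // Applies one message: data adds a placeholder session, complete sets the flag.
    void handle(MessageType type) {
        switch (type) {
            case ALL_SESSION_DATA:
                sessions.add("session-" + sessions.size());
                break;
            case TRANSFER_COMPLETE:
                stateTransferred = true;
                break;
        }
    }

    // Delivers messages in the given order and returns the number of sessions
    // visible when the flag first becomes true, or -1 if it never does.
    static int sessionsVisibleWhenFlagSet(MessageType... order) {
        OutOfOrderSketch manager = new OutOfOrderSketch();
        for (MessageType type : order) {
            manager.handle(type);
            if (manager.stateTransferred) {
                return manager.sessions.size();
            }
        }
        return -1;
    }
}
```

With the data message delivered first, the call returns 1; with the
"transfer complete" message delivered first, it returns 0, which matches the
symptom of startup code running before any sessions have arrived.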

Here is the debug logging output (slightly scrubbed), which shows this
out-of-order messaging:

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager
getAllClusterSessions
INFO: Manager [wwwtest#], requesting session state from
org.apache.catalina.tribes.membership.MemberImpl[...].
This operation will timeout if no session state has been received within 60
seconds.

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager
messageReceived
FINE: Manager [wwwtest#]: Received SessionMessage of
type=(SESSION-STATE-TRANSFERED) from
[org.apache.catalina.tribes.membership.MemberImpl[...]

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager
handleALL_SESSION_TRANSFERCOMPLETE
FINE: Manager [wwwtest#] received from node [[B@6789b939:4,000] session state
transfered.

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager
messageReceived
FINE: Manager [wwwtest#]: Received SessionMessage of type=(ALL-SESSION-DATA)
from [org.apache.catalina.tribes.membership.MemberImpl[...]

Jul 5, 2012 4:20:41 PM org.apache.catalina.ha.session.DeltaManager
handleALL_SESSION_DATA
FINE: Manager [wwwtest#]: received session state data
