https://issues.apache.org/bugzilla/show_bug.cgi?id=49051

           Summary: Decrease in response by TcpFailureDetector.
           Product: Tomcat 6
           Version: 6.0.26
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Cluster
        AssignedTo: dev@tomcat.apache.org
        ReportedBy: kfuj...@apache.org


[Configuration]
Cluster configuration.
TcpFailureDetector is used. 
Synchronous replication

A ChannelException is thrown when the destination node goes down during
session replication.
TcpFailureDetector catches the ChannelException
and verifies the member in TcpFailureDetector#memberDisappeared.

In the TcpFailureDetector#memberAlive method,
the member that failed in replication is checked to see if it really is
down.
Because the member is already gone, TcpFailureDetector#memberAlive times out
after 1 sec (the default).
Then the member is removed from membership by membership#removeMember,
and super.memberDisappeared(member) is called.

TcpFailureDetector#memberDisappeared is as follows. 
===
public void memberDisappeared(Member member) {
...skip
    synchronized (membership) {
        //check to see if the member really is gone
        //if the payload is not a shutdown message
        if (shutdown || !memberAlive(member)) {
            //not correct, we need to maintain the map
            membership.removeMember( (MemberImpl) member);
            removeSuspects.remove(member);
            notify = true;
        } else {
            //add the member as suspect
            removeSuspects.put(member, new Long(System.currentTimeMillis()));
        }
    }
...skip
}
===
Every thread waiting to acquire the lock on membership calls the
memberAlive method in turn once it holds the lock.
Each of those calls times out after 1 sec.
As a result,
under high concurrency, response time may degrade severely.

For instance,
when 100 threads are waiting for the lock on membership,
the last thread to acquire the lock cannot return its response for about
100 sec.
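
The arithmetic above can be reproduced with a small stand-alone sketch. It does
not use Tomcat classes: the class name, the shared lock object and the sleep
that stands in for the memberAlive timeout are all assumptions made for
illustration, and the numbers are scaled down (5 threads, 50 ms) so it runs
quickly.
```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone simulation, not Tomcat code.
public class SerializedTimeoutSketch {

    /**
     * Starts `threads` callers that each hold one shared lock for
     * `timeoutMs` (standing in for the memberAlive timeout) and returns
     * the elapsed wall-clock time in milliseconds.
     */
    public static long run(int threads, final long timeoutMs) {
        final Object lock = new Object(); // stands in for the membership lock
        List<Thread> workers = new ArrayList<Thread>();
        long start = System.nanoTime();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    synchronized (lock) {
                        try {
                            Thread.sleep(timeoutMs); // simulated connect timeout
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return (System.nanoTime() - start) / 1000000L;
    }

    public static void main(String[] args) {
        // The waits are serialized by the lock, so the total is roughly
        // threads * timeoutMs rather than a single timeoutMs.
        System.out.println("elapsed ms: " + run(5, 50));
    }
}
```
Because the sleep happens while the lock is held, the waits add up instead of
overlapping, which is the same effect as 100 threads x 1 sec in the report.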

If the member no longer exists in membership, the
TcpFailureDetector#memberAlive method need not be called.
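
A minimal sketch of that idea follows. It is not the attached patch and does
not use the real Tribes classes: the class name, the Set used as membership,
the String member names and the probe counter are hypothetical stand-ins. It
only shows the guard, that is, checking membership before paying for the
liveness probe.
```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the detector, for illustration only.
public class GuardedDetectorSketch {
    private final Set<String> membership = new HashSet<String>();
    private final AtomicInteger probes = new AtomicInteger();

    public GuardedDetectorSketch(String... members) {
        for (String m : members) {
            membership.add(m);
        }
    }

    // Stands in for the TCP check; here the member is always reported down.
    // In the real detector this is the call that costs the 1 sec timeout.
    private boolean memberAlive(String member) {
        probes.incrementAndGet();
        return false;
    }

    /** Returns true only if this call removed the member. */
    public boolean memberDisappeared(String member) {
        synchronized (membership) {
            // Guard: another thread already removed the member, so skip
            // the probe entirely.
            if (!membership.contains(member)) {
                return false;
            }
            if (!memberAlive(member)) {
                membership.remove(member);
                return true;
            }
            return false;
        }
    }

    public int probeCount() {
        return probes.get();
    }

    public static void main(String[] args) {
        GuardedDetectorSketch d = new GuardedDetectorSketch("nodeA", "nodeB");
        for (int i = 0; i < 100; i++) {
            d.memberDisappeared("nodeB");
        }
        System.out.println("probes = " + d.probeCount()); // prints "probes = 1"
    }
}
```
With the guard, only the first caller pays for the probe; the other 99 return
as soon as they get the lock.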

I made a patch.

Best regards.

