https://issues.apache.org/bugzilla/show_bug.cgi?id=52529
Bug #: 52529
Summary: Tomcat stops working after NullPointerException
Product: Tomcat 7
Version: 7.0.14
Platform: PC
OS/Version: Linux
Status: NEW
Severity: critical
Priority: P2
Component: Cluster
AssignedTo: [email protected]
ReportedBy: [email protected]
Classification: Unclassified
I have a cluster of two Tomcat instances, both running 7.0.14. The cluster
configuration is as follows:
<!-- Cluster configuration -->
<Cluster className="org.apache.catalina.ha.tcp.SimpleTcpCluster"
         channelSendOptions="8">
  <Manager className="org.apache.catalina.ha.session.DeltaManager"
           expireSessionsOnShutdown="false"
           notifyListenersOnReplication="true"/>
  <Channel className="org.apache.catalina.tribes.group.GroupChannel">
    <Membership className="org.apache.catalina.tribes.membership.McastService"
                address="228.0.0.4" port="50000" ttl="1"
                frequency="500" dropTime="3000"/>
    <Receiver className="org.apache.catalina.tribes.transport.nio.NioReceiver"
              address="192.168.110.96" port="8000" autoBind="100"
              selectorTimeout="5000" maxThreads="6"/>
    <Sender className="org.apache.catalina.tribes.transport.ReplicationTransmitter">
      <Transport className="org.apache.catalina.tribes.transport.nio.PooledParallelSender"/>
    </Sender>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.TcpFailureDetector"/>
    <Interceptor className="org.apache.catalina.tribes.group.interceptors.MessageDispatch15Interceptor"/>
  </Channel>
  <Valve className="org.apache.catalina.ha.tcp.ReplicationValve"
         filter=""/>
  <Valve className="org.apache.catalina.ha.session.JvmRouteBinderValve"/>
  <ClusterListener className="org.apache.catalina.ha.session.JvmRouteSessionIDBinderListener"/>
  <ClusterListener className="org.apache.catalina.ha.session.ClusterSessionListener"/>
</Cluster>
The cluster started running this week. Every day since then, both Tomcats have
gone offline and stopped responding at around the same time, after logging the
following error:
Tomcat 1 - 2012-01-25 14:58:09,246 [Tribes-Task-Receiver-3] ERROR
org.apache.catalina.ha.session.DeltaManager- Manager [domain#]: Unable to
receive message through TCP channel
java.lang.NullPointerException
Tomcat 2 - 2012-01-25 15:00:24,427 [Tribes-Task-Receiver-5] ERROR
org.apache.catalina.ha.session.DeltaManager- Manager [other domain#]: Unable to
receive message through TCP channel
java.lang.NullPointerException
After that, the following messages repeat until both instances are stopped and
restarted one at a time:
2012-01-25 15:05:22,528 [Membership-MemberExpired.] INFO
org.apache.catalina.tribes.group.interceptors.TcpFailureDetector- Received
memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://{192,
168, 110, 69}:8000,{192, 168, 110, 69},8000, alive=77417309, securePort=-1, UDP
Port=-1, id={110 7 26 17 -17 -10 79 10 -107 -51 -64 70 -73 -50 116 -102 },
payload={}, command={}, domain={}, ]] message. Will verify.
2012-01-25 15:05:22,528 [Membership-MemberExpired.] INFO
org.apache.catalina.tribes.group.interceptors.TcpFailureDetector- Verification
complete. Member still
alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 110,
69}:8000,{192, 168, 110, 69},8000, alive=77417309, securePort=-1, UDP Port=-1,
id={110 7 26 17 -17 -10 79 10 -107 -51 -64 70 -73 -50 116 -102 }, payload={},
command={}, domain={}, ]]
2012-01-25 15:07:23,278 [Membership-MemberExpired.] INFO
org.apache.catalina.tribes.group.interceptors.TcpFailureDetector- Received
memberDisappeared[org.apache.catalina.tribes.membership.MemberImpl[tcp://{192,
168, 110, 69}:8000,{192, 168, 110, 69},8000, alive=77538070, securePort=-1, UDP
Port=-1, id={110 7 26 17 -17 -10 79 10 -107 -51 -64 70 -73 -50 116 -102 },
payload={}, command={}, domain={}, ]] message. Will verify.
2012-01-25 15:07:23,279 [Membership-MemberExpired.] INFO
org.apache.catalina.tribes.group.interceptors.TcpFailureDetector- Verification
complete. Member still
alive[org.apache.catalina.tribes.membership.MemberImpl[tcp://{192, 168, 110,
69}:8000,{192, 168, 110, 69},8000, alive=77538070, securePort=-1, UDP Port=-1,
id={110 7 26 17 -17 -10 79 10 -107 -51 -64 70 -73 -50 116 -102 }, payload={},
command={}, domain={}, ]]
This is becoming a serious obstacle for our platform, and I can't tell where
the problem occurs because the exception is logged without a stack trace. Would
it be possible to make Tomcat's cluster logic more fault tolerant? Taking down
the whole Tomcat instance because something went wrong during session
synchronization seems like bad and quite dangerous behavior.
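
To illustrate the kind of isolation being asked for, here is a minimal sketch.
It is not Tomcat's or Tribes' actual receive path: the class GuardedReceiver
and the methods guarded and stackTraceOf are hypothetical names made up for
this example. The idea is only that a failure while handling one replication
message is logged with its full stack trace and skipped, so later messages are
still processed.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch; none of these names are Tomcat or Tribes APIs.
public class GuardedReceiver {

    // Render the full stack trace of a throwable as a string for logging.
    static String stackTraceOf(Throwable t) {
        StringWriter sw = new StringWriter();
        t.printStackTrace(new PrintWriter(sw, true));
        return sw.toString();
    }

    // Wrap a handler so that a failure on one message is recorded in
    // errorLog and skipped instead of propagating to the caller.
    static <T> Consumer<T> guarded(Consumer<T> handler, List<String> errorLog) {
        return msg -> {
            try {
                handler.accept(msg);
            } catch (RuntimeException e) {
                errorLog.add("Unable to process message: " + stackTraceOf(e));
            }
        };
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        List<Integer> lengths = new ArrayList<>();
        Consumer<String> handler = guarded(s -> lengths.add(s.length()), errors);

        handler.accept("delta-1");
        handler.accept(null);       // NullPointerException inside the handler
        handler.accept("delta-22"); // still processed after the failure

        System.out.println(lengths.size() + " processed, " + errors.size() + " failed");
    }
}
```

With a guard like this, the log entry for the failed message would carry the
stack trace that is missing from the errors quoted above, which is the
information needed to locate the NullPointerException.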