I did another run for "ant test" for 7.0.35 and when using APR I got the following error the first time (never saw that before):
[junit] Running org.apache.jasper.compiler.TestParser ... [junit] Jan 18, 2013 4:37:56 PM org.apache.catalina.core.AprLifecycleListener init [junit] INFO: Loaded APR based Apache Tomcat Native library 1.1.24 using APR version 1.4.6. [junit] Jan 18, 2013 4:37:56 PM org.apache.catalina.core.AprLifecycleListener init [junit] INFO: APR capabilities: IPv6 [true], sendfile [true], accept filters [false], random [true]. [junit] Jan 18, 2013 4:37:57 PM org.apache.catalina.core.AprLifecycleListener initializeSSL [junit] INFO: OpenSSL successfully initialized (OpenSSL 1.0.1c 10 May 2012) [junit] Jan 18, 2013 4:37:57 PM org.apache.coyote.AbstractProtocol init [junit] INFO: Initializing ProtocolHandler ["http-apr-127.0.0.1-auto-2"] [junit] Jan 18, 2013 4:37:57 PM org.apache.catalina.core.StandardService startInternal [junit] INFO: Starting service Tomcat [junit] Jan 18, 2013 4:37:57 PM org.apache.catalina.core.StandardEngine startInternal [junit] INFO: Starting Servlet Engine: Apache Tomcat/7.0.35 [junit] Jan 18, 2013 4:37:57 PM org.apache.catalina.startup.ContextConfig getDefaultWebXmlFragment [junit] INFO: No global web.xml found [junit] Jan 18, 2013 4:37:57 PM org.apache.coyote.AbstractProtocol start [junit] INFO: Starting ProtocolHandler ["http-apr-127.0.0.1-auto-2-42652"] [junit] Jan 18, 2013 4:37:59 PM org.apache.coyote.AbstractProtocol pause [junit] INFO: Pausing ProtocolHandler ["http-apr-127.0.0.1-auto-2-42652"] [junit] Jan 18, 2013 4:37:59 PM org.apache.catalina.core.StandardService stopInternal [junit] INFO: Stopping service Tomcat [junit] Jan 18, 2013 4:37:59 PM org.apache.coyote.AbstractProtocol stop [junit] INFO: Stopping ProtocolHandler ["http-apr-127.0.0.1-auto-2-42652"] [junit] Jan 18, 2013 4:37:59 PM org.apache.tomcat.util.net.AprEndpoint$Poller doPoll [junit] SEVERE: Critical poller failure (restarting poller): [9] Bad file number ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ OS is Solaris 10, using Java 1.6.0_37. The ant and Tomcat processes are still there to investigate, Tomcat is looping on CPU in Thread 26. Stack is: ----------------- lwp# 26 / thread# 26 -------------------- fe024ccc apr_pool_destroy (8a6da0, b667f830, 0, 0, 16ed48, 13040d8) + 184 fc00d700 * org/apache/tomcat/jni/Pool.destroy(J)V+1478 fc00d6ac * org/apache/tomcat/jni/Pool.destroy(J)V+0 fc005ab0 * org/apache/tomcat/util/net/AprEndpoint$Poller.destroy()V+68 (line 2347) fc005ab0 * org/apache/tomcat/util/net/AprEndpoint$Poller.doPoll(J)Z+292 (line 2688) fc005868 * org/apache/tomcat/util/net/AprEndpoint$Poller.run()V+345 (line 2629) In the java world it is: "http-apr-127.0.0.1-auto-2-Poller-0" daemon prio=3 tid=0x00645400 nid=0x1a runnable [0xb667f000] java.lang.Thread.State: RUNNABLE at org.apache.tomcat.jni.Pool.destroy(Native Method) at org.apache.tomcat.util.net.AprEndpoint$Poller.destroy(AprEndpoint.java:1177) at org.apache.tomcat.util.net.AprEndpoint$Poller.doPoll(AprEndpoint.java:1362) - locked <0xe662d6a0> (a org.apache.tomcat.util.net.AprEndpoint$Poller) at org.apache.tomcat.util.net.AprEndpoint$Poller.run(AprEndpoint.java:1310) main thread is at: "main" prio=3 tid=0x00031400 nid=0x2 waiting for monitor entry [0xfe37e000] java.lang.Thread.State: BLOCKED (on object monitor) at java.lang.Object.wait(Native Method) - waiting on <0xe662d6a0> (a org.apache.tomcat.util.net.AprEndpoint$Poller) at java.lang.Thread.join(Thread.java:1194) - locked <0xe662d6a0> (a org.apache.tomcat.util.net.AprEndpoint$Poller) at org.apache.tomcat.util.net.AprEndpoint$Poller.destroy(AprEndpoint.java:1183) at org.apache.tomcat.util.net.AprEndpoint.stopInternal(AprEndpoint.java:685) at org.apache.tomcat.util.net.AbstractEndpoint.stop(AbstractEndpoint.java:664) at org.apache.coyote.AbstractProtocol.stop(AbstractProtocol.java:488) at org.apache.catalina.connector.Connector.stopInternal(Connector.java:1037) at org.apache.catalina.util.LifecycleBase.stop(LifecycleBase.java:232) The code is looping in allocator_free, because it tries to traverse a pointer chain that is cyclic: (gdb) print *node $5 = {next = 0x13040d8, ref = 0x678748, index = 1, free_index = 1, first_avail = 0x8a6e50 "", endp = 0x8a8d88 ""} (gdb) print *node->next $6 = {next = 0x8a6d88, ref = 0x13040d8, index = 1, free_index = 109, first_avail = 0x1304418 "\001\060D ", endp = 0x13060d8 ""} (gdb) print *node->next->next $7 = {next = 0x13040d8, ref = 0x678748, index = 1, free_index = 1, first_avail = 0x8a6e50 "", endp = 0x8a8d88 ""} So node->next->next == node. This is an indication for a synchronization problem in the code using the pool. The allocator is: (gdb) print *allocator $8 = {max_index = 1, max_free_index = 0, current_free_index = 0, mutex = 0x32f5b8, owner = 0x32f578, free = {0x678748, 0x8a6d88, 0x0 <repeats 18 times>}} and the pool is: (gdb) print *pool $9 = {parent = 0x49c478, child = 0x0, sibling = 0x0, ref = 0xe78d20, cleanups = 0x0, free_cleanups = 0x0, allocator = 0x16ed48, subprocesses = 0x0, abort_fn = 0, user_data = 0x0, tag = 0x0, active = 0x678748, self = 0x8a6d88, self_first_avail = 0x8a6de0 "", pre_cleanups = 0x0} It should be the "serverSockPool" in AprEndpoint$Poller. Walking up the pool hierarchy: (gdb) print *pool->parent $11 = {parent = 0x479830, child = 0x1265170, sibling = 0x0, ref = 0x1018370, cleanups = 0x0, free_cleanups = 0x0, allocator = 0x16ed48, subprocesses = 0x0, abort_fn = 0, user_data = 0x0, tag = 0x0, active = 0x49c460, self = 0x49c460, self_first_avail = 0x49c4b8 "", pre_cleanups = 0x0} (gdb) print *pool->parent->parent $12 = {parent = 0x54ac40, child = 0x1018368, sibling = 0x0, ref = 0x54ac44, cleanups = 0x0, free_cleanups = 0x0, allocator = 0x16ed48, subprocesses = 0x0, abort_fn = 0, user_data = 0x0, tag = 0x0, active = 0x479818, self = 0x479818, self_first_avail = 0x479870 "", pre_cleanups = 0x0} (gdb) print *pool->parent->parent->parent $13 = {parent = 0x32f578, child = 0x479830, sibling = 0xb61a90, ref = 0x32f57c, cleanups = 0x54b4f8, free_cleanups = 0x0, allocator = 0x16ed48, subprocesses = 0x0, abort_fn = 0, user_data = 0x0, tag = 0x0, active = 0x54ac28, self = 0x54ac28, self_first_avail = 0x54ac80 "/dev/urandom", pre_cleanups = 0x0} (gdb) print *pool->parent->parent->parent->parent $14 = {parent = 0x0, child = 0x54ac40, sibling = 0x0, ref = 0x0, cleanups = 0x32f5d8, free_cleanups = 0x0, allocator = 0x16ed48, subprocesses = 0x0, abort_fn = 0, user_data = 0x0, tag = 0xfe032a60 "apr_global_pool", active = 0x32f560, self = 0x32f560, self_first_avail = 0x32f5b8 "", pre_cleanups = 0x0} Regards, Rainer --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org For additional commands, e-mail: dev-h...@tomcat.apache.org