"Critical poller failure" when using tcnative
Hi, I'm trying to set the new issues.apache.org box up with Tomcat 5.5.16 + tomcat-native-1.1.2 + httpd 2.2.0. When requests come in, I'm seeing these errors for almost every request in the logs: Apr 8, 2006 3:28:47 PM org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical poller failure (restarting poller): [3] No such process We experienced this before on the old issues.apache.org box, but Remy said he fixed it (and it is fixed on that box): http://issues.apache.org/jira/browse/INFRA-716?page=all I've tested both with the patched Tomcat that was working on the old box, and Tomcat 5.5.16. I've tried with tcnative compiled against APR 1.2.2, r383698 (the one working on the old box) and SVN head (r392544). I've also tried starting Tomcat and then restarting httpd to reestablish the connector connections. No difference. If it's relevant, the httpd config is at: https://svn.apache.org/repos/asf/infrastructure/trunk/httpd-conf/brutus.apache.org/vhosts/ Any suggestions? I can upgrade or recompile Tomcat with extra debugging if that helps. The relevant code is AprEndpoint#run(): } else if (rv < 0) { int errn = -rv; /* Any non timeup or interrupted error is critical */ if ((errn != Status.TIMEUP) && (errn != Status.EINTR)) { if (errn > Status.APR_OS_START_USERERR) { errn -= Status.APR_OS_START_USERERR; } log.error(sm.getString("endpoint.poll.fail", "" + errn, Error.strerror(errn))); // Handle poll critical failure synchronized (this) { destroy(); init(); } continue; } } I've appended the full stdout from startup. Thanks, Jeff java version "1.5.0_04" Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_04-b05) BEA JRockit(R) (build R26.0.0-189-53463-1.5.0_04-20051122-2040-linux-ia32, ) Apr 8, 2006 3:28:10 PM org.apache.coyote.http11.Http11AprProtocol init INFO: Initializing Coyote HTTP/1.1 on http-18080 Apr 8, 2006 3:28:10 PM org.apache.coyote.ajp.AjpAprProtocol init INFO: Initializing Coyote AJP/1.3 on ajp-18009 Apr 8, 2006 3:28:10 PM org.apache.catalina.startup.Catalina load INFO: Initialization processed in 1306 ms Apr 8, 2006 3:28:11 PM org.apache.catalina.core.StandardService start INFO: Starting service Catalina Apr 8, 2006 3:28:11 PM org.apache.catalina.core.StandardEngine start INFO: Starting Servlet Engine: Apache Tomcat/5.5.16 Apr 8, 2006 3:28:11 PM org.apache.catalina.core.StandardHost start INFO: XML validation disabled Apr 8, 2006 3:28:12 PM org.apache.catalina.session.PersistentManagerBase start SEVERE: No Store configured, persistence disabled ... Apr 8, 2006 3:28:23 PM org.apache.coyote.ajp.AjpAprProtocol start INFO: Starting Coyote AJP/1.3 on ajp-18009 Apr 8, 2006 3:28:23 PM org.apache.catalina.storeconfig.StoreLoader load INFO: Find registry server-registry.xml at classpath resource Apr 8, 2006 3:28:23 PM org.apache.catalina.startup.Catalina start INFO: Server startup in 12689 ms 2006-04-08 15:28:37,554 INFO [jira.web.filters.AccessLogFilter] jefft http://brutus.apache.org/jira/secure/Dashboard.jspa 2482+10528 4900 2006-04-08 15:28:37,554 INFO [jira.web.filters.AccessLogFilter] jefft http://brutus.apache.org/jira/secure/Dashboard.jspa 2482+10528 4900 2006-04-08 15:28:38,888 INFO [jira.web.filters.AccessLogFilter] - http://brutus.apache.org/jira/secure/Dashboard.jspa 6476+18314 1046 2006-04-08 15:28:38,888 INFO [jira.web.filters.AccessLogFilter] - http://brutus.apache.org/jira/secure/Dashboard.jspa 6476+18314 1046 2006-04-08 15:28:38,944 INFO [jira.web.filters.AccessLogFilter] - http://brutus.apache.org/jira/secure/Dashboard.jspa 5476+18461 1060 2006-04-08 15:28:38,944 INFO [jira.web.filters.AccessLogFilter] - http://brutus.apache.org/jira/secure/Dashboard.jspa 5476+18461 1060 2006-04-08 15:28:47,164 INFO [jira.web.filters.AccessLogFilter] - http://brutus.apache.org/jira/secure/Dashboard.jspa 15766-5828 262 2006-04-08 15:28:47,164 INFO [jira.web.filters.AccessLogFilter] - http://brutus.apache.org/jira/secure/Dashboard.jspa 15766-5828 262 Apr 8, 2006 3:28:47 PM org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical poller failure (restarting poller): [3] No such process 2006-04-08 15:28:47,740 INFO [jira.web.filters.AccessLogFilter] - http://brutus.apache.org/jira/secure/Dashboard.jspa 9995-860 152 2006-04-08 15:28:47,740 INFO [jira.web.filters.AccessLogFilter] - http://brutus.apache.org/jira/secure/Dashboard.jspa 9995-860 152 Apr 8, 2006 3:28:47 PM org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical poller failure (restarting poller): [3] No such process Apr 8, 2006 3:28:54 PM org.apache.tomcat.util.
Re: "Critical poller failure" when using tcnative
On Sat, Apr 08, 2006 at 06:56:06PM +0200, Remy Maucherat wrote: > Jeff Turner wrote: > > We experienced this before on the old issues.apache.org box, but Remy > > said he fixed it (and it is fixed on that box): > > Yes, it was a specificity of the Itanium platform which got fixed. > Normally, I also fixed the problem where a poller could crash when it is > restarted. > > There's no debug you can add, I think (I had already improved the > logging, and it now seems complete enough). You can always post the > connector configuration and give details on the platform, though. Nothing unusual. Ubuntu 5.10, 1x2.8Ghz Xeon. Java is started with: -Djava.library.path=/usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs tcnative is compiled with only --with-apr: [EMAIL PROTECTED]:/usr/local/tomcat/apache-tomcat-5.5.16/bin/tomcat-native-1.1.2/jni/native$ cat config.nice #! /bin/sh # # Created by configure "./configure" \ "--with-apr=/usr/local/apr-install/1.2.2" \ "$@" I've appended what ldd reports. If you or another Tomcat developer has time to look into this, we can arrange access to the box. Just put a public key up somewhere. Thanks, Jeff ldd /usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs/*so* /usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs/libtcnative-1.so: linux-gate.so.1 => (0xe000) libapr-1.so.0 => /usr/local/apr-install/1.2.2/lib/libapr-1.so.0 (0xb7ef3000) librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7ee6000) libcrypt.so.1 => /lib/tls/i686/cmov/libcrypt.so.1 (0xb7eb9000) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7ea7000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7ea4000) libssl.so.0.9.7 => /usr/lib/i686/cmov/libssl.so.0.9.7 (0xb7e74000) libcrypto.so.0.9.7 => /usr/lib/i686/cmov/libcrypto.so.0.9.7 (0xb7d76000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7c48000) /lib/ld-linux.so.2 (0x8000) /usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs/libtcnative-1.so.0: linux-gate.so.1 => (0xe000) libapr-1.so.0 => /usr/local/apr-install/1.2.2/lib/libapr-1.so.0 (0xb7f9f000) librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7f92000) libcrypt.so.1 => /lib/tls/i686/cmov/libcrypt.so.1 (0xb7f65000) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7f53000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7f5) libssl.so.0.9.7 => /usr/lib/i686/cmov/libssl.so.0.9.7 (0xb7f2) libcrypto.so.0.9.7 => /usr/lib/i686/cmov/libcrypto.so.0.9.7 (0xb7e22000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7cf4000) /lib/ld-linux.so.2 (0x8000) /usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs/libtcnative-1.so.0.2.2: linux-gate.so.1 => (0xe000) libapr-1.so.0 => /usr/local/apr-install/1.2.2/lib/libapr-1.so.0 (0xb7f0a000) librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7efd000) libcrypt.so.1 => /lib/tls/i686/cmov/libcrypt.so.1 (0xb7ed) libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7ebe000) libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7ebb000) libssl.so.0.9.7 => /usr/lib/i686/cmov/libssl.so.0.9.7 (0xb7e8b000) libcrypto.so.0.9.7 => /usr/lib/i686/cmov/libcrypto.so.0.9.7 (0xb7d8d000) libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7c5f000) /lib/ld-linux.so.2 (0x8000) > Rémy > - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: "Critical poller failure" when using tcnative
On Sun, Apr 09, 2006 at 08:47:22AM +0200, Mladen Turk wrote: > Jeff Turner wrote: > > On Sat, Apr 08, 2006 at 06:56:06PM +0200, Remy Maucherat wrote: > >> Jeff Turner wrote: > >>> We experienced this before on the old issues.apache.org box, but Remy > >>> said he fixed it (and it is fixed on that box): > > > > Nothing unusual. Ubuntu 5.10, 1x2.8Ghz Xeon. Java is started with: > > > > -Djava.library.path=/usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs > > > > Since your log reports: > Critical poller failure (restarting poller): [3] No such process > > The error 3 is actually TCN_EINTR - APR_OS_START_USERERR, so > it looks like you actually *did not* patch the AprEndpoint, > or didn't copy the .jars. Right - I am running an unpatched 5.5.16 at the moment. When I switch to the patched (I assume - I didn't actually do the honours) 5.5.15, the error changes to: Apr 9, 2006 12:45:37 PM org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical poller failure (APR does not understand this error code), restarting poller > To be sure edit the: > > log.error(sm.getString("endpoint.poll.fail", "" + rv, > Error.strerror(errn))); > The error should be negative with 7 offset, thus -70003. > > You can try the current HEAD for tcnative. It has EINTR loop > within the poll call. >From Remy's comment I gather 5.5.16's logging reveals as much info as is available: Apr 9, 2006 12:54:10 PM org.apache.tomcat.util.net.AprEndpoint$Poller run SEVERE: Critical poller failure (restarting poller): [3] No such process Cheers, Jeff > Regards, > Mladen. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: "Critical poller failure" when using tcnative
On Sun, Apr 09, 2006 at 04:27:31PM +0200, Remy Maucherat wrote: > Jeff Turner wrote: > >> The error 3 is actually TCN_EINTR - APR_OS_START_USERERR, so > >> it looks like you actually *did not* patch the AprEndpoint, > >> or didn't copy the .jars. > > > > Right - I am running an unpatched 5.5.16 at the moment. When I switch to the > > patched (I assume - I didn't actually do the honours) 5.5.15, the error > > changes > > to: > > I just looked at the changelog, and the EINTR status code fix was post > 5.5.16 (it will be included in 5.5.17), so at least things are fairly > logical :) So you would need to either patch 5.5.16, or use tcnative HEAD. Thanks Rémy & Mladen, that worked. FTR, what I did was: - svn co http://svn.apache.org/repos/asf/tomcat/current/tc5.5.x - cd connectors - cp build.properties.default build.properties, and tweak base.path. - Ran 'ant download' - Ran 'ant dist' - cp dist/lib/*.jar $TOMCAT_HOME/server/lib (the last step was a bit unscientific but it worked). Cheers, Jeff > Rémy > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Memory leak? (issues.apache.org)
Hi, For the last few months, issues.apache.org/jira has been running out of memory about once a week. We've finally got it running in a profiler, and are seeing most of the memory (eg. 486 of 572Mb) used up by char[] buffers in BodyContentImpl. Here is a sample GC Root -> Object trace: char[16777218] cb of org.apache.jasper.runtime.BodyContentImpl [0] of org.apache.jasper.runtime.BodyContentImpl[7] outs of org.apache.jasper.runtime.PageContextImpl [86] of java.lang.Object[101] pool of org.apache.jasper.util.SimplePool pool of org.apache.jasper.runtime.JspFactoryImpl deflt of javax.servlet.jsp.JspFactory [57] of java.lang.Object[641] elementData of java.util.Vector classes of org.apache.catalina.loader.StandardClassLoader [Other] There seems to be a constantly increasing number of BodyContentImpl objects in the system: 1 May: 93 Objects (126Mb) 2 May: 107 Objects (263Mb) 3 May: 492 Objects (486MB) (the first two were taken directly after a full gc) It appears to be a case of this bug: http://issues.apache.org/bugzilla/show_bug.cgi?id=37793 Perhaps this bug affects the ASF JIRA in particular (and not most people) because people occasionally request huge (20-30Mb) pages. There are 23 BodyContentImpls between 33Mb and 10Mb in the last dump, and due to the pooling, these all stick around taking up memory. Could anyone comment on this issue? Remy seemed to think it was 'as intended', and the bug is marked WONTFIX. I'm happy to provide yourkit memory dumps or access to the server if necessary. We're currently running 5.5.16. Thanks! Jeff - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Memory leak? (issues.apache.org)
On Thu, May 04, 2006 at 11:10:20AM +0200, Remy Maucherat wrote: > Jeff Turner wrote: > > 1 May: 93 Objects (126Mb) > > 2 May: 107 Objects (263Mb) > > 3 May: 492 Objects (486MB) > > BodyContentImpls are pooled and reused since it makes JSP processing > significantly faster. If the application is evil, and uses body tags > with huge bodies (it seems to be the case here), then there could be a > problem. > > Large buffers may be discarded after usage, by setting the > "org.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER" system property > to "true". Unfortunately, performance will go down and GC activity will > go up. Thanks, I'll give that a try. --Jeff > If you would like to limit the amount of resident memory that is used, > you can disable pooling of buffer objects completely using the > "org.apache.jasper.runtime.JspFactoryImpl.USE_POOL" system property set > to "false". The performance will go down further due to object > allocations and GC activity, but you may have a few CPU cycles to spare. > > BTW, you can set system properties in the catalina.properties file > (although you may set them on the command line too). > > Rémy > > - > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Memory leak? (issues.apache.org)
On Fri, May 05, 2006 at 06:07:26PM +0200, Rainer Jung wrote: > ... and I definitely want to choose the size of one chunk not to small, > e.g. one of the will be able to grow until 64KB, so that usually you > will only need one of them. Only for jumbo pages we will start gluing > them together... > > So I'll dig my head into BodyContentImpl ... I'd be interested in trying whatever you come up with on issues.apache.org. Most pages there are normal-sized, but occasionally people request large (10Mb+) RSS/XML files. --Jeff - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: Memory leak? (issues.apache.org)
Hi, On Thu, May 04, 2006 at 11:10:20AM +0200, Remy Maucherat wrote: > Jeff Turner wrote: > > 1 May: 93 Objects (126Mb) > > 2 May: 107 Objects (263Mb) > > 3 May: 492 Objects (486MB) > > BodyContentImpls are pooled and reused since it makes JSP processing > significantly faster. If the application is evil, and uses body tags > with huge bodies (it seems to be the case here), then there could be a > problem. Just wondering - are all JSP bodies pooled, or only tag bodies? Eg. would this JSP's body be pooled: <%= for (int i=0; i<100; i++) { out.print("womble"); } %> Or only if it were wrapped in JSP tag. > Large buffers may be discarded after usage, by setting the > "org.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER" system property > to "true". Unfortunately, performance will go down and GC activity will > go up. It appears to have fixed the leak on issues.a.o. Cheers, Jeff - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]