"Critical poller failure" when using tcnative

2006-04-08 Thread Jeff Turner
Hi,

I'm trying to set the new issues.apache.org box up with Tomcat 5.5.16 +
tomcat-native-1.1.2 + httpd 2.2.0.  When requests come in, I'm seeing
these errors for almost every request in the logs:

Apr 8, 2006 3:28:47 PM org.apache.tomcat.util.net.AprEndpoint$Poller run
SEVERE: Critical poller failure (restarting poller): [3] No such process

We experienced this before on the old issues.apache.org box, but Remy
said he fixed it (and it is fixed on that box):

http://issues.apache.org/jira/browse/INFRA-716?page=all

I've tested both with the patched Tomcat that was working on the old box,
and Tomcat 5.5.16. I've tried with tcnative compiled against APR 1.2.2,
r383698 (the one working on the old box) and SVN head (r392544).

I've also tried starting Tomcat and then restarting httpd to reestablish
the connector connections. No difference.

If it's relevant, the httpd config is at:

https://svn.apache.org/repos/asf/infrastructure/trunk/httpd-conf/brutus.apache.org/vhosts/


Any suggestions? I can upgrade or recompile Tomcat with extra debugging if that
helps. The relevant code is AprEndpoint#run():

} else if (rv < 0) {
int errn = -rv;
/* Any non timeup or interrupted error is critical */
if ((errn != Status.TIMEUP) && (errn != Status.EINTR)) {
if (errn >  Status.APR_OS_START_USERERR) {
errn -=  Status.APR_OS_START_USERERR;
}
log.error(sm.getString("endpoint.poll.fail", "" + 
errn, Error.strerror(errn)));
// Handle poll critical failure
synchronized (this) {
destroy();
init();
}
continue;
}
}


I've appended the full stdout from startup.


Thanks,
Jeff

java version "1.5.0_04"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_04-b05)
BEA JRockit(R) (build R26.0.0-189-53463-1.5.0_04-20051122-2040-linux-ia32, )

Apr 8, 2006 3:28:10 PM org.apache.coyote.http11.Http11AprProtocol init
INFO: Initializing Coyote HTTP/1.1 on http-18080
Apr 8, 2006 3:28:10 PM org.apache.coyote.ajp.AjpAprProtocol init
INFO: Initializing Coyote AJP/1.3 on ajp-18009
Apr 8, 2006 3:28:10 PM org.apache.catalina.startup.Catalina load
INFO: Initialization processed in 1306 ms
Apr 8, 2006 3:28:11 PM org.apache.catalina.core.StandardService start
INFO: Starting service Catalina
Apr 8, 2006 3:28:11 PM org.apache.catalina.core.StandardEngine start
INFO: Starting Servlet Engine: Apache Tomcat/5.5.16
Apr 8, 2006 3:28:11 PM org.apache.catalina.core.StandardHost start
INFO: XML validation disabled
Apr 8, 2006 3:28:12 PM org.apache.catalina.session.PersistentManagerBase start
SEVERE: No Store configured, persistence disabled
...
Apr 8, 2006 3:28:23 PM org.apache.coyote.ajp.AjpAprProtocol start
INFO: Starting Coyote AJP/1.3 on ajp-18009
Apr 8, 2006 3:28:23 PM org.apache.catalina.storeconfig.StoreLoader load
INFO: Find registry server-registry.xml at classpath resource
Apr 8, 2006 3:28:23 PM org.apache.catalina.startup.Catalina start
INFO: Server startup in 12689 ms
2006-04-08 15:28:37,554 INFO [jira.web.filters.AccessLogFilter] jefft 
http://brutus.apache.org/jira/secure/Dashboard.jspa 2482+10528 4900
2006-04-08 15:28:37,554 INFO [jira.web.filters.AccessLogFilter] jefft 
http://brutus.apache.org/jira/secure/Dashboard.jspa 2482+10528 4900
2006-04-08 15:28:38,888 INFO [jira.web.filters.AccessLogFilter] - 
http://brutus.apache.org/jira/secure/Dashboard.jspa 6476+18314 1046
2006-04-08 15:28:38,888 INFO [jira.web.filters.AccessLogFilter] - 
http://brutus.apache.org/jira/secure/Dashboard.jspa 6476+18314 1046
2006-04-08 15:28:38,944 INFO [jira.web.filters.AccessLogFilter] - 
http://brutus.apache.org/jira/secure/Dashboard.jspa 5476+18461 1060
2006-04-08 15:28:38,944 INFO [jira.web.filters.AccessLogFilter] - 
http://brutus.apache.org/jira/secure/Dashboard.jspa 5476+18461 1060
2006-04-08 15:28:47,164 INFO [jira.web.filters.AccessLogFilter] - 
http://brutus.apache.org/jira/secure/Dashboard.jspa 15766-5828 262
2006-04-08 15:28:47,164 INFO [jira.web.filters.AccessLogFilter] - 
http://brutus.apache.org/jira/secure/Dashboard.jspa 15766-5828 262
Apr 8, 2006 3:28:47 PM org.apache.tomcat.util.net.AprEndpoint$Poller run
SEVERE: Critical poller failure (restarting poller): [3] No such process
2006-04-08 15:28:47,740 INFO [jira.web.filters.AccessLogFilter] - 
http://brutus.apache.org/jira/secure/Dashboard.jspa 9995-860 152
2006-04-08 15:28:47,740 INFO [jira.web.filters.AccessLogFilter] - 
http://brutus.apache.org/jira/secure/Dashboard.jspa 9995-860 152
Apr 8, 2006 3:28:47 PM org.apache.tomcat.util.net.AprEndpoint$Poller run
SEVERE: Critical poller failure (restarting poller): [3] No such process
Apr 8, 2006 3:28:54 PM org.apache.tomcat.util.

Re: "Critical poller failure" when using tcnative

2006-04-08 Thread Jeff Turner
On Sat, Apr 08, 2006 at 06:56:06PM +0200, Remy Maucherat wrote:
> Jeff Turner wrote:
> > We experienced this before on the old issues.apache.org box, but Remy
> > said he fixed it (and it is fixed on that box):
> 
> Yes, it was a specificity of the Itanium platform which got fixed. 
> Normally, I also fixed the problem where a poller could crash when it is 
> restarted.
> 
> There's no debug you can add, I think (I had already improved the 
> logging, and it now seems complete enough). You can always post the 
> connector configuration and give details on the platform, though.

Nothing unusual. Ubuntu 5.10, 1x2.8Ghz Xeon. Java is started with:

-Djava.library.path=/usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs

tcnative is compiled with only --with-apr:

[EMAIL 
PROTECTED]:/usr/local/tomcat/apache-tomcat-5.5.16/bin/tomcat-native-1.1.2/jni/native$
 cat config.nice
#! /bin/sh
#
# Created by configure

"./configure" \
"--with-apr=/usr/local/apr-install/1.2.2" \
"$@"

I've appended what ldd reports.

If you or another Tomcat developer has time to look into this, we can arrange
access to the box. Just put a public key up somewhere.


Thanks,
Jeff


ldd /usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs/*so*
/usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs/libtcnative-1.so:
linux-gate.so.1 =>  (0xe000)
libapr-1.so.0 => /usr/local/apr-install/1.2.2/lib/libapr-1.so.0 
(0xb7ef3000)
librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7ee6000)
libcrypt.so.1 => /lib/tls/i686/cmov/libcrypt.so.1 (0xb7eb9000)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7ea7000)
libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7ea4000)
libssl.so.0.9.7 => /usr/lib/i686/cmov/libssl.so.0.9.7 (0xb7e74000)
libcrypto.so.0.9.7 => /usr/lib/i686/cmov/libcrypto.so.0.9.7 (0xb7d76000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7c48000)
/lib/ld-linux.so.2 (0x8000)
/usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs/libtcnative-1.so.0:
linux-gate.so.1 =>  (0xe000)
libapr-1.so.0 => /usr/local/apr-install/1.2.2/lib/libapr-1.so.0 
(0xb7f9f000)
librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7f92000)
libcrypt.so.1 => /lib/tls/i686/cmov/libcrypt.so.1 (0xb7f65000)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7f53000)
libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7f5)
libssl.so.0.9.7 => /usr/lib/i686/cmov/libssl.so.0.9.7 (0xb7f2)
libcrypto.so.0.9.7 => /usr/lib/i686/cmov/libcrypto.so.0.9.7 (0xb7e22000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7cf4000)
/lib/ld-linux.so.2 (0x8000)
/usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs/libtcnative-1.so.0.2.2:
linux-gate.so.1 =>  (0xe000)
libapr-1.so.0 => /usr/local/apr-install/1.2.2/lib/libapr-1.so.0 
(0xb7f0a000)
librt.so.1 => /lib/tls/i686/cmov/librt.so.1 (0xb7efd000)
libcrypt.so.1 => /lib/tls/i686/cmov/libcrypt.so.1 (0xb7ed)
libpthread.so.0 => /lib/tls/i686/cmov/libpthread.so.0 (0xb7ebe000)
libdl.so.2 => /lib/tls/i686/cmov/libdl.so.2 (0xb7ebb000)
libssl.so.0.9.7 => /usr/lib/i686/cmov/libssl.so.0.9.7 (0xb7e8b000)
libcrypto.so.0.9.7 => /usr/lib/i686/cmov/libcrypto.so.0.9.7 (0xb7d8d000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0xb7c5f000)
/lib/ld-linux.so.2 (0x8000)
 
> Rémy
> 

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: "Critical poller failure" when using tcnative

2006-04-09 Thread Jeff Turner
On Sun, Apr 09, 2006 at 08:47:22AM +0200, Mladen Turk wrote:
> Jeff Turner wrote:
> > On Sat, Apr 08, 2006 at 06:56:06PM +0200, Remy Maucherat wrote:
> >> Jeff Turner wrote:
> >>> We experienced this before on the old issues.apache.org box, but Remy
> >>> said he fixed it (and it is fixed on that box):
> > 
> > Nothing unusual. Ubuntu 5.10, 1x2.8Ghz Xeon. Java is started with:
> > 
> > -Djava.library.path=/usr/local/tomcat/tomcat/bin/tomcat-native-1.1.2/jni/native/.libs
> >
> 
> Since your log reports:
> Critical poller failure (restarting poller): [3] No such process
> 
> The error 3 is actually TCN_EINTR - APR_OS_START_USERERR, so
> it looks like you actually *did not* patch the AprEndpoint,
> or didn't copy the .jars.

Right - I am running an unpatched 5.5.16 at the moment. When I switch to the
patched (I assume - I didn't actually do the honours) 5.5.15, the error changes
to:

Apr 9, 2006 12:45:37 PM org.apache.tomcat.util.net.AprEndpoint$Poller run
SEVERE: Critical poller failure (APR does not understand this error code), 
restarting poller

> To be sure edit the:
> 
> log.error(sm.getString("endpoint.poll.fail", "" + rv, 
> Error.strerror(errn)));
> The error should be negative with 7 offset, thus -70003.
> 
> You can try the current HEAD for tcnative. It has EINTR loop
> within the poll call.

>From Remy's comment I gather 5.5.16's logging reveals as much info as is
available:

Apr 9, 2006 12:54:10 PM org.apache.tomcat.util.net.AprEndpoint$Poller run
SEVERE: Critical poller failure (restarting poller): [3] No such process


Cheers,
Jeff

> Regards,
> Mladen.

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: "Critical poller failure" when using tcnative

2006-04-10 Thread Jeff Turner
On Sun, Apr 09, 2006 at 04:27:31PM +0200, Remy Maucherat wrote:
> Jeff Turner wrote:
> >> The error 3 is actually TCN_EINTR - APR_OS_START_USERERR, so
> >> it looks like you actually *did not* patch the AprEndpoint,
> >> or didn't copy the .jars.
> > 
> > Right - I am running an unpatched 5.5.16 at the moment. When I switch to the
> > patched (I assume - I didn't actually do the honours) 5.5.15, the error 
> > changes
> > to:
> 
> I just looked at the changelog, and the EINTR status code fix was post 
> 5.5.16 (it will be included in 5.5.17), so at least things are fairly 
> logical :) So you would need to either patch 5.5.16, or use tcnative HEAD.

Thanks Rémy & Mladen, that worked.

FTR, what I did was:
 - svn co http://svn.apache.org/repos/asf/tomcat/current/tc5.5.x
 - cd connectors
 - cp build.properties.default build.properties, and tweak base.path.
 - Ran 'ant download'
 - Ran 'ant dist'
 - cp dist/lib/*.jar $TOMCAT_HOME/server/lib

(the last step was a bit unscientific but it worked).


Cheers,
Jeff

> Rémy
> 
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Memory leak? (issues.apache.org)

2006-05-04 Thread Jeff Turner
Hi,

For the last few months, issues.apache.org/jira has been running out of
memory about once a week. We've finally got it running in a profiler, and
are seeing most of the memory (eg. 486 of 572Mb) used up by char[]
buffers in BodyContentImpl. Here is a sample GC Root -> Object trace:

 char[16777218]
 cb of  org.apache.jasper.runtime.BodyContentImpl
 [0] of  org.apache.jasper.runtime.BodyContentImpl[7]
 outs of  org.apache.jasper.runtime.PageContextImpl
 [86] of  java.lang.Object[101]
 pool of  org.apache.jasper.util.SimplePool
 pool of  org.apache.jasper.runtime.JspFactoryImpl
 deflt of  javax.servlet.jsp.JspFactory
 [57] of  java.lang.Object[641]
 elementData of  java.util.Vector
 classes of  org.apache.catalina.loader.StandardClassLoader [Other]


There seems to be a constantly increasing number of BodyContentImpl
objects in the system:

1 May:  93 Objects (126Mb)
2 May:  107 Objects (263Mb)
3 May:  492 Objects (486MB)

(the first two were taken directly after a full gc)


It appears to be a case of this bug:

http://issues.apache.org/bugzilla/show_bug.cgi?id=37793

Perhaps this bug affects the ASF JIRA in particular (and not most people)
because people occasionally request huge (20-30Mb) pages. There are 23
BodyContentImpls between 33Mb and 10Mb in the last dump, and due to the
pooling, these all stick around taking up memory.

Could anyone comment on this issue? Remy seemed to think it was 'as
intended', and the bug is marked WONTFIX. I'm happy to provide yourkit
memory dumps or access to the server if necessary. We're currently
running 5.5.16.


Thanks!

Jeff

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Memory leak? (issues.apache.org)

2006-05-05 Thread Jeff Turner
On Thu, May 04, 2006 at 11:10:20AM +0200, Remy Maucherat wrote:
> Jeff Turner wrote:
> > 1 May:  93 Objects (126Mb)
> > 2 May:  107 Objects (263Mb)
> > 3 May:  492 Objects (486MB)
> 
> BodyContentImpls are pooled and reused since it makes JSP processing 
> significantly faster. If the application is evil, and uses body tags 
> with huge bodies (it seems to be the case here), then there could be a 
> problem.
> 
> Large buffers may be discarded after usage, by setting the 
> "org.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER" system property 
> to "true". Unfortunately, performance will go down and GC activity will 
> go up.

Thanks, I'll give that a try.


--Jeff

> If you would like to limit the amount of resident memory that is used, 
> you can disable pooling of buffer objects completely using the 
> "org.apache.jasper.runtime.JspFactoryImpl.USE_POOL" system property set 
> to "false". The performance will go down further due to object 
> allocations and GC activity, but you may have a few CPU cycles to spare.
> 
> BTW, you can set system properties in the catalina.properties file 
> (although you may set them on the command line too).
> 
> Rémy
> 
> -
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Memory leak? (issues.apache.org)

2006-05-05 Thread Jeff Turner
On Fri, May 05, 2006 at 06:07:26PM +0200, Rainer Jung wrote:
> ... and I definitely want to choose the size of one chunk not to small,
> e.g. one of the will be able to grow until 64KB, so that usually you 
> will only need one of them. Only for jumbo pages we will start gluing 
> them together...
> 
> So I'll dig my head into BodyContentImpl ...

I'd be interested in trying whatever you come up with on
issues.apache.org.  Most pages there are normal-sized, but occasionally
people request large (10Mb+) RSS/XML files.


--Jeff

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: Memory leak? (issues.apache.org)

2006-05-10 Thread Jeff Turner
Hi,

On Thu, May 04, 2006 at 11:10:20AM +0200, Remy Maucherat wrote:
> Jeff Turner wrote:
> > 1 May:  93 Objects (126Mb)
> > 2 May:  107 Objects (263Mb)
> > 3 May:  492 Objects (486MB)
> 
> BodyContentImpls are pooled and reused since it makes JSP processing 
> significantly faster. If the application is evil, and uses body tags 
> with huge bodies (it seems to be the case here), then there could be a 
> problem.

Just wondering - are all JSP bodies pooled, or only tag bodies? Eg.
would this JSP's body be pooled:

<%= for (int i=0; i<100; i++) { out.print("womble"); } %>

Or only if it were wrapped in JSP tag.

> Large buffers may be discarded after usage, by setting the 
> "org.apache.jasper.runtime.BodyContentImpl.LIMIT_BUFFER" system property 
> to "true". Unfortunately, performance will go down and GC activity will 
> go up.

It appears to have fixed the leak on issues.a.o.


Cheers,
Jeff

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]