On 26/05/2015 09:31, Mark Thomas wrote: > On 25/05/2015 06:26, Rainer Jung wrote: >> When testing tcnative I observed a strange 1000 seconds timeout. The >> occurrence of the timeout was likely due to using an old OpenSSL version >> (0.9.8m) in combination with the unit tests that want a newer one, but I >> found the long duration of 1000 seconds strange. >> >> It happens in TestSsl.testSimpleSsl lines 66 and again in line 86 (TC8): >> >> 25-May-2015 00:30:27.121 SEVERE [main] >> org.apache.coyote.AbstractProtocol.init Failed to initialize end point >> associated with ProtocolHandler ["http-apr-127.0.0.1-auto-1-40424"] >> java.lang.Exception: Unable to create SSLContext. Check that SSLEngine >> is enabled in the AprLifecycleListener, the AprLifecycleListener has >> initialised correctly and that a valid SSLProtocol has been specified >> at >> org.apache.tomcat.util.net.AprEndpoint.bind(AprEndpoint.java:538) >> at >> org.apache.tomcat.util.net.AbstractEndpoint.init(AbstractEndpoint.java:737) >> >> >> As a consequence the handshakes hangs: >> >> >> Testcase: testSimpleSsl took 1,006.091 sec >> Caused an ERROR >> Read timed out >> java.net.SocketTimeoutException: Read timed out >> at java.net.SocketInputStream.socketRead0(Native Method) >> ... >> at >> sun.net.www.protocol.https.HttpsClient.afterConnect(HttpsClient.java:559) >> ... >> org.apache.tomcat.util.net.TestSsl.testSimpleSsl(TestSsl.java:67) >> >> >> I did a little svn archaelogy and I think the problem goes back to >> revision r1433976 in AprEndpoint where the following code block >> >> >> - // Check timeouts (much less frequently than polling) >> - if (maintainTime > 1000000L && running) { >> - maintainTime = 0; >> - if (socketProperties.getSoTimeout() > 0) { >> - doTimeout(connectionPollset); >> - } >> - } >> >> >> changed into >> >> >> + // Process socket timeouts >> + if (getSoTimeout() > 0 && maintain++ > 1000 && >> running) { >> + // This works and uses only one timeout >> mechanism for everything, but the >> + // non event poller might be a bit faster by >> using the old maintain. >> + maintain = 0; >> + maintain(); >> } >> >> The commit contains many more changes, but I think the problem can be >> localized in this block. In APR maintainTime is in microseconds (it is >> incremented with pollTime which is in microseconds), so 1000000L is one >> second. The new block counts maintain and I think that is incremented >> once per second, so we switch from a 1000000 microsecond = 1 second >> condition to a 1000 second condition. >> >> I'm not sure whether it is better to simply check for maintain++ > 1 (or >>> = 1) or switch back to summing up pollTime.Needs more investigation. > > Looking at it now. I think there are some simplifications that can be > made but I need to look into the more closely.
OK. I think I am done now (but I'm waiting for the CI to confirm there are no new timeout issues in async etc. that my local testing hasn't spotted). Mark --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org For additional commands, e-mail: dev-h...@tomcat.apache.org