On Thu, Jun 23, 2016 at 3:32 PM, Mark Thomas <ma...@apache.org> wrote: > On 23/06/2016 20:28, Nate Clark wrote: >> >> If you want details about the issue I am happy to provide them but >> just didn't wan to duplicate explaining it in a bug and in an email >> thread. > > To be honest, it doesn't really matter which product it is filed under. > It is more important to get it filed. Even more so since a tc-native > release is planned for tomorrow. Just file it under tc-native and > someone will take a look. > Hi Mark,
I tried to submit the bug but it seems that I am now unable to access bz.apache.org. Since you made it seem like it was important for this to be known about here is the info and patches. When performing some bench marking I noticed that the SSL performance of large request reads degraded heavily when performed after millions of small requests. Basically the setup is in a multi-threaded environment, about 200 threads, performing PUT requests using SSL with a body of about 4KB and then using 20 threads performing PUT requests with a body of 100MB. If the small requests are not performed the speed of the large requests in MB/s is about 2x. I tracked down the issue to ERR_clear_err() blocking on an internal lock which protects a hash map of the error states. It seems that the hash map was growing unbounded because the states in it were never being cleared by a thread when it had completed SSL operations. According to OpenSSL documents ERR_remove_thread_state() or ERR_remove_state() for versions of OpenSSL less than 1.1.0 needs to be invoked prior to a thread exiting. This is not done by threads in the native code so the hash table keeps growing and getting larger and larger and more expensive to maintain. By adding a new native call which invoked ERR_remove_thread_state and calling it from AprEndpoint in tomcat I was able to reduce the contention on the lock and the performance improved. Because of the thread pool I could not find a simple clean way to invoke the cleanup before the thread dies so instead I added it to the end of the socket processing. Here are the patches I used against tomcat-native 1.1.34 and tomcat70: >From dd4a8dcdfbb863f8e8ba5ed24070e60200421082 Mon Sep 17 00:00:00 2001 From: Nate Clark <n...@neworld.us> Date: Thu, 23 Jun 2016 16:51:04 -0400 Subject: [PATCH] Add threadCleanup function to free memory footprint of a thread OpenSSL allocates memory for an error queue for each thread which uses the SSL functions. Prior to version 1.1.0 the memory needed to released manually by either invoking ERR_remove_state or ERR_remove_thread_state. If this is not done the memory foot print growes larger which slowes down the use of the error queues. By adding the new function threadCleanup threads can now release this memroy prior to thread exit. --- native/src/ssl.c | 24 +++++++++++++++++++----- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/native/src/ssl.c b/native/src/ssl.c index ded7b22..5875a68 100644 --- a/native/src/ssl.c +++ b/native/src/ssl.c @@ -272,6 +272,14 @@ TCN_IMPLEMENT_CALL(jstring, SSL, versionString)(TCN_STDARGS) return AJP_TO_JSTRING(OPENSSL_VERSION_TEXT); } +static void ssl_err_remove(void) { +#if (OPENSSL_VERSION_NUMBER < 0x10000000L) || defined(OPENSSL_USE_DEPRECATED) + ERR_remove_state(0); +#elif (OPENSSL_VERSION_NUMBER < 0x10100000L) + ERR_remove_thread_state(NULL); +#endif +} + /* * the various processing hooks */ @@ -310,11 +318,7 @@ static apr_status_t ssl_init_cleanup(void *data) #if OPENSSL_VERSION_NUMBER >= 0x00907001 CRYPTO_cleanup_all_ex_data(); #endif -#if (OPENSSL_VERSION_NUMBER < 0x10100000L) || defined(OPENSSL_USE_DEPRECATED) - ERR_remove_state(0); -#else - ERR_remove_thread_state(NULL); -#endif + ssl_err_remove(); /* Don't call ERR_free_strings here; ERR_load_*_strings only * actually load the error strings once per process due to static @@ -1105,6 +1109,11 @@ TCN_IMPLEMENT_CALL(jboolean, SSL, hasOp)(TCN_STDARGS, jint op) return op == (op & supported_ssl_opts) ? JNI_TRUE : JNI_FALSE; } +TCN_IMPLEMENT_CALL(void, SSL, threadCleanup)(TCN_STDARGS) { + UNREFERENCED(o); + ssl_err_remove(); +} + #else /* OpenSSL is not supported. * Create empty stubs. @@ -1231,4 +1240,9 @@ TCN_IMPLEMENT_CALL(jboolean, SSL, hasOp)(TCN_STDARGS, jint op) UNREFERENCED(op); return JNI_FALSE; } + +TCN_IMPLEMENT_CALL(void, SSL, threadCleanup)(TCN_STDARGS) { + UNREFERENCED_STDARGS; +} + #endif -- 2.5.5 >From c5c1756153279e87f11594c0e053d5301ed121e7 Mon Sep 17 00:00:00 2001 From: Nate Clark <n...@neworld.us> Date: Thu, 23 Jun 2016 16:57:18 -0400 Subject: [PATCH] Invoke SSL.threadCleanup when done with socket processing When a thread is done processing an ssl socket it should invoke SSL.threadCleanup() to free memory associated with that thread in the SSL library. If this is not done the memory foot print bloats and each subsequent invocation will become slightly slower as there is more thread contention. --- java/org/apache/tomcat/jni/SSL.java | 6 ++++++ java/org/apache/tomcat/util/net/AprEndpoint.java | 23 +++++++++++++++++++---- 2 files changed, 25 insertions(+), 4 deletions(-) diff --git a/java/org/apache/tomcat/jni/SSL.java b/java/org/apache/tomcat/jni/SSL.java index b726a54..fb9fc16 100644 --- a/java/org/apache/tomcat/jni/SSL.java +++ b/java/org/apache/tomcat/jni/SSL.java @@ -388,4 +388,10 @@ public final class SSL { * @return true if all SSL_OP_* are supported by OpenSSL library. */ public static native boolean hasOp(int op); + + /** + * Cleanup thread specific data structures. This needs to be called after a thread + * has completed ssl operations. + */ + public static native void threadCleanup(); } diff --git a/java/org/apache/tomcat/util/net/AprEndpoint.java b/java/org/apache/tomcat/util/net/AprEndpoint.java index 74d813f..be208c4 100644 --- a/java/org/apache/tomcat/util/net/AprEndpoint.java +++ b/java/org/apache/tomcat/util/net/AprEndpoint.java @@ -2411,6 +2411,21 @@ public class AprEndpoint extends AbstractEndpoint<Long> { } + private abstract class BaseSocketProcessor implements Runnable { + @Override + public void run() { + try { + realRun(); + } finally { + if (isSSLEnabled()) { + SSL.threadCleanup(); + } + } + } + + protected abstract void realRun(); + } + // --------------------------------- SocketWithOptionsProcessor Inner Class /** @@ -2420,7 +2435,7 @@ public class AprEndpoint extends AbstractEndpoint<Long> { * * This is called after an accept(). */ - protected class SocketWithOptionsProcessor implements Runnable { + protected class SocketWithOptionsProcessor extends BaseSocketProcessor { protected SocketWrapper<Long> socket = null; @@ -2430,7 +2445,7 @@ public class AprEndpoint extends AbstractEndpoint<Long> { } @Override - public void run() { + public void realRun() { synchronized (socket) { if (!deferAccept) { @@ -2476,7 +2491,7 @@ public class AprEndpoint extends AbstractEndpoint<Long> { * This class is the equivalent of the Worker, but will simply use in an * external Executor thread pool. */ - protected class SocketProcessor implements Runnable { + protected class SocketProcessor extends BaseSocketProcessor { private final SocketWrapper<Long> socket; private final SocketStatus status; @@ -2492,7 +2507,7 @@ public class AprEndpoint extends AbstractEndpoint<Long> { } @Override - public void run() { + public void realRun() { // Upgraded connections need to allow multiple threads to access the // connection at the same time to enable blocking IO to be used when -- 2.5.5 --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org For additional commands, e-mail: dev-h...@tomcat.apache.org