On Thu, Jun 23, 2016 at 3:32 PM, Mark Thomas <ma...@apache.org> wrote:
> On 23/06/2016 20:28, Nate Clark wrote:
>>
>> If you want details about the issue I am happy to provide them but
>> just didn't wan to duplicate explaining it in a bug and in an email
>> thread.
>
> To be honest, it doesn't really matter which product it is filed under.
> It is more important to get it filed. Even more so since a tc-native
> release is planned for tomorrow. Just file it under tc-native and
> someone will take a look.
>
Hi Mark,

I tried to submit the bug but it seems that I am now unable to access
bz.apache.org. Since you made it seem like it was important for this
to be known about here is the info and patches.

When performing some bench marking I noticed that the SSL performance
of large request reads degraded heavily when performed after millions
of small requests. Basically the setup is in a multi-threaded
environment, about 200 threads, performing PUT requests using SSL with
a body of about 4KB and then using 20 threads performing PUT requests
with a body of 100MB. If the small requests are not performed the
speed of the large requests in MB/s is about 2x.

I tracked down the issue to ERR_clear_err() blocking on an internal
lock which protects a hash map of the error states. It seems that the
hash map was growing unbounded because the states in it were never
being cleared by a thread when it had completed SSL operations.

According to OpenSSL documents ERR_remove_thread_state() or
ERR_remove_state() for versions of OpenSSL less than 1.1.0 needs to be
invoked prior to a thread exiting. This is not done by threads in the
native code so the hash table keeps growing and getting larger and
larger and more expensive to maintain.

By adding a new native call which invoked ERR_remove_thread_state and
calling it from AprEndpoint in tomcat I was able to reduce the
contention on the lock and the performance improved.

Because of the thread pool I could not find a simple clean way to
invoke the cleanup before the thread dies so instead I added it to the
end of the socket processing.

Here are the patches I used against tomcat-native 1.1.34 and tomcat70:

>From dd4a8dcdfbb863f8e8ba5ed24070e60200421082 Mon Sep 17 00:00:00 2001
From: Nate Clark <n...@neworld.us>
Date: Thu, 23 Jun 2016 16:51:04 -0400
Subject: [PATCH] Add threadCleanup function to free memory footprint of a
 thread

OpenSSL allocates memory for an error queue for each thread which uses
the SSL functions. Prior to
version 1.1.0 the memory needed to released manually by either
invoking ERR_remove_state or
ERR_remove_thread_state. If this is not done the memory foot print
growes larger which slowes down
the use of the error queues. By adding the new function threadCleanup
threads can now release this
memroy prior to thread exit.
---
 native/src/ssl.c | 24 +++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/native/src/ssl.c b/native/src/ssl.c
index ded7b22..5875a68 100644
--- a/native/src/ssl.c
+++ b/native/src/ssl.c
@@ -272,6 +272,14 @@ TCN_IMPLEMENT_CALL(jstring, SSL,
versionString)(TCN_STDARGS)
     return AJP_TO_JSTRING(OPENSSL_VERSION_TEXT);
 }

+static void ssl_err_remove(void) {
+#if (OPENSSL_VERSION_NUMBER < 0x10000000L) || defined(OPENSSL_USE_DEPRECATED)
+    ERR_remove_state(0);
+#elif (OPENSSL_VERSION_NUMBER < 0x10100000L)
+    ERR_remove_thread_state(NULL);
+#endif
+}
+
 /*
  *  the various processing hooks
  */
@@ -310,11 +318,7 @@ static apr_status_t ssl_init_cleanup(void *data)
 #if OPENSSL_VERSION_NUMBER >= 0x00907001
     CRYPTO_cleanup_all_ex_data();
 #endif
-#if (OPENSSL_VERSION_NUMBER < 0x10100000L) || defined(OPENSSL_USE_DEPRECATED)
-    ERR_remove_state(0);
-#else
-    ERR_remove_thread_state(NULL);
-#endif
+    ssl_err_remove();

     /* Don't call ERR_free_strings here; ERR_load_*_strings only
      * actually load the error strings once per process due to static
@@ -1105,6 +1109,11 @@ TCN_IMPLEMENT_CALL(jboolean, SSL,
hasOp)(TCN_STDARGS, jint op)
     return op == (op & supported_ssl_opts) ? JNI_TRUE : JNI_FALSE;
 }

+TCN_IMPLEMENT_CALL(void, SSL, threadCleanup)(TCN_STDARGS) {
+    UNREFERENCED(o);
+    ssl_err_remove();
+}
+
 #else
 /* OpenSSL is not supported.
  * Create empty stubs.
@@ -1231,4 +1240,9 @@ TCN_IMPLEMENT_CALL(jboolean, SSL,
hasOp)(TCN_STDARGS, jint op)
     UNREFERENCED(op);
     return JNI_FALSE;
 }
+
+TCN_IMPLEMENT_CALL(void, SSL, threadCleanup)(TCN_STDARGS) {
+    UNREFERENCED_STDARGS;
+}
+
 #endif
-- 
2.5.5

>From c5c1756153279e87f11594c0e053d5301ed121e7 Mon Sep 17 00:00:00 2001
From: Nate Clark <n...@neworld.us>
Date: Thu, 23 Jun 2016 16:57:18 -0400
Subject: [PATCH] Invoke SSL.threadCleanup when done with socket processing

When a thread is done processing an ssl socket it should invoke
SSL.threadCleanup() to free memory
associated with that thread in the SSL library. If this is not done
the memory foot print bloats and
each subsequent invocation will become slightly slower as there is
more thread contention.
---
 java/org/apache/tomcat/jni/SSL.java              |  6 ++++++
 java/org/apache/tomcat/util/net/AprEndpoint.java | 23 +++++++++++++++++++----
 2 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/java/org/apache/tomcat/jni/SSL.java
b/java/org/apache/tomcat/jni/SSL.java
index b726a54..fb9fc16 100644
--- a/java/org/apache/tomcat/jni/SSL.java
+++ b/java/org/apache/tomcat/jni/SSL.java
@@ -388,4 +388,10 @@ public final class SSL {
      * @return true if all SSL_OP_* are supported by OpenSSL library.
      */
     public static native boolean hasOp(int op);
+
+    /**
+     * Cleanup thread specific data structures. This needs to be
called after a thread
+     * has completed ssl operations.
+     */
+    public static native void threadCleanup();
 }
diff --git a/java/org/apache/tomcat/util/net/AprEndpoint.java
b/java/org/apache/tomcat/util/net/AprEndpoint.java
index 74d813f..be208c4 100644
--- a/java/org/apache/tomcat/util/net/AprEndpoint.java
+++ b/java/org/apache/tomcat/util/net/AprEndpoint.java
@@ -2411,6 +2411,21 @@ public class AprEndpoint extends AbstractEndpoint<Long> {
     }


+    private abstract class BaseSocketProcessor implements Runnable {
+        @Override
+        public void run() {
+            try {
+                realRun();
+            } finally {
+                if (isSSLEnabled()) {
+                    SSL.threadCleanup();
+                }
+            }
+        }
+
+        protected abstract void realRun();
+    }
+
     // --------------------------------- SocketWithOptionsProcessor Inner Class

     /**
@@ -2420,7 +2435,7 @@ public class AprEndpoint extends AbstractEndpoint<Long> {
      *
      * This is called after an accept().
      */
-    protected class SocketWithOptionsProcessor implements Runnable {
+    protected class SocketWithOptionsProcessor extends BaseSocketProcessor {

         protected SocketWrapper<Long> socket = null;

@@ -2430,7 +2445,7 @@ public class AprEndpoint extends AbstractEndpoint<Long> {
         }

         @Override
-        public void run() {
+        public void realRun() {

             synchronized (socket) {
                 if (!deferAccept) {
@@ -2476,7 +2491,7 @@ public class AprEndpoint extends AbstractEndpoint<Long> {
      * This class is the equivalent of the Worker, but will simply use in an
      * external Executor thread pool.
      */
-    protected class SocketProcessor implements Runnable {
+    protected class SocketProcessor extends BaseSocketProcessor {

         private final SocketWrapper<Long> socket;
         private final SocketStatus status;
@@ -2492,7 +2507,7 @@ public class AprEndpoint extends AbstractEndpoint<Long> {
         }

         @Override
-        public void run() {
+        public void realRun() {

             // Upgraded connections need to allow multiple threads to
access the
             // connection at the same time to enable blocking IO to
be used when
-- 
2.5.5

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to