Bug that spans tomcat and tomcat-native

2016-06-23 Thread Nate Clark
Hello,

I encountered a performance issue when using the APR protocol. I
tracked it down to what is essentially a missing piece of the
integration with OpenSSL. I was able to fix the performance issue, but
it required a change to both tomcat-native and tomcat. I was going to
file a bug with my patches, but since tomcat-native and tomcat are
separate projects I was wondering about the best way to handle that.
File an issue in tomcat-native and one in tomcat referencing the other
bug? Or just use this mailing list to discuss the issue?

If you want details about the issue I am happy to provide them but
just didn't want to duplicate explaining it in a bug and in an email
thread.

Thanks,
-nate




Re: Bug that spans tomcat and tomcat-native

2016-06-23 Thread Nate Clark
On Thu, Jun 23, 2016 at 3:32 PM, Mark Thomas  wrote:
> On 23/06/2016 20:28, Nate Clark wrote:
>>
>> If you want details about the issue I am happy to provide them but
>> just didn't want to duplicate explaining it in a bug and in an email
>> thread.
>
> To be honest, it doesn't really matter which product it is filed under.
> It is more important to get it filed. Even more so since a tc-native
> release is planned for tomorrow. Just file it under tc-native and
> someone will take a look.
>
Hi Mark,

I tried to submit the bug but it seems that I am now unable to access
bz.apache.org. Since you made it seem important for this to be known
about, here is the info and the patches.

While doing some benchmarking I noticed that the SSL performance of
large request reads degraded heavily when they were performed after
millions of small requests. The setup is a multi-threaded environment:
about 200 threads perform PUT requests over SSL with a body of about
4KB, and then 20 threads perform PUT requests with a body of 100MB. If
the small requests are not performed first, the speed of the large
requests in MB/s is about 2x higher.

I tracked the issue down to ERR_clear_error() blocking on an internal
lock which protects a hash map of the error states. It seems that the
hash map was growing unbounded because the states in it were never
cleared by threads once they had completed their SSL operations.
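For context, that per-thread error state is created and touched because the
error queue is cleared before each SSL call. A simplified sketch of the kind
of call path involved (not the actual tc-native code, just the OpenSSL call
pattern):

#include <openssl/ssl.h>
#include <openssl/err.h>

static int read_some(SSL *ssl, char *buf, int len)
{
    int ret;

    ERR_clear_error();              /* creates/clears this thread's error list */
    ret = SSL_read(ssl, buf, len);
    if (ret <= 0) {
        int reason = SSL_get_error(ssl, ret);   /* e.g. SSL_ERROR_WANT_READ */
        (void)reason;
    }
    return ret;
}

Every thread that ever goes through a path like this ends up with an entry in
that internal hash map until it is explicitly removed.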

According to the OpenSSL documentation, on versions of OpenSSL before
1.1.0, ERR_remove_thread_state() (or ERR_remove_state() on older
releases) needs to be invoked before a thread exits. This is not done
by the threads in the native code, so the hash table keeps growing
larger and larger and becomes more and more expensive to maintain.

By adding a new native call which invokes ERR_remove_thread_state()
and calling it from AprEndpoint in tomcat I was able to reduce the
contention on the lock, and the performance improved.

Because of the thread pool I could not find a simple, clean way to
invoke the cleanup before a thread dies, so instead I added it to the
end of the socket processing.

Here are the patches I used against tomcat-native 1.1.34 and tomcat70:

From dd4a8dcdfbb863f8e8ba5ed24070e60200421082 Mon Sep 17 00:00:00 2001
From: Nate Clark 
Date: Thu, 23 Jun 2016 16:51:04 -0400
Subject: [PATCH] Add threadCleanup function to free memory footprint of a
 thread

OpenSSL allocates memory for an error queue for each thread which uses
the SSL functions. Prior to version 1.1.0 the memory needs to be
released manually by invoking either ERR_remove_state or
ERR_remove_thread_state. If this is not done the memory footprint
grows larger, which slows down the use of the error queues. By adding
the new function threadCleanup, threads can now release this memory
prior to thread exit.
---
 native/src/ssl.c | 24 ++++++++++++++++++++-----
 1 file changed, 19 insertions(+), 5 deletions(-)

diff --git a/native/src/ssl.c b/native/src/ssl.c
index ded7b22..5875a68 100644
--- a/native/src/ssl.c
+++ b/native/src/ssl.c
@@ -272,6 +272,14 @@ TCN_IMPLEMENT_CALL(jstring, SSL, versionString)(TCN_STDARGS)
     return AJP_TO_JSTRING(OPENSSL_VERSION_TEXT);
 }
 
+static void ssl_err_remove(void) {
+#if (OPENSSL_VERSION_NUMBER < 0x1000L) || defined(OPENSSL_USE_DEPRECATED)
+    ERR_remove_state(0);
+#elif (OPENSSL_VERSION_NUMBER < 0x1010L)
+    ERR_remove_thread_state(NULL);
+#endif
+}
+
 /*
  *  the various processing hooks
  */
@@ -310,11 +318,7 @@ static apr_status_t ssl_init_cleanup(void *data)
 #if OPENSSL_VERSION_NUMBER >= 0x00907001
     CRYPTO_cleanup_all_ex_data();
 #endif
-#if (OPENSSL_VERSION_NUMBER < 0x1010L) || defined(OPENSSL_USE_DEPRECATED)
-    ERR_remove_state(0);
-#else
-    ERR_remove_thread_state(NULL);
-#endif
+    ssl_err_remove();
 
     /* Don't call ERR_free_strings here; ERR_load_*_strings only
      * actually load the error strings once per process due to static
@@ -1105,6 +1109,11 @@ TCN_IMPLEMENT_CALL(jboolean, SSL, hasOp)(TCN_STDARGS, jint op)
     return op == (op & supported_ssl_opts) ? JNI_TRUE : JNI_FALSE;
 }
 
+TCN_IMPLEMENT_CALL(void, SSL, threadCleanup)(TCN_STDARGS) {
+    UNREFERENCED(o);
+    ssl_err_remove();
+}
+
 #else
 /* OpenSSL is not supported.
  * Create empty stubs.
@@ -1231,4 +1240,9 @@ TCN_IMPLEMENT_CALL(jboolean, SSL, hasOp)(TCN_STDARGS, jint op)
     UNREFERENCED(op);
     return JNI_FALSE;
 }
+
+TCN_IMPLEMENT_CALL(void, SSL, threadCleanup)(TCN_STDARGS) {
+    UNREFERENCED_STDARGS;
+}
+
 #endif
-- 
2.5.5

From c5c1756153279e87f11594c0e053d5301ed121e7 Mon Sep 17 00:00:00 2001
From: Nate Clark 
Date: Thu, 23 Jun 2016 16:57:18 -0400
Subject: [PATCH] Invoke SSL.threadCleanup when done with socket processing

When a thread is done processing an SSL socket it should invoke
SSL.threadCleanup() to free memory associated with that thread in the
SSL library. If this is not done the memory footprint bloats and each
subsequent invocation wi

Re: Bug that spans tomcat and tomcat-native

2016-06-24 Thread Nate Clark
On Fri, Jun 24, 2016 at 1:27 PM, Mark Thomas  wrote:
> On 24/06/2016 18:25, Mark Thomas wrote:
>>
>> Can you provide the settings you are using for the Executor as well please?



>
> And how long do the initial 5,000,000 4k requests take to process?
>

40 minutes.

-nate




Re: Bug that spans tomcat and tomcat-native

2016-06-24 Thread Nate Clark
On Fri, Jun 24, 2016 at 1:37 PM, Nate Clark  wrote:
> On Fri, Jun 24, 2016 at 1:27 PM, Mark Thomas  wrote:
>> On 24/06/2016 18:25, Mark Thomas wrote:
>>>
>>> Can you provide the settings you are using for the Executor as well please?
>
>  maxThreads="500" minSpareThreads="4"/>
>
>>
>> And how long do the initial 5,000,000 4k requests take to process?
>>
>
> 40 minutes.
>
Not sure this matters, but I just double-checked and there are
actually 400 threads in total doing the 4k PUTs: two clients each
running 200 threads. The 100MB test is 24 threads total, 12 per client
machine.

Sorry for the misinformation earlier.

-nate




Re: Bug that spans tomcat and tomcat-native

2016-06-27 Thread Nate Clark
On Mon, Jun 27, 2016 at 9:11 AM, Mark Thomas  wrote:
> I believe I have an explanation for what is going on that fits both the
> reported behaviour and the proposed fix.
>
> Background
> ==
>
> OpenSSL tracks a list of the most recent errors for each thread in a
> hash map keyed on the thread (int_thread_hash in err.c). Reading and
> writing to this hash map is protected by a lock. The hash map is created
> and populated lazily.
>
> tc-native calls ERR_clear_error() before every call to
> SSL_do_handshake(), SSL_read() and SSL_write(). The call to
> ERR_clear_error() either clears the error list for the current thread or
> inserts a new empty list into the hash map if the thread is not already
> present.
>
> The performance problem was tracked down to threads waiting in
> ERR_clear_error() to obtain the write lock for the hash map.
>
> The proposed solution was to call ERR_remove_thread_state() just before
> the current Tomcat thread processing the connection is returned to the
> thread pool. This method removes the current thread and its associated
> error list from the hash map.
>
>
> Analysis
> 
>
> The proposed solution, calling ERR_remove_thread_state(), adds a call
> that also obtains the write lock for the hash map. This indicates that
> the problem is not delays in obtaining the lock but contention for the
> lock because one or more operations taking place within the lock are
> taking a long time.
>
> Removing unused threads from the hash map removes the bottleneck. This
> points towards the hash map being the source of the problem.
>
> Testing by the OP showed that as soon as a test had been run that
> required ~ 400 concurrent threads performance dropped significantly. It
> did not get noticeably worse if the same 400 thread test was run repeatedly.
>
> My testing indicated, on OSX at least, that the thread IDs used in the
> hash map were stable and that uncontrolled growth of the hash map was
> unlikely to be the cause.
>
> The manner in which thread IDs are generated varies by platform. On
> Linux, where this problem was observed, the thread ID is derived from
> (is normally equal to) the memory address of the per thread errno
> variable. This means that thread IDs tend to be concentrated in a
> relatively narrow range of values. For example, in a simple 10 thread
> test on OSX thread IDs ranged from 123145344839680 to 123145354387455.
>
> Thread IDs therefore fall within a 10^7 range out of a possible range of
> 1.8x10^19. i.e. a very small, contiguous range.
>
> Hash maps use hashing functions to ensure that entries are (roughly)
> evenly distributed between the available buckets. The hash function,
> err_state_hash, used for the thread IDs in OpenSSL is threadID * 13.
>
> Supposition
> ===
>
> The hash function used (multiply by 13) is insufficient to distribute
> the resulting values across multiple buckets because they will still
> fall in a relatively narrow band. Therefore all the threads end up in a
> single bucket which makes the performance of the hash map poor. This in
> turn makes calls to thread_get_item() slow because it does a hash map
> lookup. This lookup is performed with the read lock held for the hash
> map which in turn will slow down the calls that require the write lock.

This makes sense. Looking at the OpenSSL source code a bit more, I
think it is a combination of the hash function and the bucket
selection in the hash map. The bucket is basically chosen by a low-bit
mask, since pmax and num_alloc_nodes are always powers of 2. With 400
entries pmax=128 and num_alloc_nodes=256, making the bucket selection
either hash & 0x7F or hash & 0xFF. The errno memory locations are
probably aligned, and that mask will often result in the same bucket
being chosen.
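To make the effect concrete, here is a small standalone demo (my own
illustration, not code from either project; the base address and the 8MB
stride are made up, but they mimic aligned per-thread errno locations). It
pushes 400 fake thread IDs through the same multiply-by-13 hash and low-bit
bucket mask, once with address-like IDs and once with small sequential TIDs:

#include <stdio.h>
#include <string.h>

#define THREADS 400
#define MASK    0x7F   /* pmax = 128 */

/* Mimics OpenSSL's err_state_hash (ID * 13) plus the bucket mask. */
static void count_buckets(const char *label, const unsigned long *ids, int n)
{
    int used[MASK + 1];
    int i, b, occupied = 0, worst = 0;

    memset(used, 0, sizeof(used));
    for (i = 0; i < n; i++)
        used[(ids[i] * 13) & MASK]++;
    for (b = 0; b <= MASK; b++) {
        if (used[b]) occupied++;
        if (used[b] > worst) worst = used[b];
    }
    printf("%-22s buckets used: %3d of %d, largest bucket: %d\n",
           label, occupied, MASK + 1, worst);
}

int main(void)
{
    unsigned long addr_like[THREADS], tid_like[THREADS];
    int i;

    for (i = 0; i < THREADS; i++) {
        /* errno-like addresses: large base, power-of-two aligned stride */
        addr_like[i] = 0x7f0000000000UL + (unsigned long)i * 0x800000UL;
        /* Linux TIDs: small and roughly sequential */
        tid_like[i] = 20000UL + (unsigned long)i;
    }
    count_buckets("errno-address IDs:", addr_like, THREADS);
    count_buckets("sequential TIDs:", tid_like, THREADS);
    return 0;
}

With the aligned addresses every entry lands in the same bucket; with
sequential TIDs the 400 entries spread across all 128 buckets, which matches
the supposition above.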

> Proposal
> 
>
> The analysis and supposition above need to be checked by someone with a
> better understanding of C than me. Assuming my work is correct, the next
> step is to look at possible fixes. I do not believe that patching
> OpenSSL is a viable option.
>
> The OpenSSL API needs to be reviewed to see if there is a way to avoid
> the calls that require the write lock.
>
> If the write lock cannot be avoided then we need to see if there is a
> better place to call ERR_remove_thread_state(). I'd like to fix this
> entirely in tc-native but that may mean calling
> ERR_remove_thread_state() more frequently which could create its own
> performance problems.
>
> Nate - I may have some patches for you to test in the next few days.

Sounds good.

Thanks,
-nate




Re: Bug that spans tomcat and tomcat-native

2016-06-27 Thread Nate Clark
On Mon, Jun 27, 2016 at 12:17 PM,   wrote:
> On Mon, Jun 27, 2016 at 12:04 PM,   wrote:
>> On Mon, Jun 27, 2016 at 11:54 AM, Rainer Jung  
>> wrote:
>>> Great analysis. I was really wondering, what could make the hash map so huge
>>> and hadn't thought about the hash function as the problem.
>>>
>>> Before OpenSSL 1.1.0 there's a callback for applications to provide their
>>> own thread IDs:
>>>
>>> https://www.openssl.org/docs/man1.0.2/crypto/CRYPTO_THREADID_set_callback.html
>>>
>>> So we could probably work around the problem of the poor hashing function by
>>> passing in IDs that work for hashing (pre-hashed ID?). Of course then we
>>> lose the direct association of the OpenSSL thread ID with the real platform
>>> thread id.
>>>
>>> Currently our callback in tcnative is ssl_set_thread_id() which refers to
>>> ssl_thread_id(), which on Linux gets the ID from the APR function
>>> apr_os_thread_current(). So we could add some hashing formula in
>>> ssl_thread_id().
>>
>> I think just using the real thread id would work; right now it isn't
>> using the real thread id, instead it is using the address location of
>> errno. If this were the real thread id I think the hash algorithm and
>> bucket selection they have now would work much better, since the thread
>> ids are basically numerically increasing and aren't aligned to powers
>> of 2.
>
> Sorry, I need to read the code a bit closer. I misread the
> OPENSSL_VERSION_NUMBER < 0x1010L check as version < 1.0.1 and not 1.1.0.
>
> I would still expect real thread ids to provide enough of a
> distribution in a map where the hash bucket ends up being ((ID *
> 13) & 0x7F).

OK, this now makes more sense: apr_os_thread_current() calls
pthread_self(), which returns a pthread_t, which is basically a handle
for pthread's internal state and not an actual ID. It might be better
to use gettid(), but that isn't as portable.
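As a rough sketch of what I mean (my own illustration, untested, not an
actual patch against the tcnative ssl_thread_id() callback), a Linux-only
variant could look something like this; glibc doesn't ship a gettid()
wrapper, so it goes through syscall():

#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>
#include <pthread.h>

/* Candidate body for the thread-id callback registered with OpenSSL. */
static unsigned long ssl_thread_id(void)
{
#if defined(__linux__)
    /* Kernel TID: small, roughly sequential, unique per live thread. */
    return (unsigned long)syscall(SYS_gettid);
#else
    /* Fallback: the pthread handle, i.e. roughly the current behaviour. */
    return (unsigned long)pthread_self();
#endif
}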

-nate




Re: Bug that spans tomcat and tomcat-native

2016-06-27 Thread Nate Clark
On Mon, Jun 27, 2016 at 2:13 PM, Rainer Jung  wrote:
> On 27.06.2016 at 18:47, Nate Clark wrote:
>>
>> On Mon, Jun 27, 2016 at 12:17 PM,   wrote:
>>>
>>> On Mon, Jun 27, 2016 at 12:04 PM,   wrote:
>>>>
>>>> On Mon, Jun 27, 2016 at 11:54 AM, Rainer Jung 
>>>> wrote:
>>>>>
>>>>> Great analysis. I was really wondering, what could make the hash map so
>>>>> huge
>>>>> and hadn't thought about the hash function as the problem.
>>>>>
>>>>> Before OpenSSL 1.1.0 there's a callback for applications to provide
>>>>> their
>>>>> own thread IDs:
>>>>>
>>>>>
>>>>> https://www.openssl.org/docs/man1.0.2/crypto/CRYPTO_THREADID_set_callback.html
>>>>>
>>>>> So we could probably work around the problem of the poor hashing
>>>>> function by
>>>>> passing in IDs that work for hashing (pre-hashed ID?). Of course then
>>>>> we
>>>>> lose the direct association of the OpenSSL thread ID with the real
>>>>> platform
>>>>> thread id.
>>>>>
>>>>> Currently our callback in tcnative is ssl_set_thread_id() which refers
>>>>> to
>>>>> ssl_thread_id(), which on Linux gets the ID from the APR function
>>>>> apr_os_thread_current(). So we could add some hashing formula in
>>>>> ssl_thread_id().
>>>>
>>>>
>>>> I think just using the real thread id would work; right now it isn't
>>>> using the real thread id, instead it is using the address location of
>>>> errno. If this were the real thread id I think the hash algorithm and
>>>> bucket selection they have now would work much better, since the thread
>>>> ids are basically numerically increasing and aren't aligned to powers
>>>> of 2.
>>>
>>>
>>> Sorry, I need to read the code a bit closer. I misread the
>>> OPENSSL_VERSION_NUMBER < 0x1010L check as version < 1.0.1 and not 1.1.0.
>>>
>>> I would still expect real thread ids to provide enough of a
>>> distribution in a map where the hash bucket ends up being ((ID *
>>> 13) & 0x7F).
>>
>>
>> OK, this now makes more sense: apr_os_thread_current() calls
>> pthread_self(), which returns a pthread_t, which is basically a handle
>> for pthread's internal state and not an actual ID. It might be better
>> to use gettid(), but that isn't as portable.
>
>
> I wouldn't care too much about portability right now. If you can prove
> Mark's analysis by using gettid() on Linux and checking that this fixes the
> performance issue, that would be very useful feedback. If it doesn't help, we
> can go down the path of using the hash statistics to get more info about
> what's happening. If it helps, we can think further about how to reduce hash
> collisions for every platform.

I can try that, but unfortunately will not be able to run the test
right away. The machine I was using is a shared resource, which
somebody else needs for the next few days.

Even if that does help the performance and gives the hash map a better
distribution, it won't help with long-term bloat. If a system runs
long enough and threads are started and stopped, the hash map could
grow to include up to almost max pid entries, 32768 by default on
Linux. With the size of the ERR_STATE struct that will be roughly 12MB
of memory, not horrible but not ideal. I still think
ERR_remove_thread_state() should probably be invoked prior to a thread
exiting. The only way I can think of doing that while keeping
everything in native and not calling it for every request would be to
call pthread_cleanup_push(ERR_remove_thread_state, NULL) the first
time a thread is used to perform any SSL operation. That would then
require you to track per thread whether the cleanup function had been
registered.
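For illustration only, here is a rough sketch of that idea (my own, untested,
not a tcnative patch). It uses a pthread_key_t destructor instead of
pthread_cleanup_push(), since cleanup handlers have to be lexically paired
with pthread_cleanup_pop() while a key destructor fires automatically when
the thread exits:

#include <pthread.h>
#include <openssl/err.h>

static pthread_key_t err_cleanup_key;
static pthread_once_t err_cleanup_once = PTHREAD_ONCE_INIT;

/* Runs automatically when a thread that did SSL work exits. */
static void err_cleanup(void *unused)
{
    (void)unused;
    ERR_remove_thread_state(NULL);
}

static void err_cleanup_key_init(void)
{
    (void)pthread_key_create(&err_cleanup_key, err_cleanup);
}

/* Call from any code path that is about to perform SSL operations. */
static void err_cleanup_register(void)
{
    (void)pthread_once(&err_cleanup_once, err_cleanup_key_init);
    if (pthread_getspecific(err_cleanup_key) == NULL) {
        /* Any non-NULL value works; it only arms the destructor. */
        (void)pthread_setspecific(err_cleanup_key, (void *)1);
    }
}

The registration call would still have to be added to whatever path first
touches SSL on a given thread, so it isn't free either.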

Also, any solution which makes the thread id random and not a bounded
set of values would cause the maximum memory footprint of the hash map
to be virtually unbounded and would require that ERR_remove_thread_state
be invoked to clean up the error states.

-nate




Re: Bug that spans tomcat and tomcat-native

2016-06-27 Thread Nate Clark
On Mon, Jun 27, 2016 at 5:19 PM, Mark Thomas  wrote:
> On 27/06/2016 19:59, Nate Clark wrote:
>> On Mon, Jun 27, 2016 at 2:13 PM, Rainer Jung  wrote:
>
> 
>
>>> I wouldn't care too much about portability right now. If you can prove
>>> Mark's analysis by using gettid() on Linux and checking that this fixes the
>>> performance issue, that would be very useful feedback. If it doesn't help, we
>>> can go down the path of using the hash statistics to get more info about
>>> what's happening. If it helps, we can think further about how to reduce hash
>>> collisions for every platform.
>
> I went down the hash statistics route anyway. The short version is with
> 11 threads and 8 buckets every thread ends up in the same bucket. This
> was on OSX. I'm pretty sure Linux will be the same.
>
>> I can try that, but unfortunately will not be able to run the test
>> right away. The machine I was using is a shared resource, which
>> somebody else needs for the next few days.
>
> Next up is trying to figure out a patch to fix the bucket distribution.
>
>> Even if that does help the performance and gives the hash map a better
>> distribution, it won't help with long-term bloat. If a system runs
>> long enough and threads are started and stopped, the hash map could
>> grow to include up to almost max pid entries, 32768 by default on
>> Linux. With the size of the ERR_STATE struct that will be roughly 12MB
>> of memory, not horrible but not ideal. I still think
>> ERR_remove_thread_state() should probably be invoked prior to a thread
>> exiting. The only way I can think of doing that while keeping
>> everything in native and not calling it for every request would be to
>> call pthread_cleanup_push(ERR_remove_thread_state, NULL) the first
>> time a thread is used to perform any SSL operation. That would then
>> require you to track per thread whether the cleanup function had been
>> registered.
>
> That is definitely a secondary consideration at the moment.
>
>> Also, any solution which makes the thread id random and not a bounded
>> set of values would cause the maximum memory footprint of the hash map
>> to be virtually unbounded and would require that ERR_remove_thread_state
>> be invoked to clean up the error states.
>
> The function that generated the 'random' ID from the thread ID would
> have to be deterministic so the total number of values should remain
> bounded. Shouldn't it?
>

If the error queues are not being removed that would be a
requirement. If it is to be bounded it cannot use
apr_os_thread_current(), since that is a "random" memory address and
inherently unbounded. If you tried to reduce that value to a smaller
set you would then have to worry about collisions, which is something
any algorithm would have to worry about. The code in err.c assumes
that the error state is only ever modified by one thread and doesn't
lock modifications to the state, just the map access. The easiest
thing to base the ID on, from a Linux perspective, would be the TID,
which is bounded and guaranteed to be unique for a running thread. I
am not as sure about other operating systems. It seems like on some
versions of OS X pthread_getthreadid_np() might do what you want.

-nate




Re: Bug that spans tomcat and tomcat-native

2016-07-01 Thread Nate Clark
On Tue, Jun 28, 2016 at 12:37 PM, Rainer Jung  wrote:
> On 28.06.2016 at 18:06, therealnewo...@gmail.com wrote:
>>
>> On Tue, Jun 28, 2016 at 11:51 AM, Rainer Jung 
>> wrote:
>>>
>>> On 28.06.2016 at 16:07, Mark Thomas wrote:


 On 28/06/2016 12:28, Mark Thomas wrote:
>
>
> On 28/06/2016 11:34, Rainer Jung wrote:
>>
>>
>> On 28.06.2016 at 11:15, Mark Thomas wrote:
>
>
>
> 
>
>>> Index: src/ssl.c
>>> ===
>>> --- src/ssl.c    (revision 1750259)
>>> +++ src/ssl.c    (working copy)
>>> @@ -420,6 +420,10 @@
>>>      return psaptr->PSATOLD;
>>>  #elif defined(WIN32)
>>>      return (unsigned long)GetCurrentThreadId();
>>> +#elif defined(DARWIN)
>>> +    uint64_t tid;
>>> +    pthread_threadid_np(NULL, &tid);
>>> +    return (unsigned long)tid;
>>>  #else
>>>      return (unsigned long)(apr_os_thread_current());
>>>  #endif
>>>
>>>
>>> I want to do some similar testing for Linux before adding what I
>>> suspect
>>> will be a very similar block using gettid().
>>
>>
>>
>> We could either add something to configure.in. Untested:
>>
>> Index: native/configure.in
>> ===
>> --- native/configure.in (revision 1750462)
>> +++ native/configure.in (working copy)
>> @@ -218,6 +218,9 @@
>>  *-solaris2*)
>>  APR_ADDTO(TCNATIVE_LIBS, -lkstat)
>>  ;;
>> +*linux*)
>> +APR_ADDTO(CFLAGS, -DTCNATIVE_LINUX)
>> +;;
>>  *)
>>  ;;
>>  esac
>>
>>
>> and then use a
>>
>> #ifdef TCNATIVE_LINUX
>>
>> or we copy some other more direct linux check from e.g. APR:
>>
>> #ifdef __linux__
>>
>> The latter looks simpler, but the version above is based on all the
>> logic put into config.guess.
>
>
>
> I'd go with the __linux__ option as that is consistent with what we
> already use in os/unix/system.c
>
> I'm not against the change to configure.in, I just think we should be
> consistent with how we do this throughout the code base.



 I've confirmed that the same problem occurs with hash bucket selection
 on Linux and that switching to gettid() fixes that problem.

 I'm going to go ahead with the 1.2.8 release shortly. We can continue to
 refine this as necessary and have a more complete fix in 1.2.9.
>>>
>>>
>>>
>>> I did a quick check on Solaris. apr_os_thread_current() uses pthread_self
>>> on
>>> Solaris like on Linux (actually on any Unix type OS), but unlike Linux
>>> where
>>> this returns an address which is either 32 or 64 bit aligned depending on
>>> address size, on Solaris you get an increasing number starting with 1 for
>>> the first thread and incremented by one for each following thread. Thread
>>> IDs do not get reused in the same process, even if the thread finished,
>>> but
>>> thread IDs are common between different processes, because they always
>>> start
>>> with 1. So Solaris should be fine as-is.
>>
>>
>> Does the value have a cap? If not then Solaris will just continue to
>> use more and more memory as threads are created over the lifetime of
>> the server.
>
>
> No cap. I think everyone agrees we are still looking at cleaning up at
> thread death as a further improvement. Whether there's a problem without it
> depends on the executor size and whether the executor actually destroys and
> recreates threads (thus using more IDs than the max thread count).
>

Mark,

The change which was added to 1.2.8 works great on 1.1.34. I also
hacked in calling ERR_remove_thread_state on thread exit. It is
similar to my original change but only does it on a thread's exit,
though as Remy pointed out earlier this will only work if all IO
operations occur on one of the original threads from the thread pool.

-nate

Here is the hack just for completeness.

diff --git a/java/org/apache/catalina/core/StandardThreadExecutor.java b/java/org/apache/catalina/core/StandardThreadExecutor.java
index a30f29c..80f429d 100644
--- a/java/org/apache/catalina/core/StandardThreadExecutor.java
+++ b/java/org/apache/catalina/core/StandardThreadExecutor.java
@@ -18,6 +18,7 @@
 package org.apache.catalina.core;

 import java.util.concurrent.RejectedExecutionException;
+import java.util.concurrent.ThreadFactory;
 import java.util.concurrent.TimeUnit;

 import org.apache.catalina.Executor;
@@ -27,10 +28,11 @@ import org.apache.catalina.util.LifecycleMBeanBase;
 import org.apache.tomcat.util.threads.ResizableExecutor;
 import org.apache.tomcat.util.threads.TaskQueue;
 import org.apache.tomcat.util.threads.TaskThreadFactory;
+import org.apache.tomcat.util.threads.ThreadCreator;
 import org.apache.tomcat.util.threads.ThreadPoolExecutor;

 public class StandardThreadExecu