Re: NIO Thread Madness

2025-03-31 Thread Christopher Schultz

William,

On 3/31/25 2:31 PM, William Crowell wrote:

Question related to this.  I found issue DBCP-599 which was released
in DBCP 2.13.0 as part of Apache Tomcat 9.0.98 release.  The
characteristics of this major bug appear very similar to the issue I am
having.


Are you using an old driver that does not support JDBC 4.2?
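
One way to check this from inside the application is to ask the driver which JDBC level it reports. A minimal sketch, assuming a connection borrowed from the pool; the class and method names (JdbcVersionCheck, supportsAtLeast, describe) are made up for illustration, while the DatabaseMetaData calls are standard JDBC:

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

public class JdbcVersionCheck {

    /** Returns true if the driver reports a JDBC version of at least major.minor. */
    public static boolean supportsAtLeast(DatabaseMetaData meta, int major, int minor)
            throws SQLException {
        int reportedMajor = meta.getJDBCMajorVersion();
        int reportedMinor = meta.getJDBCMinorVersion();
        return reportedMajor > major
                || (reportedMajor == major && reportedMinor >= minor);
    }

    /** Builds a one-line description of the driver behind a connection. */
    public static String describe(Connection conn) throws SQLException {
        DatabaseMetaData meta = conn.getMetaData();
        return meta.getDriverName() + " " + meta.getDriverVersion()
                + " (JDBC " + meta.getJDBCMajorVersion()
                + "." + meta.getJDBCMinorVersion() + ")";
    }
}
```

Logging describe(conn) once at startup and checking supportsAtLeast(conn.getMetaData(), 4, 2) should be enough to answer the question without guessing from the jar name.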


In Tomcat’s lib directory there is a tomcat-dbcp.jar.  Is that just
DBCP2 repackaged under Tomcat or is that an interface from Tomcat
into Apache Commons DBCP2?


It's re-packaged.

-chris


From: William Crowell 
Date: Friday, March 28, 2025 at 12:22 PM
To: Tomcat Users List 
Subject: Re: NIO Thread Madness
Very good idea, Chris. Thank you.

Regards,

William Crowell

From: Christopher Schultz 
Date: Friday, March 28, 2025 at 12:05 PM
To: users@tomcat.apache.org 
Subject: Re: NIO Thread Madness
William,

On 3/26/25 7:06 PM, William Crowell wrote:

That maxTotal was a typo due to trying to copy the config from a
screenshot.

I agree with your assessment.  I think there are 2 situations going
on.  One is that the code may not be properly closing connections,
and the other is that the connection pool is not properly configured.

I wrote this a looong time ago, but it's very helpful to show to
programmers who haven't been careful.

https://blog.christopherschultz.net/2009/03/16/properly-handling-pooled-jdbc-connections/

For your long-transactions, I think you are going to want to use a
separate pool.

-chris


From: Christopher Schultz 
Date: Wednesday, March 26, 2025 at 6:57 PM
To: users@tomcat.apache.org 
Subject: Re: NIO Thread Madness
William,

On 3/25/25 2:51 PM, William Crowell wrote:

Mark,

I think we might have found something.  I think the DBCP2 connection pool is 
returning stale connections from Oracle.  There is no firewall between Tomcat 
and Oracle, but I looked at the context.xml and found the following:

…
accessToUnderlyingConnectionAllowed="true"
maxIdle="100"
maxWaitMillis="1"
minIdle="25"
validationQuery="select 1 from dual"
url="blah blah blah"
maxTotal="true"


^ This is typically a number, not a true/false value.


logAbandoned="true"
removeAbandonedOnBorrow="true"
removeAbandonedTimeout="900"
removeAbandonedOnMaintenance="true"
timeBetweenEvictionRunsMillis="30"

When the database pool dries up, your application will basically stop.
Setting removeAbandoned as you have is good, but with those long long
timeouts, they aren't doing any good.

I might set up a connection pool for "short queries" (like logging-in,
etc.) and then set up a separate pool for much longer queries.
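
A rough sketch of the two-pool idea, assuming two DataSource resources are defined in context.xml. The JNDI names jdbc/shortPool and jdbc/longPool and the PoolRouter class are hypothetical; nothing in this thread names them:

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.naming.NamingException;
import javax.sql.DataSource;

public class PoolRouter {
    private final DataSource shortQueries;
    private final DataSource longQueries;

    public PoolRouter(DataSource shortQueries, DataSource longQueries) {
        this.shortQueries = shortQueries;
        this.longQueries = longQueries;
    }

    /** Looks up both pools from the webapp's JNDI environment (names are assumed). */
    public static PoolRouter fromJndi() throws NamingException {
        Context env = (Context) new InitialContext().lookup("java:comp/env");
        return new PoolRouter((DataSource) env.lookup("jdbc/shortPool"),
                              (DataSource) env.lookup("jdbc/longPool"));
    }

    /** Borrows from the long-query pool only when the caller says the work is slow. */
    public Connection borrow(boolean longRunning) throws SQLException {
        return (longRunning ? longQueries : shortQueries).getConnection();
    }
}
```

Each pool can then carry its own maxTotal and removeAbandonedTimeout, so a query that runs for ten minutes cannot use up the connections that logins depend on.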

Also, you might want to review your use of connections, etc. to ensure
that you are always closing everything in finally blocks. Resource
management with JDBC can be tedious and if you don't do it right, you
can leak connections from your pool very easily.
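
On Java 7 and later, try-with-resources gives the same guarantee as hand-written finally blocks with less room for error. A minimal sketch, assuming a pooled DataSource; PooledQuery and firstInt are illustrative names, not anything from the application discussed here:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

public class PooledQuery {

    /**
     * Runs a query and returns the first column of the first row, or 0 if
     * there are no rows. The ResultSet, statement and connection are closed
     * in reverse order of creation, whether or not an exception is thrown,
     * so the connection always goes back to the pool.
     */
    public static int firstInt(DataSource pool, String sql) throws SQLException {
        try (Connection conn = pool.getConnection();
             PreparedStatement stmt = conn.prepareStatement(sql);
             ResultSet rs = stmt.executeQuery()) {
            return rs.next() ? rs.getInt(1) : 0;
        }
    }
}
```

For a pooled connection, close() returns it to the pool rather than closing the physical connection, which is why a missed close() shows up as a leak instead of an error.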

-chris


From: Mark Thomas 
Date: Tuesday, March 25, 2025 at 1:13 PM
To: users@tomcat.apache.org 
Subject: Re: NIO Thread Madness
On 25/03/2025 12:33, William Crowell wrote:

Mark,

I believe there is a proxy involved here that does TLS decrypt, but I noticed 
they had the redirectPort on the 8080 connector set to 8443.  When you try to 
hit Tomcat directly over port 8080 using HTTP it is hung.


Hmm. Both the Acceptor thread and the Poller thread are running and
appear to be in states that would enable new requests to be processed.

There isn't logging in those two components for normal operations. I
assume because of performance concerns.

Do you have any thread dumps from when you have clients attempting new
requests that are hanging? I'm wondering if processing is hanging
somewhere in Tomcat / the application during request processing.
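
The usual way to capture a dump is from outside the JVM, for example with jstack <pid> or jcmd <pid> Thread.print. If that is awkward in production, roughly the same information can be collected in-process. A minimal sketch using the standard java.lang.management API; ThreadDumper is a made-up name, and how it gets triggered (a servlet, a JMX operation) is left open:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {

    /** Returns the name, state and full stack trace of every live thread. */
    public static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Only ask for lock details the JVM says it can provide.
        ThreadInfo[] infos = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            // Print frames by hand: ThreadInfo.toString() truncates the trace.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking two or three dumps a few seconds apart while a request is hanging makes it easier to tell a thread that is stuck from one that is merely busy.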

I would probably be thinking about enabling remote debugging and
connecting a debugger the next time it goes wrong and tracing the
progress of a new request. But I accept that may not be practical for a
production system.

It is possible that the new requests are hanging before they reach
the Tomcat code, but that seems unlikely.

Other thi

Re: NIO Thread Madness

2025-03-31 Thread Christopher Schultz

William,

On 3/31/25 3:59 PM, William Crowell wrote:

Oracle’s ojdbc8.jar version 19.25.0.0.0 which should support JDBC 4.2:


Then I don't think you are experiencing what was reported in DBCP-599.

-chris


From: Christopher Schultz 
Date: Monday, March 31, 2025 at 3:50 PM
To: users@tomcat.apache.org 
Subject: Re: NIO Thread Madness
William,

On 3/31/25 2:31 PM, William Crowell wrote:

Question related to this.  I found issue DBCP-599 which was released
in DBCP 2.13.0 as part of Apache Tomcat 9.0.98 release.  The
characteristics of this major bug appear very similar to the issue I am
having.


Are you using an old driver that does not support JDBC 4.2?


In Tomcat’s lib directory there is a tomcat-dbcp.jar.  Is that just
DBCP2 repackaged under Tomcat or is that an interface from Tomcat
into Apache Commons DBCP2?


It's re-packaged.

-chris



Re: NIO Thread Madness

2025-03-31 Thread William Crowell
Question related to this.  I found issue DBCP-599 which was released in DBCP 
2.13.0 as part of Apache Tomcat 9.0.98 release.  The characteristics of this 
major bug appear very similar to the issue I am having.

In Tomcat’s lib directory there is a tomcat-dbcp.jar.  Is that just DBCP2 
repackaged under Tomcat or is that an interface from Tomcat into Apache Commons 
DBCP2?

Regards,

William Crowell

From: William Crowell 
Date: Friday, March 28, 2025 at 12:22 PM
To: Tomcat Users List 
Subject: Re: NIO Thread Madness
Very good idea, Chris. Thank you.

Regards,

William Crowell

From: Christopher Schultz 
Date: Friday, March 28, 2025 at 12:05 PM
To: users@tomcat.apache.org 
Subject: Re: NIO Thread Madness
William,

On 3/26/25 7:06 PM, William Crowell wrote:
> That maxTotal was a typo due to trying to copy the config from a
> screenshot.
>
> I agree with your assessment.  I think there are 2 situations going
> on.  One is that the code may not be properly closing connections,
> and the other is that the connection pool is not properly configured.
I wrote this a looong time ago, but it's very helpful to show to
programmers who haven't been careful.

https://blog.christopherschultz.net/2009/03/16/properly-handling-pooled-jdbc-connections/

For your long-transactions, I think you are going to want to use a
separate pool.

-chris

> From: Christopher Schultz 
> Date: Wednesday, March 26, 2025 at 6:57 PM
> To: users@tomcat.apache.org 
> Subject: Re: NIO Thread Madness
> William,
>
> On 3/25/25 2:51 PM, William Crowell wrote:
>> Mark,
>>
>> I think we might have found something.  I think the DBCP2 connection pool is 
>> returning stale connections from Oracle.  There is no firewall between 
>> Tomcat and Oracle, but I looked at the context.xml and found the following:
>>
>> …
>> accessToUnderlyingConnectionAllowed="true"
>> maxIdle="100"
>> maxWaitMillis="1"
>> minIdle="25"
>> validationQuery="select 1 from dual"
>> url="blah blah blah"
>> maxTotal="true"
>
> ^ This is typically a number, not a true/false value.
>
>> logAbandoned="true"
>> removeAbandonedOnBorrow="true"
>> removeAbandonedTimeout="900"
>> removeAbandonedOnMaintenance="true"
>> timeBetweenEvictionRunsMillis="30"
>> minEvictableIdleTimeMillis="90"
>>
>> The default for timeBetweenEvictionRunsMillis is 5 seconds and for
>> minEvictableIdleTimeMillis is 60 seconds, and I also see
>> removeAbandonedTimeout is set to 15 minutes.  Some of the queries to the
>> database can run over 10 minutes.  Sounds like an opportunity to recode this
>> asynchronously.
>>
>> Why this would cause Tomcat to go dark is beyond me.  I will work on getting 
>> new thread dumps and stack traces.  Most of the long stack traces point to 
>> issues with the database, and they are being sent over as screen shots.  
>> I’ll see what I can do to work around that.
>
> When the database pool dries up, your application will basically stop.
> Setting removeAbandoned as you have is good, but with those long long
> timeouts, they aren't doing any good.
>
> I might set up a connection pool for "short queries" (like logging-in,
> etc.) and then set up a separate pool for much longer queries.
>
> Also, you might want to review your use of connections, etc. to ensure
> that you are always closing everything in finally blocks. Resource
> management with JDBC can be tedious and if you don't do it right, you
> can leak connections from your pool very easily.
>
> -chris
>
>> From: Mark Thomas 
>> Date: Tuesday, March 25, 2025 at 1:13 PM
>> To: users@tomcat.apache.org 
>> Subject: Re: NIO Thread Madness
>> On 25/03/2025 12:33, William Crowell wrote:
>>> Mark,
>>>
>>> I believe there is a proxy involved here that does TLS decrypt, but I 
>>> noticed they had the redirectPort on the 8080 connector set to 8443.  When 
>>> you try to hit Tomcat directly over port 8080 using HTTP it is hung.
>>
>> Hmm. Both the Acceptor thread and the Poller thread are running and
>> appear to be in states that would enable new requests to be processed.
>

Re: NIO Thread Madness

2025-03-31 Thread William Crowell
Chris,

Oracle’s ojdbc8.jar version 19.25.0.0.0 which should support JDBC 4.2:

https://www.oracle.com/database/technologies/maven-central-guide.html

Regards,

William Crowell

From: Christopher Schultz 
Date: Monday, March 31, 2025 at 3:50 PM
To: users@tomcat.apache.org 
Subject: Re: NIO Thread Madness
William,

On 3/31/25 2:31 PM, William Crowell wrote:
> Question related to this.  I found issue DBCP-599 which was released
> in DBCP 2.13.0 as part of Apache Tomcat 9.0.98 release.  The
> characteristics of this major bug appear very similar to the issue I am
> having.

Are you using an old driver that does not support JDBC 4.2?

> In Tomcat’s lib directory there is a tomcat-dbcp.jar.  Is that just
> DBCP2 repackaged under Tomcat or is that an interface from Tomcat
> into Apache Commons DBCP2?

It's re-packaged.

-chris
