We could find no deadlocked Java threads using jconsole, neither with its automatic "find deadlocks" feature nor by inspecting a thread dump. We took several thread dumps WHILE the deadlock was clearly visible on screen (it was very easy to reproduce).
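
For reference, the same kind of check can be run programmatically through the standard java.lang.management API. The following is only a minimal sketch: the class name DeadlockProbe is hypothetical, and the code has to run inside the Tomcat JVM (or be adapted to a remote JMX connection) to say anything about Tomcat's threads. Note that findDeadlockedThreads() reports only cycles over object monitors and ownable synchronizers, so a hang that does not form such a cycle would not be reported.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Tomcat.
final class DeadlockProbe {

    /** Returns the IDs of threads caught in a lock cycle, or an empty array if none are found. */
    static long[] deadlockedThreadIds() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Covers object monitors and ownable synchronizers (e.g. ReentrantLock).
        // Returns null when no cycle exists.
        long[] ids = mx.findDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = deadlockedThreadIds();
        if (ids.length == 0) {
            System.out.println("No lock-cycle deadlock detected");
            return;
        }
        // Include locked monitors and synchronizers in the report.
        for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
            if (info != null) { // a thread may have terminated in the meantime
                System.out.print(info);
            }
        }
    }
}
```

An empty result here is consistent with what we saw in jconsole: it rules out a classic lock cycle but does not rule out threads stuck waiting for an event that never arrives.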

The deadlock is definitely there, though, and it goes away as soon as we turn off "useAsyncIO". Since we could not find any Java-level deadlocks, we believe the problem probably lies in the interaction with native code. We are using org.apache.catalina.core.AprLifecycleListener as well as Tomcat Native 1.2.23 on Linux. We could not find anything in the Tomcat Native changelog that deals with similar issues.

- Manuel Dominguez Sarmiento

On 16/07/2019 05:42, Mark Thomas wrote:
On July 16, 2019 7:20:28 AM UTC, "Rémy Maucherat" <r...@apache.org> wrote:
On Mon, Jul 15, 2019 at 11:30 PM Manuel Dominguez Sarmiento <m...@renxo.com> wrote:

Hi, we had been running Tomcat 9.0.17 for quite some time on our high-load production servers, using the attached server.xml configuration.

Upon upgrading to 9.0.21 we started experiencing many random deadlocks. We run performance advertising campaigns, and our conversion rates dropped to below half of what they usually are, which was an obvious consequence of our servers randomly locking up. Plus, it was very easy to reproduce the deadlocks, which seemed to "magically unlock" when opening a second tab/window and opening the same URL that was locked on the other window/tab. Doing this unlocked both windows/tabs at once, immediately.

We found this was only happening on HTTPS, but NOT on HTTP. Furthermore, we found this was only happening when the browser negotiated an upgrade to HTTP/2.

Once we found this, we temporarily removed the <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" /> configuration, and all was back to normal.

However, we need HTTP/2, so we continued to look for a proper solution. Looking at the Tomcat changelog, we found there have been many changes since 9.0.17 related to useAsyncIO and HTTP/2. One particular change for 9.0.22 caught our attention:


*"Remove a source of potential deadlocks when using HTTP/2 when the Connector is configured with useAsyncIO as true. (markt)"*

We also found the following discussion thread, which describes issues similar to what we were experiencing:


http://mail-archives.apache.org/mod_mbox/tomcat-dev/201906.mbox/%3c20190606204631.bab6c8a...@gitbox.apache.org%3e
So we upgraded to 9.0.22 thinking that the deadlock would be gone. But alas, it was not. The deadlocks remained.

We found that 9.0.20 changed the default for useAsyncIO from "false" to "true". So we changed useAsyncIO back to what it was when we were running 9.0.17 (false), and all is back to normal on 9.0.22.

So the conclusion is: there are still deadlock bugs on the NIO connector with useAsyncIO="true" and upgrades to HTTP/2. Besides fixing them, we believe that the useAsyncIO default should be reverted to "false".

Looking forward to the team's feedback. Thanks,

You should investigate on the user list, or if you already have details on how to reproduce the deadlock and/or stack traces that show where it occurs, you can create a BZ.

+1. A thread dump taken when the deadlock occurs, with the blocked thread(s) identified, should be enough to figure out where things are going wrong.
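
As a rough illustration of what "blocked thread(s) identified" could look like in practice, the sketch below filters a full thread dump down to threads that are not currently runnable. It is an assumption-laden example rather than a Tomcat facility: the class name BlockedThreadReport is hypothetical, it uses only the standard java.lang.management API, and it must run inside the JVM being diagnosed. A dump from the JDK's jstack tool carries the same information.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tomcat.
final class BlockedThreadReport {

    /** Returns every live thread that is BLOCKED, WAITING or TIMED_WAITING. */
    static List<ThreadInfo> stalledThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<ThreadInfo> stalled = new ArrayList<>();
        // true, true: also collect locked monitors and ownable synchronizers.
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            Thread.State state = info.getThreadState();
            if (state == Thread.State.BLOCKED
                    || state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING) {
                stalled.add(info);
            }
        }
        return stalled;
    }

    public static void main(String[] args) {
        for (ThreadInfo info : stalledThreads()) {
            // getLockName() is null when the thread is not waiting on a lock object.
            System.out.printf("\"%s\" %s on %s%n",
                    info.getThreadName(), info.getThreadState(), info.getLockName());
            for (StackTraceElement frame : info.getStackTrace()) {
                System.out.println("    at " + frame);
            }
        }
    }
}
```

Idle pool threads also show up as WAITING, so the interesting entries are the ones whose stack traces sit in request-processing or HTTP/2 code while the request is visibly hung.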

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

