On Mon, Jul 15, 2019 at 11:30 PM Manuel Dominguez Sarmiento <m...@renxo.com> wrote:
> Hi, we had been running Tomcat 9.0.17 for quite some time on our high-load
> production servers, using the attached server.xml configuration.
>
> Upon upgrading to 9.0.21 we started experiencing many random deadlocks. We
> run performance advertising campaigns, and our conversion rates dropped to
> below half of what they usually are, which was an obvious consequence of
> our servers randomly locking up. Plus, it was very easy to reproduce the
> deadlocks, which seemed to "magically unlock" when opening a second
> tab/window and opening the same URL that was locked on the other
> window/tab. Doing this unlocked both windows/tabs at once, immediately.
>
> We found this was only happening on HTTPS, but NOT on HTTP. Furthermore,
> we found this was only happening when the browser negotiated an upgrade to
> HTTP/2.
> Once we found this, we temporarily removed the <UpgradeProtocol
> className="org.apache.coyote.http2.Http2Protocol" /> configuration, and all
> was back to normal.
> However, we need HTTP/2, so we continued to look for a proper solution.
>
> Looking at the Tomcat changelog, we found there have been many changes
> since 9.0.17 related to useAsyncIO and HTTP/2. One particular change for
> 9.0.22 caught our attention:
>
> "Remove a source of potential deadlocks when using HTTP/2 when the
> Connector is configured with useAsyncIO as true. (markt)"
>
> We also found the following discussion thread, which describes issues
> similar to what we were experiencing:
>
> http://mail-archives.apache.org/mod_mbox/tomcat-dev/201906.mbox/%3c20190606204631.bab6c8a...@gitbox.apache.org%3e
>
> So we upgraded to 9.0.22 thinking that the deadlock would be gone. But
> alas, it was not. The deadlocks remained.
> We found that 9.0.20 changed the default for useAsyncIO from "false" to
> "true".
> So we changed useAsyncIO back to what it was when we were running
> 9.0.17 (false) and all is back to normal on 9.0.22.
>
> So the conclusion is: there are still deadlock bugs on the NIO connector
> with useAsyncIO="true" and upgrades to HTTP/2.
> Besides fixing them, we believe that the useAsyncIO default should be
> reverted to "false".
>
> Looking forward to the team's feedback. Thanks,
>

You should investigate on the user list, or if you already have details on
how to reproduce the deadlock and/or stack traces that show where it
occurs, you can create a BZ.

Rémy
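
For reference, the workaround described in the report (keeping HTTP/2 enabled while setting useAsyncIO back to "false") might look roughly like the fragment below. This is only a sketch: the attached server.xml is not reproduced in this thread, so the port, keystore path and password are placeholders, and the NIO protocol class is assumed from the report's mention of the NIO connector.

```xml
<!-- Sketch only: port, keystore path and password are placeholders,
     not values taken from the reporter's server.xml. -->
<Connector port="8443"
           protocol="org.apache.coyote.http11.Http11NioProtocol"
           SSLEnabled="true"
           useAsyncIO="false">
    <!-- HTTP/2 stays enabled through the upgrade protocol named in the report -->
    <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol" />
    <SSLHostConfig>
        <Certificate certificateKeystoreFile="conf/localhost-rsa.jks"
                     certificateKeystorePassword="changeit"
                     type="RSA" />
    </SSLHostConfig>
</Connector>
```

Setting useAsyncIO explicitly, rather than relying on the default, keeps the behavior stable across versions where the default differs (the report notes it changed from "false" to "true" in 9.0.20).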