https://bz.apache.org/bugzilla/show_bug.cgi?id=64621

            Bug ID: 64621
           Summary: HTTP/2 Tomcat Server responds with RST_STREAM
                    (REFUSED_STREAM) continuously in one of the TCP
                    connection.
           Product: Tomcat 9
           Version: 9.0.x
          Hardware: HP
                OS: Linux
            Status: NEW
          Severity: critical
          Priority: P2
         Component: Catalina
          Assignee: dev@tomcat.apache.org
          Reporter: raghavendra...@ericsson.com
  Target Milestone: -----

*Sub-Component - Coyote*

*OS : Redhat Linux*

*Description:*

*Setup:*

We use Tomcat to transport HTTP/2 traffic between two systems.

*Issue reproduction:*

During a load test (about 1000 requests per second) on the pre-production
systems, we identified scenarios where our Tomcat server continuously replies
with RST_STREAM, with REFUSED_STREAM as the reason, on a single TCP
connection.

We understand that, per Sections 5.1.2 and 8.1.4 of RFC 7540, exceeding the
number of concurrent streams per connection causes new streams to be rejected
with RST_STREAM (REFUSED_STREAM). However, after some time we no longer see any
other responses from the server on that connection. For the rest of the test
run (around 30 minutes), the server sent only RST_STREAM (REFUSED_STREAM) on
that particular TCP connection.

If the RST_STREAM is sent because the stream count was exceeded, should the
connection not have recovered after some time?


*Additional Information:*

Also, from reading the Tomcat code base, our understanding is that Tomcat
sends RST_STREAM (REFUSED_STREAM) only when the concurrent stream count is
exceeded and in no other situation (reference given below).

if (localSettings.getMaxConcurrentStreams() < activeRemoteStreamCount.incrementAndGet()) {
    activeRemoteStreamCount.decrementAndGet();
    throw new StreamException(sm.getString("upgradeHandler.tooManyRemoteStreams",
            Long.toString(localSettings.getMaxConcurrentStreams())),
            Http2Error.REFUSED_STREAM, streamId);
}

In addition to this, in the Stream processor code, we noticed a ‘FIXME:’
comment to fix the syncs (reference given below).

final void process(SocketEvent event) {
        try {
            // FIXME: the regular processor syncs on socketWrapper, but here this deadlocks
            synchronized (this) {
                // HTTP/2 equivalent of AbstractConnectionHandler#process() without the
                // socket <-> processor mapping
                ContainerThreadMarker.set();
                SocketState state = SocketState.CLOSED;
                try {
                    state = process(socketWrapper, event);

                    if (state == SocketState.CLOSED) {
                        if (!getErrorState().isConnectionIoAllowed()) {
                            ConnectionException ce = new ConnectionException(sm.getString(
                                    "streamProcessor.error.connection", stream.getConnectionId(),
                                    stream.getIdentifier()), Http2Error.INTERNAL_ERROR);
                            stream.close(ce);
                        } else if (!getErrorState().isIoAllowed()) {
                            StreamException se = stream.getResetException();
                            if (se == null) {
                                se = new StreamException(sm.getString(
                                        "streamProcessor.error.stream", stream.getConnectionId(),
                                        stream.getIdentifier()), Http2Error.INTERNAL_ERROR,
                                        stream.getIdAsInt());
                            }
                            stream.close(se);
                        }
                    }
                } catch (Exception e) {
                    String msg = sm.getString("streamProcessor.error.connection",
                            stream.getConnectionId(), stream.getIdentifier());
                    if (log.isDebugEnabled()) {
                        log.debug(msg, e);
                    }
                    ConnectionException ce = new ConnectionException(msg, Http2Error.INTERNAL_ERROR);
                    ce.initCause(e);
                    stream.close(ce);
                } finally {
                    ContainerThreadMarker.clear();
                }
            }
        } finally {
            handler.executeQueuedStream();
        }
    }

Correlating these two items: if streams are not closed because of this
synchronization issue, activeRemoteStreamCount will keep increasing (this is
the only place where I see streams being closed), which will eventually result
in REFUSED_STREAM errors for every new stream on the connection.
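
To illustrate the hypothesis, here is a minimal, self-contained sketch of the
counter logic. It is not Tomcat code: the class StreamCountModel and its
methods tryOpen(), close() and run() are hypothetical names, and only the
increment/compare/decrement pattern is taken from the snippet quoted above. It
assumes that a normally closed stream decrements the counter and that a
"leaked" stream never does.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model, NOT Tomcat code: only the counter pattern mirrors the quoted snippet.
public class StreamCountModel {
    private final long maxConcurrentStreams;
    private final AtomicLong activeRemoteStreamCount = new AtomicLong();

    public StreamCountModel(long maxConcurrentStreams) {
        this.maxConcurrentStreams = maxConcurrentStreams;
    }

    /** Returns true if the stream is admitted, false where the server would send REFUSED_STREAM. */
    public boolean tryOpen() {
        if (maxConcurrentStreams < activeRemoteStreamCount.incrementAndGet()) {
            // Undo the increment so a refused stream does not occupy a slot.
            activeRemoteStreamCount.decrementAndGet();
            return false;
        }
        return true;
    }

    /** Assumption: a properly closed stream releases its slot. */
    public void close() {
        activeRemoteStreamCount.decrementAndGet();
    }

    /**
     * Opens `requests` streams one after another. Each admitted stream is closed
     * only if closeProperly is true. Returns the number of refused streams.
     */
    public static int run(long max, int requests, boolean closeProperly) {
        StreamCountModel model = new StreamCountModel(max);
        int refused = 0;
        for (int i = 0; i < requests; i++) {
            if (model.tryOpen()) {
                if (closeProperly) {
                    model.close();
                }
            } else {
                refused++;
            }
        }
        return refused;
    }

    public static void main(String[] args) {
        System.out.println("refused when streams are closed: " + run(100, 1000, true));
        System.out.println("refused when streams are leaked: " + run(100, 1000, false));
    }
}
```

In this model, with a limit of 100 and 1000 sequential requests, no stream is
refused when every stream is closed, while 900 are refused when no stream is
ever closed: once the counter reaches the limit it never drops, so every later
stream is refused. That matches the behaviour we observe, but it only
demonstrates the arithmetic, not that Tomcat actually leaks streams.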

Can you confirm this issue and advise whether a fix is available in any of
your working streams? The system is already in production, so it could hit
this issue at any time when the load increases. Please help us find a way
forward.

Thanks in advance.
