https://bz.apache.org/bugzilla/show_bug.cgi?id=61736

            Bug ID: 61736
           Summary: NIO BlockPoller may stay at events() method for too
                    long time
           Product: Tomcat 8
           Version: 8.0.47
          Hardware: PC
                OS: Mac OS X 10.1
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: Connectors
          Assignee: dev@tomcat.apache.org
          Reporter: sz...@qq.com
  Target Milestone: ----

Created attachment 35508
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=35508&action=edit
An HTTP client that can send data with a controlled packet size and packet
interval.

When Tomcat is used in NIO mode as a server to receive files, and the files are
transferred over an unstable network (with noticeable intervals between
consecutive TCP packets), the traffic regularly hangs for a long time with a
zero window reported by the server. The steps to reproduce the problem are as
follows.

1. Configure the server's TCP receive window size.
$ cat /proc/sys/net/ipv4/tcp_rmem
1024    2048    8192

2. Put the following JSP into the ROOT web application (at the path
$CATALINA_HOME/webapps/ROOT/consume.jsp).
<%@ page import="org.apache.commons.io.IOUtils" %><%
    final ServletInputStream inputStream = request.getInputStream();
    byte[] content = IOUtils.toByteArray(inputStream);
%>ok
IOUtils is a class from Apache Commons IO.
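
If commons-io is not on the classpath, a plain drain loop works just as well
for this reproduction (a hypothetical equivalent, not the attachment's code):

<%@ page import="java.io.InputStream" %><%
    // Drain and discard the request body; we only need the server to read it.
    InputStream in = request.getInputStream();
    byte[] buffer = new byte[1024];
    while (in.read(buffer) != -1) {
        // discard
    }
%>ok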

3. Start Tomcat in NIO mode with the following settings:
Server version:        Apache Tomcat/8.0.46
Server built:          Aug 10 2017 10:10:31 UTC
Server number:         8.0.46.0
OS Name:               Linux
OS Version:            3.10.0-229.el7.x86_64
Architecture:          amd64
Java Home:             /usr/jdk1.8.0_121/jre
JVM Version:           1.8.0_121-b13
JVM Vendor:            Oracle Corporation
Command line argument: -Xms256m
Command line argument: -Xmx256m
Command line argument: -Xmn128m
Command line argument: -Xss1m

4. Capture the TCP traffic:
tcpdump -i any -w tcp.cap port 8080

5. Start a client that sends data over HTTP in chunks, with the following
settings (a minimal sender sketch follows the list):
chunk size                   1024 bytes
time to rest between chunks  50 ms
chunk count per request      100
parallel requests            100
total requests               10000
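
The attached client does exactly this; for reference, here is a minimal sketch
of such a sender in plain Java (URL, class name and details are illustrative,
not the attachment's code):

import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class SlowChunkedSender {
    public static void main(String[] args) throws Exception {
        // Target URL is an assumption; point it at the consume.jsp from step 2.
        URL url = new URL("http://localhost:8080/consume.jsp");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setDoOutput(true);
        conn.setRequestMethod("POST");
        conn.setChunkedStreamingMode(1024);        // chunk size: 1024 bytes
        byte[] chunk = new byte[1024];
        try (OutputStream out = conn.getOutputStream()) {
            for (int i = 0; i < 100; i++) {        // 100 chunks per request
                out.write(chunk);
                out.flush();
                Thread.sleep(50);                  // 50 ms rest between chunks
            }
        }
        System.out.println("HTTP " + conn.getResponseCode());
        conn.disconnect();
    }
}

Running 100 such requests in parallel (for example from a fixed thread pool),
for a total of 10000 requests, reproduces the stalls shown below.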

6. Then we get a lot of zero-window packets, and the stall lasts several
seconds:
No.      stream  Time             Source        Destination   ack    win    Length  Info
 469817  3201    15:09:04.175309  172.16.1.4    172.16.1.5    1      29696  57      [TCP segment of a reassembled PDU]
 469904  3201    15:09:04.214945  172.16.1.5    172.16.1.4    4238   1024   54      8080→58750 [ACK] Seq=1 Ack=4238 Win=1024 Len=0
 470091  3201    15:09:04.437137  172.16.1.4    172.16.1.5    1      29696  1078    [TCP Window Full] [TCP segment of a reassembled PDU]
 470092  3201    15:09:04.437142  172.16.1.5    172.16.1.4    5262   0      54      [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
 470334  3201    15:09:04.657120  172.16.1.4    172.16.1.5    1      29696  54      [TCP Keep-Alive] 58750→8080 [ACK] Seq=5261 Ack=1 Win=29696 Len=0
 470335  3201    15:09:04.657123  172.16.1.5    172.16.1.4    5262   0      54      [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
 470620  3201    15:09:05.098135  172.16.1.4    172.16.1.5    1      29696  54      [TCP Keep-Alive] 58750→8080 [ACK] Seq=5261 Ack=1 Win=29696 Len=0
 470621  3201    15:09:05.098141  172.16.1.5    172.16.1.4    5262   0      54      [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
 471017  3201    15:09:05.979136  172.16.1.4    172.16.1.5    1      29696  54      [TCP Keep-Alive] 58750→8080 [ACK] Seq=5261 Ack=1 Win=29696 Len=0
 471018  3201    15:09:05.979140  172.16.1.5    172.16.1.4    5262   0      54      [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
 471619  3201    15:09:07.743148  172.16.1.4    172.16.1.5    1      29696  54      [TCP Keep-Alive] 58750→8080 [ACK] Seq=5261 Ack=1 Win=29696 Len=0
 471620  3201    15:09:07.743151  172.16.1.5    172.16.1.4    5262   0      54      [TCP ZeroWindow] 8080→58750 [ACK] Seq=1 Ack=5262 Win=0 Len=0
 475765  3201    15:09:08.545625  172.16.1.5    172.16.1.4    5262   3072   54      [TCP Window Update] 8080→58750 [ACK] Seq=1 Ack=5262 Win=3072 Len=0
 475781  3201    15:09:08.545815  172.16.1.4    172.16.1.5    1      29696  490     [TCP segment of a reassembled PDU]
 475782  3201    15:09:08.545821  172.16.1.5    172.16.1.4    5698   3072   54      8080→58750 [ACK] Seq=1 Ack=5698 Win=3072 Len=0
 475784  3201    15:09:08.545825  172.16.1.4    172.16.1.5    1      29696  1514    [TCP segment of a reassembled PDU]


After a lot of investigation, I found this is caused by a bug in the
BlockPoller which slows down its polling cycle considerably.

The BlockPoller keeps polling Runnable events from its queue and running them
until the queue is empty. But running a RunnableAdd event may wake the Worker
thread, and the Worker thread may then add yet another event to the queue. As a
result, the queue shrinks very slowly, the events() method takes a long time to
return, and the next selection is delayed.

// events() method of the BlockPoller thread
public boolean events() {
    boolean result = false;
    Runnable r = null;
    result = (events.size() > 0);
    while ( (r = events.poll()) != null ) {
        // running the "r" event may cause more events to 
        // be added to the queue. then the queue will be hard 
        // to shrink, and the loop may be hard to finish
        r.run();
        result = true;
    }
    return result;
}

// the following e1 and e2 lines in RunnableAdd will wake the Worker thread
public void run() {
    SelectionKey sk = ch.keyFor(selector);
    try {
        if (sk == null) {
            sk = ch.register(selector, ops, key);
            ref.key = sk;
        } else if (!sk.isValid()) {
            cancel(sk, key, ops); // <-- e1
        } else {
            sk.interestOps(sk.interestOps() | ops);
        }
    } catch (CancelledKeyException cx) {
        cancel(sk, key, ops); // <-- e2
    } catch (ClosedChannelException cx) {
        cancel(null, key, ops);
    }
}
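
For context, cancel(...) is what actually wakes the Worker: it ends up counting
down the socket's read/write latch, roughly like the sketch below (paraphrased
from the Tomcat sources, so signatures and details may differ slightly between
versions):

public void cancel(SelectionKey sk, NioSocketWrapper key, int ops) {
    if (sk != null) {
        sk.cancel();
        sk.attach(null);
        // Counting down the latch releases a Worker blocked in
        // awaitReadLatch()/awaitWriteLatch(), e.g. at line L3 of the
        // read() method shown below.
        if (SelectionKey.OP_WRITE == (ops & SelectionKey.OP_WRITE)) countDown(key.getWriteLatch());
        if (SelectionKey.OP_READ == (ops & SelectionKey.OP_READ)) countDown(key.getReadLatch());
    }
}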

// when the Worker is woken at line L3, it will start the next loop
// and may add another event at line L2
public int read(ByteBuffer buf, NioChannel socket, long readTimeout) throws IOException {
    SelectionKey key = socket.getIOChannel().keyFor(socket.getPoller().getSelector());
    if (key == null) throw new IOException("Key no longer registered");
    KeyReference reference = keyReferenceStack.pop();
    if (reference == null) {
        reference = new KeyReference();
    }
    NioSocketWrapper att = (NioSocketWrapper) key.attachment();
    int read = 0;
    boolean timedout = false;
    int keycount = 1;
    long time = System.currentTimeMillis();
    try {
L1:     while (!timedout) {
            if (keycount > 0) {
                read = socket.read(buf);
                if (read != 0) {
                    break;
                }
            }
            try {
                if (att.getReadLatch() == null || att.getReadLatch().getCount() == 0) att.startReadLatch(1);
L2:             poller.add(att, SelectionKey.OP_READ, reference);
                if (readTimeout < 0) {
                    att.awaitReadLatch(Long.MAX_VALUE, TimeUnit.MILLISECONDS);
                } else {
L3:                 att.awaitReadLatch(readTimeout, TimeUnit.MILLISECONDS);
                }
            } catch (InterruptedException ignore) {
                // Ignore
            }
            if (att.getReadLatch() != null && att.getReadLatch().getCount() > 0) {
                keycount = 0;
            } else {
                keycount = 1;
                att.resetReadLatch();
            }
            if (readTimeout >= 0 && (keycount == 0))
                timedout = (System.currentTimeMillis() - time) >= readTimeout;
        } //while
        if (timedout)
            throw new SocketTimeoutException();
    } finally {
        poller.remove(att, SelectionKey.OP_READ);
        if (timedout && reference.key != null) {
            poller.cancelKey(reference.key);
        }
        reference.key = null;
        keyReferenceStack.push(reference);
    }
    return read;
}

In fact, here is a scenario in which the above happens:

 1. [Worker]: adds an event and waits on the readLatch
 2. [BlockPoller]: registers the operation interest on the corresponding
    SelectionKey
 3. [BlockPoller]: performs the selection
 4. [BlockPoller]: wakes the Worker based on the selection result
 5. [Worker]: wakes up and reads data successfully
 6. [Worker]: adds a remove event (e1) to the queue and returns from the read
    method
 7. [Worker]: the caller invokes the read method again
 8. [Worker]: reads data and gets nothing
 9. [Worker]: adds an event (e2) via the poller.add(att, ops, ref) method
10. [Worker]: waits on the readLatch
11. [BlockPoller]: enters the next loop and executes the events() method,
    polling events from the events queue and running them
12. [BlockPoller]: gets e1 and runs it; the SelectionKey turns invalid (until
    the next selection, the SelectionKey stays invalid)
13. [BlockPoller]: gets e2 and runs it; since the SelectionKey is invalid, it
    wakes the Worker thread and discards the event
14. [BlockPoller]: polls other events from the queue and runs them
15. [Worker]: (before the queue is empty) is woken up and reads again, but gets
    nothing
16. [Worker]: adds another event (e3) via the poller.add(att, ops, ref) method
17. [Worker]: waits on the readLatch again
18. [BlockPoller]: gets e3 and runs it, finds the key invalid again, wakes the
    Worker thread, and discards the event
19. [Worker]: is woken up again and adds an event again...

Steps 11 to 17 repeat again and again (step 18 onwards is just a replay of
those steps) as long as the event added in step 16 reaches the queue before the
queue has been drained, because the BlockPoller always tries to empty the
queue.

The BlockPoller and the Worker thus enter a vicious loop: the BlockPoller keeps
trying to empty the queue, while the Worker keeps adding more events to it.

My proposed fix is to stop the BlockPoller from trying to fully drain the queue
before moving on to the selection step. Instead, it should only handle the
events that are already in the queue at the moment it starts processing them;
events added later are not handled in the current BlockPoller iteration but are
deferred to the next one. The selection operation removes all invalid
SelectionKeys, so by the time of the next iteration the problem cannot recur,
since the SelectionKeys are then all fresh and valid.
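
A minimal sketch of that idea against the events() method shown above (it only
assumes the existing events queue field with its size() and poll() methods):

public boolean events() {
    boolean result = false;
    // Take a snapshot of the queue size on entry. Events added while this
    // loop is running (e.g. by a Worker woken from r.run()) are not
    // processed now; they are deferred to the next BlockPoller iteration,
    // which runs after a selection has refreshed the SelectionKeys.
    int size = events.size();
    for (int i = 0; i < size; i++) {
        Runnable r = events.poll();
        if (r == null) {
            break; // queue drained sooner than the snapshot suggested
        }
        r.run();
        result = true;
    }
    return result;
}

This keeps each events() call bounded by the number of events that were pending
when it started, so the loop can no longer be kept alive indefinitely by the
Worker.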
