Here's this morning's failure. For testing, I've set http-timeout=67. I am trying to see if the zero window is a coincidence or not.
The debug output: Reading 8192 bytes of response body. Got 8192 bytes. XML: Parsing 8192 bytes. XML: char-data (235) returns 0 XML: xmlParseChunk returned 0 Reading 8192 bytes of response body. Got 6180 bytes. XML: Parsing 6180 bytes. XML: char-data (235) returns 0 XML: xmlParseChunk returned 0 Reading 8192 bytes of response body. Aborted request (-3): Could not read response body sess: Closing connection. sess: Connection closed. Request ends, status 200 class 2xx, error line: Could not read response body: connection was closed by server Running destroy hooks. Request ends. svn: E175002: REPORT of '/svn/repos/deckard-65x/!svn/me': Could not read response body: connection was closed by server (http://foundry51.qnx.com) sess: Destroying session. sess: Destroying session. The network capture: 391174 498.610439000 10.222.3.88 10.14.11.50 HTTP 1434 Continuation or non-HTTP traffic 391175 498.610495000 10.14.11.50 10.222.3.88 TCP 54 37480 > http [ACK] Seq=1945767 Ack=480574851 Win=1280 Len=0 391176 498.842179000 10.222.3.88 10.14.11.50 HTTP 1334 [TCP Window Full] Continuation or non-HTTP traffic 391177 499.042264000 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391178 499.297972000 10.222.3.88 10.14.11.50 TCP 60 [TCP Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0 391179 499.298065000 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391180 499.796762000 10.222.3.88 10.14.11.50 TCP 60 [TCP Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0 391181 499.796811000 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391182 500.778349000 10.222.3.88 10.14.11.50 TCP 60 [TCP Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0 391183 500.778405000 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391184 502.728191000 10.222.3.88 10.14.11.50 TCP 60 [TCP Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0 391185 502.728260000 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391186 506.614031000 10.222.3.88 10.14.11.50 TCP 60 [TCP Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0 391187 506.614093000 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391188 514.371638000 10.222.3.88 10.14.11.50 TCP 60 [TCP Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0 391189 514.371694000 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391190 529.872211000 10.222.3.88 10.14.11.50 TCP 60 [TCP Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0 391191 *REF* 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391192 30.988601000 10.222.3.88 10.14.11.50 TCP 60 [TCP Keep-Alive] http > 37480 [ACK] Seq=480576130 Ack=1945767 Win=908288 Len=0 391193 30.988645000 10.14.11.50 10.222.3.88 TCP 54 [TCP ZeroWindow] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=0 Len=0 391194 71.282655000 10.14.11.50 10.222.3.88 TCP 54 [TCP Window Update] 37480 > http [ACK] Seq=1945767 Ack=480576131 Win=1171456 Len=0 391195 71.296598000 10.222.3.88 10.14.11.50 HTTP 1434 Continuation or non-HTTP traffic which results in: 392598 71.906863000 10.222.3.88 10.14.11.50 HTTP 1434 Continuation or non-HTTP traffic 392599 71.906925000 10.14.11.50 10.222.3.88 TCP 54 37480 > http [ACK] Seq=1945767 Ack=482310892 Win=987904 Len=0 392600 72.043276000 10.14.11.50 10.222.3.88 TCP 54 [TCP Window Update] 37480 > http [ACK] Seq=1945767 Ack=482310892 Win=1188096 Len=0 392601 72.375325000 10.14.11.50 10.222.3.88 TCP 54 37480 > http [FIN, ACK] Seq=1945767 Ack=482310892 Win=1188096 Len=0 392602 72.388690000 10.222.3.88 10.14.11.50 TCP 60 http > 37480 [ACK] Seq=482310892 Ack=1945768 Win=908288 Len=0 There are two interesting things: The long gap after the *REF* packet is unusual, it's the first time I've seen that. The FIN packet is 72 seconds later, not 67 like I set the timeout. OTOH, it's close, and, given the long gap no packets occur at FIN - 67 seconds. I'm going to keep trying random timeout values. ...Stu On Tue, Jan 14, 2014 at 9:27 AM, Stuart MacDonald <[email protected]>wrote: > I don't have access to the server log, but you can see from the wireshark > capture that the *client* closed the connection (look for the RST packet), > not the server. The error of "socket closed" is consistent with that. > > I suspect that there's an issue handling the "http-timeout" timer; it gets > triggered during a TCP zero window situation, and then not canceled > properly, and then it fires, closing the socket. > > That would also fit the data of the bug being in 1.6 (reported by someone > else, but I linked to it in my original post), 1.7 and 1.8; it's a bug in > svn's handling of the underlying HTTP interaction layer, and not in that > layer or the server or somewhere else. > > I could be wrong. When I look at the evidence so far, that's what I see. > > ...Stu > > > On Tue, Jan 14, 2014 at 6:33 AM, Philip Martin <[email protected] > > wrote: > >> [email protected] writes: >> >> > I have another failure, with some debug info, and some updates. >> > >> > We're switching to a different system for this particular repository, >> so I >> > may or may not be able to continue debug efforts. :-( >> > >> > Since there's no debug info available for the svn 1.8 client, I haven't >> > been running that. >> > >> > I went a full week without a failure, which I suspect is due to the >> > enormous amount of debugging that neon-debug-mask=255 produces, so this >> > week I dialed it back to 127 and got a failure on the first try. >> > >> > Here's the latest failure. First up, the debug info: >> > Got 8192 bytes. >> > XML: Parsing 8192 bytes. >> > XML: char-data (235) returns 0 >> > XML: xmlParseChunk returned 0 >> > Reading 8192 bytes of response body. >> > Got 6600 bytes. >> > XML: Parsing 6600 bytes. >> > XML: char-data (235) returns 0 >> > XML: xmlParseChunk returned 0 >> > Reading 8192 bytes of response body. >> > Aborted request (-3): Could not read response body >> >> That message comes from ne_request.c and the -3 is NE_SOCK_CLOSED >> defined in ne_socket.h: >> >> #define NE_SOCK_CLOSED (-3) >> /* Connection was reset (e.g. server crashed) */ >> >> > sess: Closing connection. >> > sess: Connection closed. >> > Request ends, status 200 class 2xx, error line: >> > Could not read response body: connection was closed by server >> > Running destroy hooks. >> > Request ends. >> > svn: E175002: REPORT of '/svn/repos/<name>/!svn/me': Could not read >> > response body: connection was closed by server (http://<server>) >> > sess: Destroying session. >> > sess: Destroying session. >> >> What does the server error log show? >> >> -- >> Philip Martin | Subversion Committer >> WANdisco // *Non-Stop Data* >> > >
