Ben Greear wrote:
> I haven't seen this problem on 2.6.13, so I'm now starting a manual bisect to see if I can narrow down where the problem appeared.

Turns out, I can reproduce it in 2.6.13 and 2.6.9. I haven't tried anything older.

I also tried to reproduce it using a simpler traffic generation tool, but could not. That points to something weird that my application is doing, but I can't imagine what user-space could do to screw up a TCP connection like this.

In all cases, there is a lot of data in the send queue, but for whatever reason the connection will not make progress. To user-space, it appears that poll returns neither readable nor writable for the sockets. I notice that if I increase the send buffer size while the connection is in the hung state, my app will quickly fill the larger send buffer, but it still receives nothing new.

Starting a new connection on the same interfaces works for a few seconds and then hangs as well, so the NICs can pass traffic.

Here is output from netstat and /proc/net/tcp on the 2.6.16.16 kernel.

netstat info:

tcp        0 5635368 172.1.5.169:33058      172.1.5.168:33057      ESTABLISHED
tcp        0 5987504 172.1.5.168:33057      172.1.5.169:33058      ESTABLISHED

/proc/net/tcp:

  20: A90501AC:8122 A80501AC:8121 01 0055FD28:00000000 01:00001A9F 0000000A     0        0 21309 2 f36d8580 120000 40 0 1 58
  21: A80501AC:8121 A90501AC:8122 01 005B5CB0:00000000 01:00001C9D 0000000A     0        0 21226 3 ef7bfa80 120000 40 0 1 35

-- 
Ben Greear <[EMAIL PROTECTED]>
Candela Technologies Inc  http://www.candelatech.com

-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
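
For readers who want to cross-check the /proc/net/tcp entries quoted in this message against the netstat output, the hex fields can be decoded with a few lines of Python. This is only a minimal sketch, not part of the original report. It assumes the usual IPv4 /proc/net/tcp column order (sl, local_address, rem_address, st, tx_queue:rx_queue, tr:tm->when, retrnsmt, ...) and a little-endian host, so that the address word A90501AC reads as 172.1.5.169. The names decode_addr and decode_line are made up for illustration, and only the first seven columns are interpreted.

```python
import socket
import struct

# Subset of the kernel TCP state codes printed in the "st" column.
TCP_STATES = {0x01: "ESTABLISHED", 0x02: "SYN_SENT", 0x03: "SYN_RECV", 0x0A: "LISTEN"}

# timer_active codes (first half of the "tr:tm->when" column).
TIMERS = {0: "none", 1: "retransmit", 2: "other (e.g. keepalive)",
          3: "time_wait", 4: "zero-window probe"}


def decode_addr(hex_addr):
    """Turn 'A90501AC:8122' into an (ip, port) tuple.

    Assumption: the address was printed on a little-endian machine,
    so the 32-bit word has to be unpacked as little-endian.
    """
    ip_hex, port_hex = hex_addr.split(":")
    ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
    return ip, int(port_hex, 16)


def decode_line(line):
    """Decode the leading columns of one /proc/net/tcp entry."""
    f = line.split()
    tx_hex, rx_hex = f[4].split(":")
    timer_hex, when_hex = f[5].split(":")
    state = int(f[3], 16)
    timer = int(timer_hex, 16)
    return {
        "local": decode_addr(f[1]),
        "remote": decode_addr(f[2]),
        "state": TCP_STATES.get(state, hex(state)),
        "tx_queue": int(tx_hex, 16),      # bytes queued for sending
        "rx_queue": int(rx_hex, 16),      # bytes waiting to be read
        "timer": TIMERS.get(timer, str(timer)),
        "timer_jiffies": int(when_hex, 16),
        "retransmits": int(f[6], 16),
    }


if __name__ == "__main__":
    sample = ("20: A90501AC:8122 A80501AC:8121 01 0055FD28:00000000 "
              "01:00001A9F 0000000A 0 0 21309 2 f36d8580 120000 40 0 1 58")
    for key, value in decode_line(sample).items():
        print(key, value)
```

Applied to entry 20, this gives local 172.1.5.169:33058, remote 172.1.5.168:33057, state ESTABLISHED, and a tx_queue of 5635368 bytes, which agrees with the Send-Q column netstat reports for the same socket. The timer and retrnsmt columns decode to a pending retransmit timer and a retransmit count of 10 on both sockets.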