Hi,

Has anyone seen anything like this before? Is there any more information I could provide to help track this problem down?

We're running Cyrus 2.1.x on FreeBSD 4.11-RELEASE. During the busiest time of the day we get up to about 2000 concurrent connections to the imaps port.

Sometimes we suddenly get a huge spike of connections (e.g. 4000 in total). netstat reports the "extra" connections as being in the CLOSE_WAIT state, i.e. the client has closed its side of the connection, but the imapd has not yet closed its end. When this happens, there are still a bunch of imapds that appear to be working fine, i.e. they don't all stop working at once.

This is what it looks like:

Active Internet connections
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)
[...]
tcp4     102      0  xxx.yyy.zzz.www.993    xxx.yyy.65.101.1480    CLOSE_WAIT
tcp4     105      0  xxx.yyy.zzz.www.993    xxx.yyy.5.211.4289     CLOSE_WAIT
tcp4       0      0  xxx.yyy.zzz.www.993    xxx.yyy.35.232.1721    CLOSE_WAIT
tcp4      84      0  xxx.yyy.zzz.www.993    xxx.yyy.28.3.3480      CLOSE_WAIT
tcp4      98      0  xxx.yyy.zzz.www.993    xxx.yyy.66.14.50285    CLOSE_WAIT
tcp4       0      0  xxx.yyy.zzz.www.993    xxx.yyy.28.34.1373     CLOSE_WAIT
tcp4       0      0  xxx.yyy.zzz.www.993    xxx.yyy.35.232.3552    CLOSE_WAIT
tcp4      57  32147  xxx.yyy.zzz.www.993    xxx.yyy.5.182.1316     CLOSE_WAIT

Most of the CLOSE_WAIT connections appear to have a small amount of data in the receive queue and nothing in the send queue.
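
To illustrate what CLOSE_WAIT means from the server's side, here is a minimal sketch in Python (this is not Cyrus code; the function name `drain_until_eof` and the loopback setup are invented purely for illustration). The peer sends a few bytes and closes, so the accepted socket sits in CLOSE_WAIT with data in its receive queue until the server reads to end-of-file and closes its own end.

```python
import socket


def drain_until_eof():
    """Show a server-side socket going through CLOSE_WAIT on loopback.

    Returns (data, saw_eof): the bytes read before EOF, and whether recv()
    eventually returned b"" (the peer's FIN as seen by the application).
    """
    # Hypothetical test setup: port 0 lets the OS pick a free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)  # avoid hanging forever if something goes wrong

    # The client sends a little data and closes. Its FIN moves server_side
    # into CLOSE_WAIT; the unread bytes are what netstat shows as Recv-Q.
    client.sendall(b"a001 LOGOUT\r\n")
    client.close()

    chunks = []
    saw_eof = False
    while True:
        chunk = server_side.recv(4096)
        if chunk == b"":
            # A zero-length read means the peer has closed its side.
            saw_eof = True
            break
        chunks.append(chunk)

    # Only this close() takes the socket out of CLOSE_WAIT.
    server_side.close()
    listener.close()
    return b"".join(chunks), saw_eof


if __name__ == "__main__":
    print(drain_until_eof())
```

A process that never gets back to its read call never sees the zero-length read, so it never reaches the close() and the socket stays in CLOSE_WAIT, which would be consistent with the netstat output above.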

As soon as the imapd reads from the connection (maybe twice) it will notice that the connection has been closed and, presumably, close its end too, but these imapds seem to be stuck in a select loop that looks like this:

# truss -p 85462
(null)()                                         = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
select(0x0,0x0,0x0,0x0,0xbfbfe808)               = 0 (0x0)
^C

If I run lsof on one of the imapds, I see that it still has file descriptors attached to the connection to the client:

# lsof -i @xxx.yyy.65.101:1480
COMMAND   PID  USER   FD   TYPE     DEVICE SIZE/OFF NODE NAME
imapd   85462 cyrus    0u  IPv4 0xef4a0c00      0t0  TCP xxx.yyy.zzz.www:imaps->xxx.yyy.65.101:1480 (CLOSE_WAIT)
imapd   85462 cyrus    1u  IPv4 0xef4a0c00      0t0  TCP xxx.yyy.zzz.www:imaps->xxx.yyy.65.101:1480 (CLOSE_WAIT)
imapd   85462 cyrus    2u  IPv4 0xef4a0c00      0t0  TCP xxx.yyy.zzz.www:imaps->xxx.yyy.65.101:1480 (CLOSE_WAIT)

Thanks in advance for any advice.

-- 
Michael Wood <[EMAIL PROTECTED]>

---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html