Does Cyrus have anything to do with the TCP states? Yes and no. The TCP states shown by netstat are set by the kernel protocol code, but the kernel sets them in response to what the user space program does with the socket. Once the peer has sent its FIN, the connection stays in CLOSE_WAIT until the local program closes the socket. Basically the user space program, imapd in this case, has to play the game.
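To make "play the game" concrete, here is a minimal sketch in Python. It is not Cyrus code; the function names drain_and_close and demo and the 13-byte payload are made up for illustration. It shows the server's half of the close: read until recv() returns an empty result (the peer's FIN arriving as end-of-file), then call close(), which is what lets the kernel send the server's own FIN and leave CLOSE_WAIT.

```python
import socket


def drain_and_close(conn, bufsize=4096):
    """Read until the peer's FIN shows up as EOF, then close our side."""
    chunks = []
    while True:
        data = conn.recv(bufsize)
        if not data:  # b"" means the peer has sent its FIN
            break
        chunks.append(data)
    conn.close()  # only now can the kernel send our FIN
    return b"".join(chunks)


def demo():
    """Loopback demo: the client sends a little data and then a FIN."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick one
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    client.settimeout(5)
    conn, _ = listener.accept()

    client.sendall(b"a004 LOGOUT\r\n")  # hypothetical payload
    client.shutdown(socket.SHUT_WR)     # client sends its FIN

    # From here until drain_and_close() finishes, the server-side socket
    # sits in CLOSE_WAIT with unread data queued on it.
    leftover = drain_and_close(conn)

    tail = client.recv(1)  # b"" once the server's FIN has arrived
    client.close()
    listener.close()
    return leftover, tail


if __name__ == "__main__":
    print(demo())
```

If the server never gets as far as the close() call, the socket stays in CLOSE_WAIT indefinitely, however long you wait.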
Looking at my netstat output again, I can see more of what the problem is: there is still data on the Recv-Q. imapd has to read this data and then close the socket before the kernel can finish closing the session down. For some reason it never does.

I have now been running 2.1.2 on the exact same box that the 2.1.3 code was on (same conf files and everything) for going on 3 days and there has not been a single occurrence of this problem. 2.1.3 would certainly have had the problem many times in that period.

A tcpdump and a netstat of the problem are below.

-------------------------------
16:07:27.304676 192.168.42.50.3299 > 192.168.42.4.imap: S 3650822516:3650822516(0) win 16384 <mss 1460,nop,nop,sackOK> (DF)
16:07:27.304746 192.168.42.4.imap > 192.168.42.50.3299: S 2681651357:2681651357(0) ack 3650822517 win 5840 <mss 1460,nop,nop,sackOK> (DF)
16:07:27.304886 192.168.42.50.3299 > 192.168.42.4.imap: . ack 1 win 17520 (DF)
16:07:27.307560 192.168.42.4.imap > 192.168.42.50.3299: P 1:62(61) ack 1 win 5840 (DF)
16:07:27.307827 192.168.42.50.3299 > 192.168.42.4.imap: P 1:18(17) ack 62 win 17459 (DF)
16:07:27.307861 192.168.42.4.imap > 192.168.42.50.3299: . ack 18 win 5840 (DF)
16:07:27.308209 192.168.42.4.imap > 192.168.42.50.3299: P 62:299(237) ack 18 win 5840 (DF)
16:07:27.308496 192.168.42.50.3299 > 192.168.42.4.imap: P 18:49(31) ack 299 win 17222 (DF)
16:07:27.310508 192.168.42.4.imap > 192.168.42.50.3299: P 299:323(24) ack 49 win 5840 (DF)
16:07:27.310732 192.168.42.50.3299 > 192.168.42.4.imap: P 49:60(11) ack 323 win 17198 (DF)
16:07:27.310861 192.168.42.4.imap > 192.168.42.50.3299: P 323:335(12) ack 60 win 5840 (DF)
16:07:27.312444 192.168.42.50.3299 > 192.168.42.4.imap: P 60:66(6) ack 335 win 17186 (DF)
16:07:27.312573 192.168.42.4.imap > 192.168.42.50.3299: P 335:354(19) ack 66 win 5840 (DF)
16:07:27.312906 192.168.42.50.3299 > 192.168.42.4.imap: P 66:84(18) ack 354 win 17167 (DF)
16:07:27.315684 192.168.42.4.imap > 192.168.42.50.3299: . 354:1814(1460) ack 84 win 5840 (DF)
16:07:27.315698 192.168.42.4.imap > 192.168.42.50.3299: P 1814:2014(200) ack 84 win 5840 (DF)
16:07:27.316051 192.168.42.50.3299 > 192.168.42.4.imap: . ack 354 win 17167 <nop,nop,sack sack 1 {1814:2014} > (DF)
16:07:27.515914 192.168.42.4.imap > 192.168.42.50.3299: . 354:1814(1460) ack 84 win 5840 (DF)
16:07:27.516311 192.168.42.50.3299 > 192.168.42.4.imap: . ack 2014 win 17520 (DF)
16:07:27.518882 192.168.42.50.3299 > 192.168.42.4.imap: P 84:113(29) ack 2014 win 17520 (DF)
16:07:27.555911 192.168.42.4.imap > 192.168.42.50.3299: . ack 113 win 5840 (DF)
16:08:27.511961 192.168.42.50.3299 > 192.168.42.4.imap: P 113:126(13) ack 2014 win 17520 (DF)
16:08:27.512011 192.168.42.4.imap > 192.168.42.50.3299: . ack 126 win 5840 (DF)
16:08:27.512039 192.168.42.50.3299 > 192.168.42.4.imap: F 126:126(0) ack 2014 win 17520 (DF)
16:08:27.545761 192.168.42.4.imap > 192.168.42.50.3299: . ack 127 win 5840 (DF)
---------------------------------------------------
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:32768           0.0.0.0:*               LISTEN      594/rpc.statd
tcp        0      0 127.0.0.1:32769         0.0.0.0:*               LISTEN      820/xinetd
tcp        0      0 0.0.0.0:143             0.0.0.0:*               LISTEN      2667/master
tcp        0      0 0.0.0.0:111             0.0.0.0:*               LISTEN      566/portmap
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN      961/httpd
tcp        0      0 192.168.42.4:53         0.0.0.0:*               LISTEN      762/named
tcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN      762/named
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      787/sshd
tcp        0      0 0.0.0.0:25              0.0.0.0:*               LISTEN      998/master
tcp        0      0 127.0.0.1:953           0.0.0.0:*               LISTEN      762/named
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN      961/httpd
tcp        0      0 192.168.42.4:143        192.168.42.50:3251      ESTABLISHED 2677/imapd
tcp       14      0 192.168.42.4:143        192.168.42.50:3299      CLOSE_WAIT  2679/imapd
tcp        0     48 192.168.42.4:22         192.168.42.50:3009      ESTABLISHED 1212/sshd
udp        0      0 0.0.0.0:32768           0.0.0.0:*                           594/rpc.statd
udp        0      0 0.0.0.0:32769           0.0.0.0:*                           762/named
udp        0      0 0.0.0.0:770             0.0.0.0:*                           594/rpc.statd
udp        0      0 192.168.42.4:53         0.0.0.0:*                           762/named
udp        0      0 127.0.0.1:53            0.0.0.0:*                           762/named
udp        0      0 0.0.0.0:111             0.0.0.0:*                           566/portmap
udp        0      0 192.168.42.4:123        0.0.0.0:*                           709/ntpd
udp        0      0 127.0.0.1:123           0.0.0.0:*                           709/ntpd
udp        0      0 0.0.0.0:123             0.0.0.0:*                           709/ntpd

> -----Original Message-----
> From: Scott Adkins [mailto:[EMAIL PROTECTED]]
> Sent: Saturday, 30 March 2002 1:02 a.m.
> To: Jeremy Howard; Mike Brady
> Cc: [EMAIL PROTECTED]
> Subject: Re: Problem with imapd 2.1.3 not closing TCP session
>
>
> I will post a problem report from what we have experienced on the Compaq
> Alpha Tru64 system a little later this morning (I am pressed for time at
> this very second). In any the case, we do experience the lossage of ports
> from time to time, but I believe I understand why (which is what I will
> explain).
>
> As for the CLOSE_WAIT and FIN stuff, I believe this is an unrelated
> problem to the above. In fact, I don't believe Cyrus has anything to do
> with TCP states... the socket has either been closed or it has not been
> closed. I can't seem to recall exactly, but I do not believe that even
> doing raw sockets can you affect the state of TCP connections. This is an
> operating system issue (and thus Linux's issue), and the problem is in the
> kernel and not the application. Unfortunately, I don't have any answers to
> this.
>
> Of course, I could be wrong, in which case I will probably see a lot of
> corrections in my mailbox...
>
> Scott
>
> --On Friday, March 29, 2002 9:44 PM +1100 Jeremy Howard
> <[EMAIL PROTECTED]> wrote:
>
> > Mike Brady wrote:
> >
> >> I am observing a problem where imapd occasionally does not close the
> >> TCP session properly. This only seems to occur with 2.1.3.
> >> <...original details at end...>
> >>
> > We are also seeing an odd problem with 2.1.3. It may or may not be
> > related to Mike's issue. We are using the skiplist backend for mailboxes
> > and seen state. We have disabled the TLS session cache. We are using
> > Linux 2.4.18 with Ext3.
> >
> > 2 times in the last week our IMAP server has suddenly stopped accepting
> > IMAP connections on port 143 on its ethernet interface. However, it
> > continues to accept IMAP connections on port 993 (SSL) and the localhost
> > interface. Our cyrus.conf lines are:
> > ----
> > imap cmd="imapd" listen="www.fastmail.fm:imap" prefork=20
> > imaplocal cmd="imapd" listen="[127.0.0.1]:imap" prefork=25
> > imaps cmd="imapd -s" listen="www.fastmail.fm:imaps" prefork=2
> > ----
> >
> > Restarting Cyrus clears the problem. There are no unusual messages in
> > imapd.conf or log/messages. We have plenty of spare system file
> > descriptors.
> >
> > Both times this has happened has been at the busiest time of the day.
> > However we had plenty of IO and CPU capacity spare at the time.
> >
> > It looks to me like somehow that particular port on that interface got
> > 'filled up'. I'm not a TCP guru so I don't know exactly what that might
> > look like--is there some limit on concurrent connections or a queue of
> > waiting connections? When the lockup occured, telneting to the port
> > simply resulted in nothing at all--it just sat there waiting for 5
> > minutes...
> >
> > Could Mike's problem report of TCP connections not being closed
> > correctly lead to this kind of lockup?
> >
> > ----
> > <The rest of Mike's message...>
> >
> > The details are as follows.
> >
> > My system is RedHat 7.2 with all appropriate errata except the kernel
> > ones. The kernel is 2.4.18 compiled from tarball, but this issue also
> > occurs with the Redhat 2.4.7-10 kernel. I am currently using Outlook
> > 2000 on W2K Pro as my mail client (sorry :-).
> >
> > In addition I have compiled Postfix 1.1.4, Cyrus SASL 2.1.1 and Cyrus
> > IMAP 2.1.3 from their respective tarballs.
> >
> > The server is lightly loaded. This is my home server, I am the only
> > user and it is only doing my e-mail at the moment.
> >
> > For the most part mail works. That is, I can send and receive e-mail
> > with no problems.
> >
> > The symptom of the problem is that Outlook gets stuck while retrieving
> > messages (I can't remember the exact message on the status bar). By
> > stuck I mean that the Outlook UI is still accepting input, but it
> > doesn't do anything. The Outlook UI can be closed, but the Outlook
> > processes does not die. The only way to get rid of it is to use Task
> > Manager to kill it.
> >
> > On the server side netstat shows that the tcp session state is
> > CLOSE_WAIT. I have a tcpdump capture which shows that Outlook (well the
> > tcp stack underneath it) has sent a FIN packet and imapd has acked it,
> > which is why it is in the CLOSE_WAIT state. So far so good. However, to
> > complete the tcp session close imapd should next send a FIN which
> > Outlook should then ack. imapd never sends the FIN. I have waited for
> > hours. The only way to get rid of the imapd process is to kill it, at
> > which point it does send a RST packet. However, by this stage
> > W2K/Outlook is so out to lunch the RST does not do anything. The only
> > way to recover is to kill the Outlook.exe process.
> >
> > I have not been able to determine a way to make this happen. It has
> > occurred within seconds of opening Outlook, but sometimes it takes
> > hours.
> >
> > My initial "figuring out Cyrus" install/testing was done with 2.1.2 and
> > I did not see this behaviour. 2.1.3 came out while I was doing through
> > my learning phase, so when I did my real install I use that. I went
> > back to 2.1.2 on my live server about 24 hours ago and this problem has
> > not occurred at all.
> >
> > It has been many years since I did any C coding so I haven't tried to
> > look at the source to figure this out. If there is something specific
> > that I can look at or there is any more information that I can send
> > please let me know.
> >
> > Thanks
> >
> > Mike Brady
> > Auckland
> > New Zealand
> >
> >
> >
> --
> +-=-=-=-=-=-=-=-=-=+=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=+=-=-=-=-=-=-=-=-+
> Scott W. Adkins              http://www.cns.ohiou.edu/~sadkins/
> UNIX Systems Engineer        mailto:[EMAIL PROTECTED]
> ICQ 7626282              Work (740)593-9478       Fax (740)593-1944
> +-=-=-=-=-=-=-=-=-=+=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=+=-=-=-=-=-=-=-=-+
> PGP Public Key available at http://www.cns.ohiou.edu/~sadkins/pgp/