Hi All,

We just set up two new mail servers on Red Hat 7.0 (2.2.19 kernel) using:

  Cyrus 2.0.16
  SASL 1.5.24 (from source)
  Berkeley DB 3.1.17-5 (Red Hat RPM)
  Authentication via LDAP: OpenLDAP 1.2.11-15 (Red Hat RPM), using a
    hacked version of LDAP pwcheck from
    http://www.linc-dev.com/Files/pwcheck_ldap.c.txt
  MTA: qmail-1.03, managed by daemontools-0.76 with tcpserver from
    ucspi-tcp-0.88. Qmail calls Cyrus's deliver through a shell script.
  Hardware: HP LH4, dual P3 Xeon 500 MHz with 2GB of RAM (alas, only 1GB
    usable), twelve 18GB 15K RPM Ultra3 (160 MB/s) drives in a RAID 5
    array, and a Gigabit Ethernet card.

In addition to this, I am running websieve, easysieve, IMP 2.2.5 for
webmail, mrtg and qmail-mrtg for monitoring, and ezmlm 0.53 with
ezmlm-idx for mailing lists.

I was very happy to get all this running and all 20,000+ of our users'
accounts transferred over from our old Cyrus 1.5.x servers. Thanks to
everyone who has ever posted to this list; it was an invaluable
resource!

Unfortunately, since the upgrade, on the more heavily used employee box
(1,300 users, 200+ simultaneously active), we have been experiencing
the problem described in the earlier posting quoted below. Twice in the
two weeks we have been up, an IMAP process appears to have locked up a
mailbox. Subsequent attempts to access the mailbox lock up additional
IMAP processes, and then lmtpd processes attempting to deliver to this
mailbox hang as well. Eventually, if we are not paying attention, 10
lmtpd processes are blocked and mail delivery stops, since I set the
maximum number of concurrent local deliveries to 10 in qmail.

The workarounds recommended below seem to work (thanks for posting them
to the list), but does anyone have any ideas as to the root cause of
this problem? I tried to search the archive, but was unable to find the
complete thread of the discussion referenced here. Can someone point me
to some relevant keywords or a date range to look at?

Any help would be most appreciated.

Thanks,
John Wade

"John C. Amodeo" wrote:

> Greetings,
>
> A while back there was some discussion on the list about Cyrus imap
> processes hanging or locking a user's mailbox, and the only 2 ways to
> rectify the problem were to:
>
> 1) Hunt down the PID that was waiting for an exclusive lock and kill it
> (thus unfreezing the lock and allowing other processes such as LMTP to
> deliver messages to the mailbox), or...
> 2) Restart the Cyrus master process, which also fixes the problem, but
> kicks everyone off the server.
>
> This is actually happening as frequently as every few weeks for us,
> where a user will try to delete a message, or copy a message from the
> Inbox to a sub folder, and everything locks. Then, because of the lock,
> Postfix (or Sendmail) has to defer mail because neither can deliver
> through LMTP. This problem will not go away until the sysadmin is
> notified and has a chance to log into the server and kill the IMAP Pids
> for this user.
>
> I wanted to share with everyone two (somewhat dirty) scripts we've
> hacked up to make our lives easier. If anyone wishes to spice these up
> and modify them for the better, please do.
>
> Script 1: "kill_proc" - Simple Shell Script for use on the local Unix
> file system by the Root user
> --
> #!/bin/sh
> #
> # Kill all Cyrus IMAP Pids for a particular user on the server
> # Use: sh kill_proc <path> <username>
> # Example: [root@server]# sh kill_proc /var/imap/proc joeblow
> #
> # Note: pushd/popd are bash builtins, so /bin/sh must be bash
> # (as it is on Red Hat).
>
> if pushd "$1"; then
>     # Each file in the proc directory is named after a PID, so the
>     # file names that grep reports are the PIDs to kill.
>     kill `grep "$2" * | cut -f1 -d":"`;
>     rm -i `grep "$2" * | cut -f1 -d":"`;
>     echo "IMAP Pids Killed";
>     popd;
> else
>     echo "Bad Directory";
> fi
> --
>
> Script 2: "kill_proc.cgi" - Web-based version, written in Python, that
> is designed so if the System Administrator is not around, a user can
> kill their own Pids by going to a web page. This works by acquiring the
> IP address of the web browser the user is coming in from and using this
> address to grep the /var/imap/proc directory and kill any Pids that are
> associated with the IP address.
> This file needs to be placed in your cgi-bin directory, with the
> following permissions: -rwsr-x--- cyrus apache (apache could also be
> 'nobody'). Not 100% secure, but the most damage that could be done is
> someone spoofing IP addresses and killing Imap processes that don't
> belong to them. I am thinking of adding some sort of Imap
> authentication for this, but haven't had a chance...
> --
> #!/usr/bin/python
> import os, sys, string
> print "Content-Type: text/html\n\n"
>
> ## Variables To Change:
> ## apacheid = uid of the web user - usually nobody or apache
> ## path = path to the Cyrus proc directory
> ## os.setuid() = towards bottom of the script - set to the uid of the
> ##               Cyrus user (90?)
>
> apacheid = 48
> path = "/var/imap/proc/*"
>
> def test():
>     # Refuse to run unless called by the web server with a client address
>     try:
>         ipaddr = os.environ['REMOTE_ADDR']
>         if (apacheid != os.getuid()):
>             print "This can only be accessed via the Web"
>             sys.exit()
>     except:
>         print "There was an error... exiting"
>         sys.exit()
>
> def filelist(command):
>     # Run grep and return the matching file names (text before the ':')
>     try:
>         grep_out = os.popen(command, 'r')
>         flist = []
>         for line in grep_out.readlines():
>             words = string.split(line, ':')
>             flist.append(words[0])
>         return flist
>     except:
>         print "<br>Unable to find any locks<br>"
>         sys.exit()
>
> test()
> grepcommand = "grep -r \'" + os.environ['REMOTE_ADDR'] + "\>\' " + path
> # Set to Cyrus UID
> os.setuid(90)
>
> pidfilelist = filelist(grepcommand)
> for pidfile in pidfilelist:
>     # The last path component of each proc file is the PID
>     words = string.split(pidfile, '/')
>     pid = int(words[-1])
>     try:
>         os.kill(pid, 15)
>         os.remove(pidfile)
>     except:
>         print "<br> Error trying to remove ", pidfile, " and kill ", pid
>         sys.exit()
>
> html = """
> <h4>Mailbox unlocked. Please login to server to verify your account is
> unlocked.</h4>
> """
> print html
> --
>
> Hopefully these scripts can make someone's life a little easier, so I
> figured it would be a good idea to post them.
>
> -John
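
One thing to watch in the quoted shell script: a plain grep matches
substrings, so searching for "joe" would also pick up "joeblow". Below is
a minimal Python 3 sketch of a stricter lookup that only lists PIDs and
kills nothing. It assumes, as the quoted scripts do, that every file in
the proc directory is named after a PID, and it further assumes the file
body is made of whitespace-separated fields, one of which is the user
name. find_user_pids is a hypothetical helper name, not part of Cyrus.

```python
import os
import tempfile


def find_user_pids(proc_dir, username):
    """Return sorted PIDs whose proc file contains username as a whole field."""
    pids = []
    for name in os.listdir(proc_dir):
        if not name.isdigit():
            continue  # assumption: proc files are named after the PID
        path = os.path.join(proc_dir, name)
        try:
            with open(path, "r", errors="replace") as handle:
                # assumption: fields are separated by whitespace
                fields = handle.read().split()
        except OSError:
            continue  # the process may have exited during the scan
        if username in fields:  # whole-field match, not a substring match
            pids.append(int(name))
    return sorted(pids)


if __name__ == "__main__":
    # Demo on a throwaway directory, not on a real /var/imap/proc
    with tempfile.TemporaryDirectory() as demo_dir:
        samples = {
            "101": "host.example.com\tjoeblow\tuser.joeblow\n",
            "102": "host.example.com\tjoe\tuser.joe\n",
        }
        for pid, body in samples.items():
            with open(os.path.join(demo_dir, pid), "w") as handle:
                handle.write(body)
        print(find_user_pids(demo_dir, "joe"))
```

On a real server the call would be something like
find_user_pids("/var/imap/proc", "joeblow"), and the returned list can be
reviewed by hand before anything is passed to kill.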