We hit the same problem with Cyrus/IMP, including the DBERROR "lockers" messages, four days ago. We worked around it by restarting Cyrus and haven't seen it since. I actually suspect the problem lies in lmtpd, because our sessions continued to hang with the locking error until we stopped and restarted Cyrus.
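For reference, the restart-with-recovery procedure that cleared the stuck lockers for us looked roughly like the sketch below. This is a sketch under assumptions, not a definitive fix: the init script path, the Cyrus install prefix, and the DB environment directory (/var/imap/db) all depend on your installation, so adjust them before use. ctl_cyrusdb -r (Cyrus 2.x) and BerkeleyDB's db_recover are the relevant tools.

```sh
# Stop the Cyrus master so no process holds the Berkeley DB environment open.
# (Path to the init script is an assumption -- use however you stop master.)
/etc/init.d/cyrus stop

# Run recovery on the cyrus databases as the cyrus user.
# /usr/cyrus/bin is an assumed install prefix.
su cyrus -c "/usr/cyrus/bin/ctl_cyrusdb -r"

# Alternatively, BerkeleyDB's own recovery tool against the db environment
# (directory is an assumption; it is wherever Cyrus keeps its __db.* files):
# /usr/local/BerkeleyDB.3.3/bin/db_recover -h /var/imap/db

/etc/init.d/cyrus start
```

Recovery discards the stale lockers and rebuilds the shared memory regions, which is presumably why a plain restart cleared the symptom for us.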
We have not found the root cause, and have not seen the problem since the restart. If someone does find the cause (or a real fix), we would be interested. We are running Linux 2.4.14-xfs, cyrus-2.1.0, and IMP 3.0 / Horde 2.0 / Apache 1.3.22. Hope this helps.

Chris Peck wrote:
>
> We have noticed a serious load-average issue after upgrading to cyrus-2.0.16.
>
> We upgraded to cyrus-2.0.16 over the weekend, going from
> Simeon/Messaging Direct (a cyrus-1.x version) under Solaris 2.6 to
> cyrus-2.0.16 under Solaris 8 (kernel patch 108258-10). We tested it as
> much as possible for about two months before transferring all of our
> mailstore over (around 18,000 users). We were able to test all of our
> e-mail clients except our webmail client (IMP 2.x). I must say that
> the new version/system seemed great (and other than the following
> issue, I still think it's great, especially the sieve capability).
> Anyway, here's our issue:
>
> We brought the system back up Sunday afternoon, and everything ran
> just fine until Monday afternoon (when we "fixed" IMP). We had an
> issue with IMP not logging in correctly, but fixed that by flushing
> the session info IMP stores in its MySQL database. Once IMP was up and
> being used, we started to notice the following:
>
> Our normal load average is between 0.7 and 2 (around 2 when batches of
> email come in). We noticed the load average rising at a tremendous
> rate; soon it was at 20, 30, 40, all the way up to 65! What was
> interesting was that response time for me while logged in to ksh was
> fast, as were connects to POP and IMAP. Sendmail, of course, turned
> off receives (set to stop at a load of 8). After about an hour we
> decided to disable IMP (it runs on a separate box), and the load
> dropped immediately to under 1. Here's some info I managed to collect
> during this scary period:
>
> System info:
> cyrus-2.0.16
> cyrus-sasl-1.5.24
> db-3.3.11
> sendmail-8.12.1
>
> # cat config.status   (for cyrus-imapd)
> #!/bin/sh
> # Generated automatically by configure.
> # Run this file to recreate the current configuration.
> # This directory was configured as follows,
> # on host mail1:
> #
> # ./configure --with-dbdir=/usr/local/BerkeleyDB.3.3
> #   --with-perl=/usr/local/bin/perl --with-openssl=/usr/local/ssl
> #   --with-tcl=/usr/local --with-libwrap=/usr/local
> #   --with-sasl=/usr/local/lib --with-auth=unix --with-idle=poll
> #
>
> Here's some info from the running system during the high-load period:
>
> top showed:
> last pid: 933;  load averages: 34.75, 30.96, 29.70        15:17:59
> 185 processes: 131 sleeping, 50 running, 4 on cpu
> CPU states: 4.7% idle, 83.6% user, 11.2% kernel, 0.5% iowait, 0.0% swap
> Memory: 4096M real, 3234M free, 180M swap in use, 3974M swap free
>
>   PID USERNAME THR PRI NICE SIZE   RES   STATE  TIME  CPU   COMMAND
>   785 cyrus      1  55    0 24M   3824K  run    0:19  1.86% imapd
>   734 cyrus      1  56    0 24M   3768K  run    0:18  1.84% imapd
> (many more imapd's and pop3d's running...)
>
> root@mail1 # truss -p 785
> lwp_mutex_lock(0xFF351E68)   = 0
> lwp_mutex_wakeup(0xFF351E68) = 0
> lwp_mutex_wakeup(0xFF351E68) = 0
> pread64(10, "\0\0\001\0 z10 D\0\0\0BF".., 8192, 1564672) = 8192
> lwp_mutex_wakeup(0xFF351E68) = 0
> lwp_mutex_lock(0xFF351F38)   = 0
> lwp_mutex_wakeup(0xFF351F38) = 0
> pread64(10, "\0\0\001\0 z10 D\0\0\0BF".., 8192, 1564672) = 8192
> lwp_mutex_lock(0xFF351E68)   = 0
> lwp_mutex_wakeup(0xFF351E68) = 0
> lwp_mutex_lock(0xFF351E68)   = 0
> lwp_mutex_wakeup(0xFF351E68) = 0
> lwp_mutex_lock(0xFF351F38)   = 0
> lwp_mutex_wakeup(0xFF351F38) = 0
> pread64(10, "\0\0\001\0 z10 D\0\0\0BF".., 8192, 1564672) = 8192
>
> This went on "forever", until I stopped "master".
>
> We also noticed the following a little earlier (during the high-load
> period) in local6.error:
>
> Jan  7 14:54:33 mail1 imapd[6]: [ID 866726 local6.error] DBERROR db3: 74 lockers
> Jan  7 14:54:34 mail1 imapd[30]: [ID 866726 local6.error] DBERROR db3: 76 lockers
> Jan  7 14:54:39 mail1 imapd[4]: [ID 866726 local6.error] DBERROR db3: Unable to allocate 8287 bytes from mpool shared region: Not enough space
> Jan  7 14:54:39 mail1 imapd[4]: [ID 598274 local6.error] DBERROR: error advancing: Not enough space
> Jan  7 14:54:40 mail1 imapd[29918]: [ID 866726 local6.error] DBERROR db3: Unable to allocate 8287 bytes from mpool shared region: Not enough space
> Jan  7 14:54:40 mail1 imapd[29918]: [ID 598274 local6.error] DBERROR: error advancing: Not enough space
> Jan  7 14:54:40 mail1 imapd[153]: [ID 866726 local6.error] DBERROR db3: Unable to allocate 8287 bytes from mpool shared region: Not enough space
> Jan  7 14:54:40 mail1 imapd[153]: [ID 598274 local6.error] DBERROR: error advancing: Not enough space
> Jan  7 14:54:51 mail1 imapd[151]: [ID 866726 local6.error] DBERROR db3: 86 lockers
>
> ...lots more of the above "DBERROR db3: xx lockers" lines...
>
> Jan  7 15:07:13 mail1 imapd[454]: [ID 898233 local6.error] DBERROR: error closing: DB_INCOMPLETE: Cache flush
> Jan  7 15:07:13 mail1 imapd[454]: [ID 179994 local6.error] DBERROR: error closing mailboxes: cyrusdb error
>
> It seems IMP was able to tickle something on the system. I realize
> that IMP does a login/logout for every IMAP transaction, so it is an
> expensive webmail client. Still, it shouldn't be able to put Cyrus
> into such a weird state (it is just a client).
>
> Any suggestions would be helpful...
>
> Thanks,
> chris
>
> Chris Peck
> Manager of Unix Services
> Information Technology
> The College of William and Mary
> Williamsburg, VA 23187

-- 
Robert Scussel
1024D/BAF70959/0036 B19E 86CE 181D 0912 5FCC 92D8 1EA1 BAF7 0959
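For anyone chasing the same symptom: the "DBERROR db3: NN lockers" messages in the log above make it easy to watch the locker count climb before the mpool region fills up. The following is a small illustrative sketch (not part of either poster's setup; the function name `locker_counts` is made up here) that pulls those counts out of syslog lines so a cron job or monitor could alert when they grow past a threshold.

```python
import re

# Matches Berkeley DB locker-count messages such as:
#   "Jan  7 14:54:51 mail1 imapd[151]: ... DBERROR db3: 86 lockers"
LOCKER_RE = re.compile(r"DBERROR\s+db3:\s+(\d+)\s+lockers")

def locker_counts(lines):
    """Return the locker counts found in an iterable of syslog lines."""
    counts = []
    for line in lines:
        m = LOCKER_RE.search(line)
        if m:
            counts.append(int(m.group(1)))
    return counts

# Sample lines taken from the log excerpt above; the middle line has no
# locker count and is correctly skipped.
sample = [
    "Jan  7 14:54:33 mail1 imapd[6]: [ID 866726 local6.error] DBERROR db3: 74 lockers",
    "Jan  7 14:54:39 mail1 imapd[4]: [ID 866726 local6.error] DBERROR db3: Unable to allocate 8287 bytes",
    "Jan  7 14:54:51 mail1 imapd[151]: [ID 866726 local6.error] DBERROR db3: 86 lockers",
]
print(locker_counts(sample))       # [74, 86]
print(max(locker_counts(sample)))  # 86
```

A steadily rising count (rather than one that rises and falls with load) suggests lockers are being leaked, which matches the observation that only a Cyrus restart cleared the condition.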