Igor Brezac wrote:
On Wed, 2 Jun 2004, Shawn Sivy wrote:
Igor Brezac wrote:
On Tue, 1 Jun 2004, Shawn Sivy wrote:
I'm having all kinds of problems with Cyrus IMAP 2.2.5 on Solaris 9: System I/O errors, imap processes dying, IOERRORs.
Does anyone have suggestions on what could be the cause? Has anyone gotten version 2.2.5 working on Solaris (SPARC) 9?
-Shawn
May 30 17:52:57 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR: opening quota file /var/imap/quota/m/user.macey2: Too many open files
May 30 17:52:57 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR: error fetching user.macey2: cyrusdb error
May 30 17:52:57 cyrus lmtpunix[14954]: [ID 860734 local6.debug] verify_user(user.macey2) failed: System I/O error
May 30 17:53:20 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR: opening quota file /var/imap/quota/s/user.sdhugg: Too many open files
May 30 17:53:20 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR: error fetching user.sdhugg: cyrusdb error
May 30 17:53:20 cyrus lmtpunix[14954]: [ID 860734 local6.debug] verify_user(user.sdhugg) failed: System I/O error
May 30 17:53:40 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR: opening quota file /var/imap/quota/s/user.samuel2: Too many open files
May 30 17:53:40 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR: error fetching user.samuel2: cyrusdb error
May 30 17:53:40 cyrus lmtpunix[14954]: [ID 860734 local6.debug] verify_user(user.samuel2) failed: System I/O error
May 30 17:54:05 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR: opening quota file /var/imap/quota/b/user.balaisi2: Too many open files
May 30 17:54:05 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR: error fetching user.balaisi2: cyrusdb error
May 30 17:54:05 cyrus lmtpunix[14954]: [ID 860734 local6.debug] verify_user(user.balaisi2) failed: System I/O error
Jun 1 08:42:19 cyrus master[21185]: [ID 970914 local6.error] process 21886 exited, signaled to death by 11
Jun 1 08:43:26 cyrus master[21185]: [ID 970914 local6.error] process 20660 exited, signaled to death by 11
Jun 1 08:43:43 cyrus master[21185]: [ID 970914 local6.error] process 20133 exited, signaled to death by 11
Jun 1 08:47:02 cyrus master[21185]: [ID 970914 local6.error] process 23236 exited, signaled to death by 11
Jun 1 08:47:20 cyrus master[21185]: [ID 970914 local6.error] process 23972 exited, signaled to death by 11
Jun 1 08:47:58 cyrus master[21185]: [ID 970914 local6.error] process 23751 exited, signaled to death by 11
Jun 1 08:48:05 cyrus master[21185]: [ID 970914 local6.error] process 21258 exited, signaled to death by 11
Jun 1 08:49:53 cyrus master[21185]: [ID 970914 local6.error] process 19939 exited, signaled to death by 11
Jun 1 08:51:27 cyrus master[21185]: [ID 970914 local6.error] process 24807 exited, signaled to death by 11
Jun 1 08:51:37 cyrus master[21185]: [ID 970914 local6.error] process 23457 exited, signaled to death by 11
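For anyone triaging similar logs, a small tally by process and error type can show whether the failures cluster in one long-lived process. In the excerpts above, every "Too many open files" line comes from lmtpunix[14954], which is consistent with a single process running out of descriptors. The following is a minimal Python sketch, not part of Cyrus; the regular expression and the category names (`EMFILE`, `signal N`, `other`) are assumptions based only on the line shape shown in this thread.

```python
import re
from collections import Counter

# Assumed syslog shape, taken from the excerpts in this thread:
#   "<Mon> <day> <time> <host> <prog>[<pid>]: [ID n facility.level] <message>"
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+\S+\s+(?P<prog>[\w.-]+)\[(?P<pid>\d+)\]:\s+"
    r"\[ID \d+ \S+\]\s+(?P<msg>.*)$"
)

def tally(lines):
    """Count messages per (program, pid, category); skip lines that do not match."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        msg = m.group("msg")
        if "Too many open files" in msg:
            kind = "EMFILE"  # hypothetical label for descriptor exhaustion
        elif "signaled to death by" in msg:
            kind = "signal " + msg.rsplit(" ", 1)[-1]
        else:
            kind = "other"
        counts[(m.group("prog"), m.group("pid"), kind)] += 1
    return counts

# Three lines copied from the log excerpts above.
sample = [
    "May 30 17:52:57 cyrus lmtpunix[14954]: [ID 240394 local6.error] IOERROR: "
    "opening quota file /var/imap/quota/m/user.macey2: Too many open files",
    "May 30 17:52:57 cyrus lmtpunix[14954]: [ID 335833 local6.error] DBERROR: "
    "error fetching user.macey2: cyrusdb error",
    "Jun 1 08:42:19 cyrus master[21185]: [ID 970914 local6.error] "
    "process 21886 exited, signaled to death by 11",
]

if __name__ == "__main__":
    for key, n in sorted(tally(sample).items()):
        print(key, n)
```

Feeding it the full syslog file (for example `tally(open(path))`, with `path` being wherever local6 is logged on your system) gives the same breakdown over a longer period.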
This looks like a Berkeley DB problem, although it could be a file descriptor leak somewhere. Have you applied the Sleepycat 4.2.52 patches (there are two of them, although the first is more important)? Does checkpointing of the Cyrus databases complete successfully (look for ctl_cyrusdb in the syslog)? I start master from 'configdirectory'; otherwise Berkeley DB checkpointing does not work (neither does duplicate db expiration).
I have both patches installed for db 4.2.52. Below are the messages from the log regarding ctl_cyrusdb. Looks like it completed fine. I took your suggestion of starting master from /var/imap.
Jun 2 08:47:05 cyrus master[17927]: [ID 392559 local6.debug] about to exec /local/cyrus/bin/ctl_cyrusdb
Jun 2 08:47:06 cyrus ctl_cyrusdb[17927]: [ID 702911 local6.notice] recovering cyrus databases
Jun 2 08:47:09 cyrus ctl_cyrusdb[17927]: [ID 275131 local6.notice] skiplist: recovered /var/imap/mailboxes.db (85526 records, 6516904 bytes) in 3 seconds
Jun 2 08:47:13 cyrus ctl_cyrusdb[17927]: [ID 127214 local6.notice] done recovering cyrus databases
Jun 2 08:47:13 cyrus master[17935]: [ID 392559 local6.debug] about to exec /local/cyrus/bin/ctl_cyrusdb
Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 702911 local6.notice] checkpointing cyrus databases
Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 578205 local6.debug] archiving database file: /var/imap/mailboxes.db
Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 448116 local6.debug] archiving log file: /var/imap/db/log.0000000008
Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 578205 local6.debug] archiving database file: /var/imap/annotations.db
Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 448116 local6.debug] archiving log file: /var/imap/db/log.0000000008
Jun 2 08:47:13 cyrus ctl_cyrusdb[17935]: [ID 127214 local6.notice] done checkpointing cyrus databases
Have things improved since you restarted master?
You can use pfiles and pmap (and lsof) to check for open files and memory usage. Try running pfiles against a running imapd process and see whether the number of open files increases over time.
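The "watch whether the count grows" check can be scripted instead of running pfiles by hand. Below is a minimal Python sketch that counts the entries under `/proc/<pid>/fd` at intervals. It assumes a procfs that exposes a per-process `fd` directory (Solaris and Linux both do, as far as I know) and that the script runs as a user allowed to read it. The helper names `count_open_fds` and `watch` are hypothetical, not part of any Cyrus or Solaris tool.

```python
import os
import time

def count_open_fds(pid, proc_root="/proc"):
    """Return the number of entries in <proc_root>/<pid>/fd."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))

def watch(pid, samples=5, interval=60.0, proc_root="/proc"):
    """Sample the descriptor count `samples` times, `interval` seconds apart."""
    counts = []
    for i in range(samples):
        counts.append(count_open_fds(pid, proc_root))
        if i + 1 < samples:
            time.sleep(interval)
    return counts

if __name__ == "__main__":
    # Demo against this script's own process; substitute an imapd or lmtpd pid.
    fd_dir = os.path.join("/proc", str(os.getpid()), "fd")
    if os.path.isdir(fd_dir):
        print(watch(os.getpid(), samples=2, interval=0.1))
```

A count that climbs steadily on a long-lived process and never drops back would point toward a descriptor leak; a count that stays flat well below the process limit would point elsewhere.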
What does ulimit -a say?
cyrus# ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 8192
coredump(blocks) unlimited
nofiles(descriptors) 1024
vmemory(kbytes) unlimited
I played around with using skiplist for quota_db last night, but couldn't get it to work; setting/getting quotas just hung (cyradm). Since I moved back to quotalegacy and copied back the previous quota files, I haven't seen the System I/O errors or the "Too many open files" message. However, if it is a descriptor leak as Ken suggested, it may take a while to show itself.
I ran pmap and pfiles against an imapd and an lmtpd process. I'm not sure exactly how to interpret the output, but nothing seemed excessive. The one thing I noticed is that the rlimit of the process is 256 even though the system-wide default limit ... not the max (set in /etc/system) is 1024. I thought I saw that 256 was a 32-bit app limit (at least on Solaris), but I'm not sure. The Cyrus code is currently all compiled as 32-bit, not 64-bit.
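The gap between 256 and 1024 may simply be the difference between a soft and a hard descriptor limit: each process carries both, inherits them from whatever started it, and the value `ulimit -a` prints in an interactive shell need not match what master and its children received. The Python sketch below, using the standard `resource` module, shows how a process can read both values and raise its soft limit up to the hard one. It is an illustration only, not how Cyrus handles limits, and the function names are hypothetical.

```python
import resource

def nofile_limits():
    """Return (soft, hard) RLIMIT_NOFILE for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def raise_soft_to_hard():
    """Try to raise the soft descriptor limit to the hard limit.

    Does nothing when the hard limit is unlimited or already reached.
    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = nofile_limits()
    if hard != resource.RLIM_INFINITY and soft != hard:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
        except (ValueError, OSError):
            pass  # the platform refused; keep the existing limits
    return nofile_limits()

if __name__ == "__main__":
    print("before:", nofile_limits())
    print("after: ", raise_soft_to_hard())
```

The shell equivalent is to run `ulimit -n` with a higher value in the script that starts master, so that imapd and lmtpd inherit the raised soft limit.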
Are you running version 2.2.5 or an earlier version (like 2.2.3) of imap? I may move back to 2.2.3 later today at Ken's suggestion.
-Shawn
---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html