FastMail.FM GUID upgrade process
OK - we're doing our GUID upgrades across the board now. Here's the process we're using:

a) wrote a tool that tied a "DB_File" called "cyrus.sha1s" in each meta directory on the replicas, parsed the index files looking for records with nothing but zeros in the last 16 characters of the GUID, and calculated the sha1 of each of them. This took about 5 days to finish running, but makes the next part a lot quicker!

b) wrote a daemon which runs on each host and allows the following 4 commands:

   *) LOCK user.name.URI%20Escaped.MailboxName
      - lock cyrus.header, cyrus.index, cyrus.expunge in that order using fcntl (our cyrus is built with it).
      - also, if cyrus.sha1s is missing, attempt to fetch it from the replica (but it's OK if this fails, it just means all old GUIDs will cause a re-calculation)

   *) UPGRADE
      - parse each of cyrus.index and cyrus.expunge. If any old-style GUIDs are found, look first in cyrus.sha1s and, failing that, re-calculate from the underlying message files.
      - if any index records need new GUIDs or the old index has not yet been upgraded to version 10, stream the index file through a Cyrus::IndexFile->stream_copy, altering the necessary GUIDs and forcing the output format to version 10 (this module can also be used to downgrade if we ever need to!)
      - leave the new file in cyrus.$item.NEW, but mark internally that the file has been upgraded.

   *) ROLLBACK
      - if any file has been upgraded, unlink() the .NEW file.
      - unlock expunge, index, header (in that order)

   *) COMMIT
      - if any file has been upgraded, rename() the .NEW to the base filename.
      - unlock expunge, index, header (in that order)

c) wrote a controller script which reads the mailbox listing from the master and opens connections to both the master and replica slotd, sending the following commands:

   1) master LOCK mailbox (or die)
   2) replica LOCK mailbox (or master ROLLBACK; die)
   3) master UPGRADE (or replica ROLLBACK; master ROLLBACK; die)
   4) replica UPGRADE (or replica ROLLBACK; master ROLLBACK; die)
   5) replica COMMIT (or replica ROLLBACK; master ROLLBACK; die)
   6) master COMMIT (or master ROLLBACK; die NOISILY!!!)

   The only danger point is (6), where you could wind up with an upgraded replica without the associated upgraded master. You can go ahead and fix those by hand though, assuming you read the NOISILY bit.

d) I think the slashdot crowd would put "Profit!!!" here.

With only a short lock time on each index (most sha1s precalculated) and no need to multi-rewrite any index file, this will run much faster than the alternatives.

I guess I should go clean up the cyrus.sha1s files once it's all finished.

Bron.

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
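The controller sequence in (c) can be sketched roughly as follows. This is only an illustration in Python, not the actual controller script: the FakeSlot class, the SlotError exception and the send() method are hypothetical stand-ins for a connection to the per-host daemon, and the table of steps simply mirrors the numbered list above.

```python
class SlotError(Exception):
    """Raised when a daemon refuses or fails a command (hypothetical)."""


class FakeSlot:
    """Stand-in for a connection to one host's daemon (hypothetical)."""

    def __init__(self, name, fail_on=None):
        self.name = name
        self.fail_on = fail_on   # command that should fail, for experiments
        self.log = []            # commands received, in order

    def send(self, command, mailbox=None):
        self.log.append(command)
        if command == self.fail_on:
            raise SlotError(f"{self.name} {command} failed")


def upgrade_mailbox(master, replica, mailbox):
    """Run steps 1-6; return True on success, re-raise SlotError on failure."""
    # Each step: (host, command, hosts to ROLLBACK if this step fails).
    steps = [
        (master, "LOCK", []),
        (replica, "LOCK", [master]),
        (master, "UPGRADE", [replica, master]),
        (replica, "UPGRADE", [replica, master]),
        (replica, "COMMIT", [replica, master]),
        (master, "COMMIT", [master]),   # replica has already committed here
    ]
    for host, command, rollback in steps:
        try:
            # Only LOCK carries the mailbox name; later commands act on
            # whatever the connection currently has locked.
            host.send(command, mailbox if command == "LOCK" else None)
        except SlotError:
            for other in rollback:
                other.send("ROLLBACK")
            raise
    return True
```

A failure at step 6 is the one case the caller has to report loudly, because the replica's rename() has already happened and only the master can still be rolled back.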
Re: Cyrus IMAPd 2.3.10 Released
On Sun, Nov 04, 2007 at 07:19:26PM -0800, Rich Wales wrote:
> What is the current status of 2.3.10? Right after it was announced
> a couple of weeks ago, I saw some people reporting problems. Are
> there any patches? Or is 2.3.10 still believed to be OK as is?
>
> I'm running 2.3.9 on a FreeBSD 6.2 master and an Ubuntu 7.10 replica
> server setup, and I want to upgrade to 2.3.10 in hopes of getting
> rid of some problems with the sync code intermittently crashing, but
> this is a production system, and I don't feel comfortable upgrading
> to 2.3.10 as long as there are unresolved serious bug reports.

FastMail is running 2.3.10 on all our production systems now, and there are no regressions that I'm aware of. There are still some bugs that also existed in 2.3.9, but our local patches aren't against "bugs" so much as "things that don't work how we would prefer".

The two big "new" bugs that we found in the 2.3.10pre3 codebase when we rolled that out into production had their fixes accepted back upstream before the final 2.3.10 was cut.

That said, we're seeing slightly more skiplist corruption than previously and have not yet determined the cause. We're backing up a text-dump of our mailboxes.db files once per hour just in case - but it's highly probable that the issues are due to something odd we're doing with long running cyr_dbtool processes.

Bron.
Re: Deleting top-level mailbox with 'delete_mode: delayed'
On Fri, Nov 02, 2007 at 01:15:37PM -0400, Brian Wong wrote:
> On Nov 2, 2007 12:39 PM, Rudy Gevaert <[EMAIL PROTECTED]> wrote:
> >
> > Brian Wong wrote:
> > > I was testing out Cyrus 2.3.10 and realized that when I set the option
> > >
> > > delete_mode: delayed
> > >
> > > I can not delete top-level mailboxes.
> > >
> > > localhost.localdomain> lm
> > > localhost.localdomain> cm user.bwong
> > > localhost.localdomain> sam user.bwong c
> > > localhost.localdomain> dm user.bwong
> > > deletemailbox: Operation is not supported on mailbox
> > > localhost.localdomain> lm
> > > user.bwong (\HasNoChildren)
> > >
> > > Disabling the delayed delete gives expected results. The mailbox is
> > > deleted as normal. Anyone else confirm this?
> >
> > I'm just back from holiday (and only catching up on mail). I always set
> > the 'x' permission. Could you try that? If that doesn't work, I'll try
> > to delete a top level mailbox on Monday (I'm running 2.3.10 in test).
> >
> > Rudy
> >
>
> localhost.localdomain> lam user.bwong
> bwong lrswipkxtecda
> admin kxc
> localhost.localdomain> dm user.bwong
> deletemailbox: Operation is not supported on mailbox
>
> I think if I did not have the right to delete the mailbox, I would get
> a "Permission Denied" instead of the error I am receiving. Let me know
> what you find when you try it. I feel that if this is really a bug it
> would have been caught before release, but then again I can't think of
> anything atypical with my setup that would cause this problem.

It's almost certainly caused by the code that checks if you're renaming a "top level mailbox" for a user and special cases it in all sorts of ways. I never liked that code much!

My solution was to make DELETED.user.bwong.46A12345 (or similar) also be considered to be an "INBOX" so it was treated as a user rename. This seems not to be working in your environment, and I'm really not sure why. I don't see anything specially different in our config.

fast_rename: yes, but that won't work for you anyway because it's using a not-yet-perfect patch at our end.

All our mailbox deletes are done as the admin user. It won't work if you're not a bona-fide admin (not just a user called admin who happens to have an ACL). Check the 'admins:' parameter in your imapd.conf.

Regards,

Bron. (P.S. your username is scarily similar to mine!)
Re: Cyrus IMAPd 2.3.10 Released
On Tue, Nov 06, 2007 at 08:57:29AM +0100, Rudy Gevaert wrote:
> Bron Gondwana wrote:
>> On Sun, Nov 04, 2007 at 07:19:26PM -0800, Rich Wales wrote:
>>> What is the current status of 2.3.10? Right after it was announced
>>> a couple of weeks ago, I saw some people reporting problems. Are
>>> there any patches? Or is 2.3.10 still believed to be OK as is?
>>>
>>> I'm running 2.3.9 on a FreeBSD 6.2 master and an Ubuntu 7.10 replica
>>> server setup, and I want to upgrade to 2.3.10 in hopes of getting
>>> rid of some problems with the sync code intermittently crashing, but
>>> this is a production system, and I don't feel comfortable upgrading
>>> to 2.3.10 as long as there are unresolved serious bug reports.
>> FastMail is running 2.3.10 on all our production systems now, and
>> there are no regressions that I'm aware of. There are still some
>> bugs that also existed in 2.3.9, but we aren't patching against
>> any "bugs" rather than "things that don't work how we would prefer".
>
> I have upgraded three (of seven) mailstores yesterday. Upgrade went
> without problems. Regeneration of the guid takes a long time. It's not
> finished yet, and it's been running for 12 hours.

That's why I pre-calculated all the sha1s I could and wrote my own index upgrader :)

Yeah, it takes a while!

Bron.
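As a rough illustration of why pre-calculation helps, here is a Python sketch of the lookup order described for the upgrade: keep a new-style GUID as is, otherwise try the pre-computed cache, and only then hash the message file. The "last 16 characters are zeros" test comes from the GUID upgrade description earlier in this archive; the cache being keyed by message path and the read_file callback are assumptions made for the example, not details of the real tool.

```python
import hashlib


def is_old_style(guid_hex):
    """Old-style GUIDs have nothing but zeros in their last 16 characters."""
    return guid_hex[-16:] == "0" * 16


def upgraded_guid(guid_hex, message_path, sha1_cache, read_file):
    """Return the GUID an index record should carry after the upgrade.

    sha1_cache: dict of pre-computed digests (keyed by path here; the
                real key format is an assumption).
    read_file:  callable returning the message bytes for a path.
    """
    if not is_old_style(guid_hex):
        return guid_hex                      # already a sha1-based GUID
    cached = sha1_cache.get(message_path)
    if cached is not None:
        return cached                        # cheap: no message I/O needed
    # Expensive fallback: read and hash the underlying message file.
    return hashlib.sha1(read_file(message_path)).hexdigest()
```

Only the last branch touches the message file, which is the slow part when it has to happen for every record while the mailbox is locked.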
Re: LARGE single-system Cyrus installs?
On Thu, Nov 08, 2007 at 10:18:04AM -0800, Vincent Fox wrote:
> Our latest line of investigation goes back to the Fastmail suggestion,
> simply have multiple Cyrus binary instances on a system. Each running
> its own config and with its own ZFS filesystems out of the pool to use.
> Since we can bring up a virtual interface for each instance we won't even
> have to bother with using separate port numbers, etc.

Also, virtual interfaces mean you can move an instance without having to tell anyone else about it (but it sounds like you're going with an "all eggs in one basket" approach anyway)

Bron.
Re: LARGE single-system Cyrus installs?
On Fri, Nov 09, 2007 at 01:28:05PM -0500, John Madden wrote:
> On Fri, 2007-11-09 at 19:10 +0100, Jure Pečar wrote:
> > I'm still on linux and was thinking a lot about trying out solaris 10,
> > but stories like yours will make me think again about that ...
>
> Agreed -- with the things I see from the Solaris (and zfs) and Sparc
> hardware in general, my money's still on Linux/LVM/Reiser/ext3.
>
> 250,000 mailboxes, 1,000 concurrent users, 60 million emails, 500k
> deliveries/day. For us, backups are the worst thing, followed by
> reiserfs's use of BLK, followed by the need to use a ton of disks to
> keep up with the i/o.

For us backups are hardly a blip on the radar :) The joy of writing your own custom backup system that knows more about Cyrus internals than just about anything else.

It starts with some stat calls, and if any of the cyrus.header, cyrus.index or cyrus.expunge files have changed then it will lock them all, then stream them all to the backup server. The backup server then parses them and decides (based on GUID) if there are any data files it hasn't yet fetched. If so, it fetches them and checks the sha1 of the fetched file against the GUID.

The whole thing takes a couple of seconds per user and requires less IO than even using direct IMAP calls would.

Now our big IO user is cyr_expire. We run it once per week, and it's a killer. I'd be tempted to run it a lot more frequently if it didn't have such a high baseline IO cost on top of the actual message unlinks (though the unlinks are the real killer)

Bron ( and the BKL, *sigh*. I just installed an external RAID unit with 8x1TB drives in it for data. That's 6TB/300GB == 20 data partitions plus 20 meta partitions to go with it. That's a lot of BKL! )
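The GUID-based part of that backup flow might look something like the following Python sketch. It assumes, as the message says, that a GUID is the sha1 of the message file; the function names and the idea of passing GUIDs around as 40-character hex strings are choices made for the example only.

```python
import hashlib


def files_to_fetch(index_guids, already_stored):
    """GUIDs named in the streamed index that the backup does not hold yet."""
    return [guid for guid in index_guids if guid not in already_stored]


def fetched_file_is_intact(message_bytes, guid_hex):
    """Check a fetched message file against the GUID from the index."""
    return hashlib.sha1(message_bytes).hexdigest() == guid_hex.lower()
```

Because the decision is made from the index alone, unchanged messages cost nothing beyond the index parse, and a corrupted transfer shows up as a digest mismatch rather than as a silently bad backup.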
Re: How many people to admin a Cyrus system?
On Fri, Nov 09, 2007 at 08:24:00AM -0500, Adam Tauno Williams wrote:
> > > How many
> > > and what sort of people does it take to maintain a system such as
> > > this? I need a good argument for hiring a replacement for me.
> > At a minimum you want 1 qualified person and someone cross-trained
> > as a backup, so that person can reasonably enough have vacations.
> > Any decent sysadmin should be able to MAINTAIN such a service
> > I don't think actually programming skills should be primary.
>
> Agree. I maintain a Cyrus system. And on most days that doesn't even
> involve touching it. Any reasonably proficient person with UNIX skills
> should be able to take over Cyrus administration given they are willing
> to do some reading.

I maintain a Cyrus system and it's taken over my life! Yikes.

Summary for this week:

* crc32 for indexes humming along in the background.

* getting more skiplist corruptions - going to have to post about this. We lost 4 mailboxes.db files this week, 3 during controlled failover, one during the night when nobody was working on it. Suspect new ACL feature on the web interface which allows more frequent updates is causing issues.

* the one buggy skiplist I actually still have a copy of, the "logstart" value in the header is wrong, causing recovery to fail with only a few of the records still reachable because it hits another INORDER record rather than the ADD record and drops out. I've got the monitoring system set up to let me know if it finds any other skiplist errors and take a copy of the offending file.

> > I have been
> > doing sysadmin work since 1989 and the actual programming work I've
> > done in that time has been maybe 2% of it. If you have a lot of custom
> > interface stuff to your campus systems maybe you need more programmer
> > skills. As a completely inappropriate generalization, former engineers
> > and mathematicians also make good sysadmins because they have the mindset
> > and the skills for problem decomposition and trouble-shooting.
>
> Yep. Agree there.

Sysadmin has always been a fraction of my work because I tend to do a lot of "glue" programming to abstract away anything that's sysadmin work. My first really major project (after converting us from CVS to Subversion) was making all the servers install automatically from PXE boot and the configurations all set themselves up with "make install" from the Subversion repository, so that most everyday sysadmin is now automated - just update the master config file, roll it out, restart affected services.

So day to day we need less than one sysadmin, but of course incident response is unpredictable, and having two good sysadmins (Rob and I share sysadmin responsibility here) available is very handy. Both for the "you can let one of them have a holiday" point of view and the "two heads are better than one" ability to work past the other's mental blocks and avoid getting stuck in a rut trying to solve problems.

Bron.
Skiplist stuff
On Sun, Nov 11, 2007 at 02:31:45PM +1100, Bron Gondwana wrote:
> * the one buggy skiplist I actually still have a copy of, the "logstart"
> value in the header is wrong, causing recovery to fail with only a few
> of the records still reachable because it hits another INORDER record
> rather than the ADD record and drops out. I've got the monitoring
> system set up to let me know if it finds any other skiplist errors and
> take a copy of the offending file.

Ok, so attached is the patch that deals with this one particular case and adds a bit more debugging as well. It will log an ERROR if it sees the case that would previously cause a total bailout.

NOTE - there's still heaps of "add this value read from file to this mmap base pointer and dereference with impunity" scattered through this code, which is the sort of accident waiting to happen I've been trying to remove from everything else along with the CRC32 stuff I keep promising.

I also wrote a much bigger patch which you don't get to see yet because someone might be silly enough to try and use it. It does some variable renaming, refactoring, etc. At least there's some signed vs unsigned fuzziness I'm still not happy with here, and a naming policy that made it clearer when a variable contained network-byte-order and when it contained host-byte-order would be peachy too. My big patch does that. Sadly it also infinite loops during recovery, which suggests I did something pretty stupid in it somewhere! Too late to debug that tonight.

Anyway - here it is. A "recovery()" that copes if the logstart parameter in the database header is wrong. No, I don't have a clue how that happened unless lseek() lied. Maybe it sometimes lies, I don't know. I'll be writing a test case for that soon too!

Bron.

Index: cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c
===================================================================
--- cyrus-imapd-2.3.10.orig/lib/cyrusdb_skiplist.c 2007-11-11 07:44:25.0 -0500
+++ cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c 2007-11-11 07:59:47.0 -0500
@@ -1206,7 +1206,7 @@
     lseek(tp->syncfd, tp->logend, SEEK_SET);
     r = retry_writev(tp->syncfd, iov, num_iov);
     if (r < 0) {
-        syslog(LOG_ERR, "DBERROR: retry_writev(): %m");
+        syslog(LOG_ERR, "DBERROR: retry_writev() %s: %m", db->fname);
         myabort(db, tp);
         return CYRUSDB_IOERROR;
     }
@@ -1926,20 +1926,13 @@
 
     /* reset the data that was written INORDER by the last checkpoint */
     offset = DUMMY_OFFSET(db) + DUMMY_SIZE(db);
-    while (!r && (offset < (bit32) db->logstart)) {
+    while (!r && (offset < db->map_size)
+              && (TYPE(db->map_base+offset) == INORDER)) {
         ptr = db->map_base + offset;
         offsetnet = htonl(offset);
 
         db->listsize++;
 
-        /* make sure this is INORDER */
-        if (TYPE(ptr) != INORDER) {
-            syslog(LOG_ERR, "DBERROR: skiplist recovery: %04X should be INORDER",
-                   offset);
-            r = CYRUSDB_IOERROR;
-            continue;
-        }
-
         /* xxx check \0 fill on key */
         /* xxx check \0 fill on data */
 
@@ -1980,6 +1973,11 @@
         }
     }
 
+    if (offset != db->logstart) {
+        syslog(LOG_ERR, "DBERROR: recovery logstart %s: %04X not %04X",
+               db->fname, offset, db->logstart);
+    }
+
     /* zero out the remaining pointers */
     if (!r) {
         for (i = 0; !r && i < db->maxlevel; i++) {
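The idea behind the changed loop can be shown with a toy model in Python. This is not the skiplist file format: records are reduced to an offset, a type and a length held in a dict, and the numeric type codes are assumed values used only for this illustration. What it mirrors from the patch is the approach: rather than trusting the header's logstart, walk records for as long as they are INORDER, then compare where the walk stopped with what the header claimed.

```python
# Record type codes; the values are assumed here for illustration.
INORDER, ADD, DELETE, COMMIT = 1, 2, 4, 255


def end_of_inorder_run(records, start, map_size):
    """Walk records from `start` while they are INORDER.

    records maps a file offset to (record_type, record_length).
    Returns the offset of the first record that is not INORDER, or the
    first offset at or past map_size.
    """
    offset = start
    while offset < map_size:
        rec_type, rec_len = records.get(offset, (None, 0))
        if rec_type != INORDER or rec_len <= 0:
            break
        offset += rec_len
    return offset


def check_logstart(records, start, map_size, header_logstart):
    """Return (actual_log_start, header_was_correct)."""
    actual = end_of_inorder_run(records, start, map_size)
    return actual, actual == header_logstart
```

With a header value that is too small, a loop bounded by logstart stops early and the log replay then trips over an INORDER record, which is the failure described above; the type-driven walk finds the real boundary and the mismatch is only logged.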
Re: Replication: does it work in both directions?
On Sat, Nov 10, 2007 at 07:51:15PM -0800, Rich Wales wrote:
> I'm using replication on a 2.3.9 system.
>
> I know that if changes happen on the master system, they are propagated
> automatically to the replica system.
>
> But what happens if I make a change on the replica (e.g., by setting up
> an account to access its mail through the replica's IMAP server)? I
> tried this just now, and the change is NOT propagating from the replica
> to the master.
>
> What do I need to do in order for changes made on the replica to get
> copied over to the master? Or is this simply impossible?

Impossible. You don't do this.

What you can do (the simple case of what we do) is set up two Cyrus instances on each machine, replicating to each other, and set up user accounts on one or the other, so you can get full use of both machines.

Regards,

Bron.
Re: Replication: sync_client -r dies
On Sat, Nov 10, 2007 at 07:09:53PM -0800, Rich Wales wrote:
> After about a week of having synchronization running perfectly in my
> 2.3.9 system, I finally got another bailout incident with sync_client
> on my master server.
>
> This happened just after I shut down my replica server (to move it to
> a different location). About two minutes after the replica went down,
> sync_client on the master said "Error in do_sync(): bailing out!" with
> no other messages of any kind.
>
> It seems to me that the replication code ought to be a bit more robust
> than this when a replica goes down or loses network connectivity. Is
> the 2.3.10 code any better than 2.3.9 in the way this kind of situation
> is handled?

I believe David Carter has been working on some stuff for this which is lined up to go in soon. We just have a monitor_sync script that runs every 10 minutes from cron and can recover from this and a variety of other interesting situations.

Yeah - it would be nice to have a way to tell the master "going down now, be back later".

Bron.
Multiple skiplist bugs found, patches attached
On Mon, Nov 12, 2007 at 12:34:34AM +1100, Bron Gondwana wrote:
> Anyway - here it is. A "recovery()" that copes if the logstart
> parameter in the database header is wrong. No, I don't have a
> clue how that happened unless lseek() lied. Maybe it sometimes
> lies, I don't know. I'll be writing a test case for that soon
> too!

I have some more suspicions now, but I wrote it all up in the patch header, so here's the bugfixes only patch, a "robustness" extras patch and the tool I used for testing.

Ken, I know you've done some other work on the file changing types. I'd like to be even more aggressive and convert just about everything to bit32 and also rename some variables, but I restricted myself in this to only fixing the most ugly case: offset = htonl(offset).

These patches are all against 2.3.10 (in this order), and may need some fuzz fixing to apply against your latest CVS thanks to those changes - sorry I haven't done that, but it's getting on 1am for me, and I've just finished doing a lot of testing and paring these down to simple and clear patches that don't touch more than they need to fix the issues.

cyrus-skiplist-bugfixes-2.3.10.diff:

Ken, please review this patch and consider it for pushing out in the next release, preferably soon. There really are a lot of issues I found reviewing the code, and even more so with the attached tool that can be used to "hammer" a file with all sorts of interesting requests. There are some really nasty skiplist corruptions available if even one process is ever killed or segfaults half way through a transaction and the first operation to touch said file after this is a write.

cyrus-skiplist-robustify-2.3.10.diff:

If you have been running skiplist on your systems and suspect that you may have corrupted databases, you probably want this. It adds extra robustness and fixupability to recovery(). It's still not going to be crash proof reading line-noise, but it detects and fixes all the corruptions I have personally seen. (I say this having tested it on all the bogus DBs I had, eg:)

DBERROR: skiplist recovery /tmp/mb2/mailboxes.db.1194508562:
 -> 8E6DD8 should be ADD or DELETE

became:

skiplist recovery /tmp/mb2/mailboxes.db.1194508562:
 -> skipped 136 bytes of zeros at 8E6DD8
skiplist: recovered /tmp/mb2/mailboxes.db.1194508562
 -> (44594 records, 9547108 bytes) in 1 second
skiplist: checkpointed /tmp/mb2/mailboxes.db.1194508562
 -> (44594 records, 5350556 bytes) in 1 second

and:

skiplist recovery /tmp/mailboxes.db.fail:
 -> 288C00 should be ADD or DELETE

became:

skiplist recovery /tmp/mailboxes.db.fail:
 -> incorrect logstart 288BD8 changed to 356F94
skiplist: recovered /tmp/mailboxes.db.fail
 -> (28811 records, 4109664 bytes) in 1 second

In both cases a second run of the same command (I used cyr_dbtool 'show') came up clean - no issues remaining and no log entries.

cyrus-hammer-skiplist-2.3.10.diff:

I used this command to hammer a skiplist database like so:

sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
sudo -u cyrus ./hammer_skiplist -n /tmp/hammer.db &
...

I've turned down the "forget about this transaction" option to a lot less common than my original tests. It should still fire a couple of times per hammer, but it creates log entries even on the fully patched code (rolling back incomplete txn), so I didn't want to spam the logs.

Enjoy,

Bron.

SKIPLIST bugfixes

In the past we have had issues with bugs in skiplist on seen files, and we truncated files at the offset with the issue since they were only seen data.

Lately, we have had more tools updating mailboxes.db more often, and have lost multiple mailboxes.db files.

There are two detectable issues:

1) incorrect header "logstart" values causing recovery to fail with either unexpected ADD/DELETE records or unexpected INORDER records depending which side of the correct location the logstart value is wrong.

2) a bunch of zero bytes between transactions in the log section.

The attached patch fixes the following issues:

a) recovery failed to update db->map_base if it truncated a partial transaction. This reliably recreated the zero bytes issues above by having the next store command lseek to a location past the new end of the file, and hence fill the remainder with blanks.

b) the logic in the "delete" handler for detecting "no record exists" (ptr == db->map_base) was backwards, meaning t
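The mechanism in (a), a write at a stale end-of-file offset after the file has been truncated, is plain file-system behaviour and can be reproduced outside Cyrus. The Python sketch below is only a demonstration of that behaviour on a temporary file; the byte counts and the function name are made up, and nothing in it touches skiplist code.

```python
import os
import tempfile


def demo_zero_gap():
    """Show how a write at a stale offset leaves a run of zero bytes."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"A" * 100)       # committed data
        os.write(fd, b"B" * 50)        # partial transaction
        stale_end = 150                # a writer's remembered end of file
        os.ftruncate(fd, 100)          # recovery drops the partial txn
        os.lseek(fd, stale_end, os.SEEK_SET)   # stale offset, now past EOF
        os.write(fd, b"C" * 10)        # next store lands after a gap
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 200)
    finally:
        os.close(fd)
        os.unlink(path)
```

The returned bytes are the 100 committed bytes, then 50 zero bytes where the truncated transaction used to be, then the new record, which matches the "bunch of zero bytes between transactions" symptom in (2).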
Re: Replication: does it work in both directions?
On Sun, Nov 11, 2007 at 08:41:04PM -0800, Rich Wales wrote:
> Earlier, I wrote:
>
> >> What do I need to do in order for changes made on the replica
> >> to get copied over to the master?
>
> Bron Gondwana replied:
>
> > Impossible. You don't do this. What you can do (the simple
> > case of what we do) is set up two Cyrus instances on each
> > machine, replicating to each other, and set up user accounts
> > on one or the other, so you can get full use of both machines.
>
> I note that sync_client can take a list of mailboxes on the command
> line. Does this define (and limit) the set of mailboxes that are
> replicated? If a mailbox is listed in the command line, are sub-
> mailboxes replicated too?

It doesn't work like that. Rolling replication gets events from actions on mailboxes (lmtp deliver, imapd updates, etc) and logs them - then the sync_client process running in the background reads that log file and uses the actions to know what things to check and sync with the sync_server on your replica.

> My environment (family network) only has half a dozen users, and the
> set of users changes only rarely. Suppose I do the following:
>
> (1) I divide my users into two groups -- each group assigned to one
> of my two Cyrus servers as the master for those users.
>
> (2) The sync_client line in cyrus.conf for each server lists the
> mailboxes for the users assigned to that server as master. Each
> user is listed in the sync_client command line of only one server.
>
> (3) Each server is configured (via the sync_... lines in imapd.conf)
> to sync to the other server.
>
> (4) Both servers would be running sync_server.
>
> So, I would have replication set up going both directions between my
> two servers, but the sets of users handled in each direction would be
> disjoint. Each user would be assigned to one IMAP server (the master
> for their mailbox collection), and the other server would be their
> replica and act as their backup.
>
> Would this work?

You are evil. While I can't see any particular reason why it wouldn't, I'm still scared. I wouldn't be game to mess with that. You'd really REALLY want to be sure that your email delivery and IMAP connections only happened to the approved master for each user or you'd get a bad case of split brain.

> Remember, again, that I'm talking about a small installation. Clearly,
> a scheme requiring every user's mailbox to be explicitly listed in one
> or the other server's sync_client line is not going to scale to a large
> setup with hundreds or thousands of users; I understand this.
>
> If this idea of doing two-way partial replication with a single Cyrus
> instance on each server will in fact work, should I use the same value
> for sync_machineid on both servers? Or should they be different?

If you use 2.3.10 then it really doesn't matter at all. It's a relic of worser UUIDs (now sha1 based GUIDs) that nobody wants to talk about ever again.

That said, the code probably still requires you set it, and I'd set them differently just for the "why not" factor. Maybe something deep in the code still cares and I can't be arsed checking right now, I've been reading the skiplist code for days, and I'm sure it will give me nightmares when I calm down enough to sleep!

Bron.
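To make the event-driven nature of rolling replication concrete, here is a toy model in Python. It does not reproduce the real sync log format or sync_client's behaviour: the log is just a list of (action, mailbox) tuples and the action names are invented for the example. The point it illustrates is that the work list comes from what has been logged, not from a list of mailboxes given up front.

```python
def record_event(log, action, mailbox):
    """A delivery or IMAP update appends an entry to the log (toy model)."""
    log.append((action, mailbox))


def mailboxes_to_sync(log):
    """Drain the log; return each touched mailbox once, in first-seen order."""
    touched = []
    while log:
        _action, mailbox = log.pop(0)
        if mailbox not in touched:
            touched.append(mailbox)
    return touched
```

In this model a mailbox that sees no activity on a given server never shows up in that server's work list, which is why a static mailbox list does not fit the rolling mode.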
Re: Renaming hierarchies from the bottom
On Mon, Nov 12, 2007 at 11:38:54AM +, Ian G Batten wrote:
> A common scenario when we are moving users between partitions is to
> want to move their archived mail first, because there's less risk of
> disrupting them, and then their top-level mailbox last.

For a different approach - we use sync_client and a hacked-together config directory to replicate all their email as close as possible to up-to-date, then lock down all incoming connections and run it again. Finally, we check that all the data got there OK (we have a tool for comparing two copies via IMAP) and then update the delivery paths and proxies, and up they come again.

Any sort of rename where they get presented with their folders slowly disappearing sounds fraught with confusion to me!

(renaming of INBOX is special in lots of ways)

That said, no comments on the general merits of your approach. The important thing is to check that the data transferred correctly :)

Bron.
Re: Replication: does it work in both directions?
On Mon, 12 Nov 2007 09:37:12 -0800, "Rich Wales" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
>
> > It doesn't work like that. Rolling replication gets events from
> > actions on mailboxes (lmtp deliver, imapd updates, etc) and logs
> > them - then the sync_client process running in the background
> > reads that log file and uses the actions to know what things to
> > check and sync with the sync_server on your replica.
>
> OK, that's the answer I needed to hear. If a mailbox list on the
> command line is not compatible with rolling replication, then I'll
> simply not have a choice but to set up two Cyrus instances if I
> want to spread my users across different IMAP servers.

It works fine as a one-off, but not for rolling, because rolling reads the log.

That said, only users who have had any actions on that server will create log entries. I am quite tempted to test your theoretical layout at some point, but right now I'm heartily sick of playing with Cyrus and am going to take a break and do something totally different.

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
Re: Multiple skiplist bugs found, patches attached
On Tue, 13 Nov 2007 09:12:18 +0100 (CET), "Simon Matter" <[EMAIL PROTECTED]> said:
> > On Mon, Nov 12, 2007 at 12:34:34AM +1100, Bron Gondwana wrote:
> >> Anyway - here it is. A "recovery()" that copes if the logstart
> >> parameter in the database header is wrong. No, I don't have a
> >> clue how that happened unless lseek() lied. Maybe it sometimes
> >> lies, I don't know. I'll be writing a test case for that soon
> >> too!
> >
> > I have some more suspicions now, but I wrote it all up in the
> > patch header, so here's the bugfixes only patch, a "robustness"
> > extras patch and the tool I used for testing.
> >
> > Ken, I know you've done some other work on the file changing
> > types. I'd like to be even more aggressive and convert just
> > about everything to bit32 and also rename some variables, but
> > I restricted myself in this to only fixing the most ugly case:
> > offset = htonl(offset).
> >
> > These patches are all against 2.3.10 (in this order), and may
> > need some fuzz fixing to apply against your latest CVS thanks
> > to those changes - sorry I haven't done that, but it's getting
> > on 1am for me, and I've just finished doing a lot of testing
> > and paring these down to simple and clear patches that don't
> > touch more than they need to fix the issues.
> >
> >
> > cyrus-skiplist-bugfixes-2.3.10.diff:
> >
> >
> > cyrus-skiplist-robustify-2.3.10.diff:
> >
>
> Hi Bron,
>
> I didn't have much trouble with skiplist over the years and it has been a
> blessing since moving away from BDB. But I did have a few issues with
> broken skiplist files so your patches are very welcome. I have included
> the patches in my private rpm packages to try how they work. Do you
> recommend both for general consumption?

They've been running for 24 hours on all our production systems with no ill effects :)

Seriously - yes, I do. They are quite short, and they're the culmination of about 3 days of pretty heavy work over the weekend and Monday after we lost a mailboxes.db on our busiest store to one of these bugs (my wife and kids were getting ready to kill me for neglecting them towards the end, I'm sure!)

I built multiple different patches and tested them over that time. I also wrote a Perl module that can read skiplist files natively and tested some things with that as well. These couple of patches I have posted are the best bits of those distilled down into the simplest and clearest small set of changes. They've been hit pretty hard with the hammer scripts.

I've also got another patch which I'll attach here that I wrote today which re-tunes the "how often to checkpoint" calculation. I want our mailboxes.db files especially to checkpoint more frequently, as that will make them less "seeky" - which will help with cachelines at least. We have enough memory (and always plenty free) that I'm sure every page is hot in cache within a few minutes.

The seekiness is mainly an issue with clients doing "LIST", which our web interface does at login, so we want it to be as quick as possible.

As for seen files - well, they tend to be small and frequently updated, so they'll just checkpoint about 4 times as often now. Will save a tiny bit of disk space but more interestingly reduce the memory footprint to keep them all in cache.

Bron.
--
Bron Gondwana
[EMAIL PROTECTED]

Skiplist tuning

With random changes to a mailboxes.db file, it could be nearly 100% random seeks before it recompressed. A seen file would need to reach 16kb before even considering re-compressing, with a real data length of just a couple of hundred bytes.

This patch reduces the limits to:

  4kb overhead
  120% rather than 200% of current "sorted" size.

Index: cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c
===================================================================
--- cyrus-imapd-2.3.10.orig/lib/cyrusdb_skiplist.c 2007-11-12 23:53:34.0 -0500
+++ cyrus-imapd-2.3.10/lib/cyrusdb_skiplist.c 2007-11-12 23:57:38.0 -0500
@@ -302,7 +302,7 @@
     SKIPLIST_VERSION = 1,
     SKIPLIST_VERSION_MINOR = 2,
     SKIPLIST_MAXLEVEL = 20,
-    SKIPLIST_MINREWRITE = 16834 /* don't rewrite logs smaller than this */
+    SKIPLIST_MINREWRITE = 4096 /* don't rewrite logs smaller than this */
 };

 #define BIT32_MAX 4294967295U
@@ -1392,8 +1392,8 @@
     }

  done:
-    /* consider checkpointing */
-    if (!r && tid->logend > (2 * db->logstart + SKIPLIST_MINREWRITE)) {
+    /* consider checkpointing (journal is 20% of data length) */
+    if (!r && tid->logend > (12 * db->logstart / 10 + SKIPLIST_MINREWRITE)) {
         r = mycheckpoint(db, 1);
     }
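As a rough illustration of what the tuning patch changes, the old and new checkpoint conditions can be compared side by side. This is only a sketch: the function names are made up for this example, and 64-bit arithmetic is used here purely to sidestep overflow, which is an assumption of the sketch rather than something the real code is known to do. The constants are the ones shown in the diff.

```c
#include <stdint.h>

enum {
    OLD_MINREWRITE = 16834, /* value as written in the 2.3.10 source */
    NEW_MINREWRITE = 4096   /* value after the tuning patch */
};

/* Old rule: checkpoint once the file passes 200% of the sorted size
 * (logstart) plus the fixed overhead. */
int should_checkpoint_old(uint32_t logstart, uint32_t logend)
{
    return (uint64_t)logend > 2 * (uint64_t)logstart + OLD_MINREWRITE;
}

/* New rule: checkpoint once the file passes 120% of the sorted size
 * plus a smaller fixed overhead. */
int should_checkpoint_new(uint32_t logstart, uint32_t logend)
{
    return (uint64_t)logend > 12 * (uint64_t)logstart / 10 + NEW_MINREWRITE;
}
```

For a seen file whose checkpointed size is 200 bytes, the old rule waits until the file passes 17234 bytes, while the new rule fires once it passes 4336 bytes.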
Re: Deleting top-level mailbox with 'delete_mode: delayed'
On Tue, Nov 13, 2007 at 01:11:49PM +0100, Simon Matter wrote:
> > expunge_mode: delayed
> > delete_mode: delayed
>
> I've just tried a batch delete of mailboxes and hit the same wall.
>
> Mailbox deletion doesn't work anymore with 2.3.10 if "delete_mode:
> delayed". If "delete_mode: immediate" it works, but with delayed I get
> "deletemailbox: Operation is not supported on mailbox".
>
> Did I miss something? Does anybody have a patch?

I have "delete_mode: immediate" on the replica and "delete_mode: delayed" on the master.

It doesn't make any sense for the replica to do a delayed delete, as the master is already generating a "RENAME" event (well, two MAILBOX events actually, let's not get picky) with the old and new names for the mailboxes. The replica will be doing a rename rather than a delete in the most frequent case anyway.

If you're unlucky and the two MAILBOXES calls get split up (probably some other event on the first mailbox from earlier being run after you've done the rename - don't you love concurrency) then it will issue a DELETE on the replica. If that causes a rename instead you will wind up with _TWO_ deleted folders with very similar names, one containing all the messages that were still on the replica and one containing all the messages in the copy on the master. Better just to re-copy those ones from the master in the unlucky case if you ask me.

Bron.
Re: LARGE single-system Cyrus installs?
On Tue, Nov 13, 2007 at 10:24:22AM +, David Carter wrote:
> On Sun, 11 Nov 2007, Bron Gondwana wrote:
>
> >> 250,000 mailboxes, 1,000 concurrent users, 60 million emails, 500k
> >> deliveries/day. For us, backups are the worst thing, followed by
> >> reiserfs's use of BLK, followed by the need to use a ton of disks to
> >> keep up with the i/o.
> >
> > For us backups are hardly a blip on the radar :) The joy of writing
> > your own custom backup system that knows more about Cyrus internals than
> > just about anything else. It starts with some stat calls, and if any of
> > the cyrus.header, cyrus.index or cyrus.expunge files have changed then
> > it will lock them all then stream them all to the backup server.
>
> Cyrus is pretty ideal for fast incremental updates to a backup system:
> hence replication. You shouldn't need to lock anything with delayed
> expunge, delayed delete and fast rename in place.

If you're planning to lift a consistent copy of a .index file, you need to lock it for the duration of reading it (read lock at least).

Yeah - replication is one way to do it. We happen to read from the masters at the moment, but it would be pretty trivial to switch to using the replicas (change a $Store->MasterSlot() to $Store->ReplicaSlot() at one place in the code in fact) if we wanted to.

But since I would like a consistent snapshot of the mailbox state, I lock the cyrus.header and then the cyrus.index and then (if it's there) the cyrus.expunge. That means no sneaky process could (for example) delete the mailbox and create another one with the same name while I was busy downloading the last file - giving me totally bogus data.

This is particularly important because I store things by mailbox uniqueid rather than imap path (with pointers from the imap path of course) so that a folder rename turns into a symlink delete (well, replacement with one having an empty target anyway) and a symlink create in the tar file.

Bron ( and right now I'm running the process to finish the upgrade from MD5 based to SHA1 based internal identifiers in the backup system, since all our indexes are upgraded )
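The lock ordering described in the message above (cyrus.header, then cyrus.index, then cyrus.expunge if it exists) might look roughly like this with fcntl read locks. This is a minimal sketch and not code from the backup system or from Cyrus: the function names, the fixed-size path buffer and the error handling are all assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Blocking whole-file read lock; returns 0 on success, -1 on error. */
static int read_lock(int fd)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_RDLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;                   /* 0 = lock through end of file */
    return fcntl(fd, F_SETLKW, &fl);
}

/* Close in reverse order; closing a descriptor drops its fcntl lock. */
void unlock_mailbox_files(int fds[3])
{
    int i;

    for (i = 2; i >= 0; i--) {
        if (fds[i] >= 0) close(fds[i]);
        fds[i] = -1;
    }
}

/*
 * Open and read-lock the three files, always in the same order so that
 * two lockers cannot deadlock against each other. fds[2] stays -1 when
 * there is no cyrus.expunge. Returns 0 on success, -1 on failure (in
 * which case nothing is left open).
 */
int lock_mailbox_files(const char *dir, int fds[3])
{
    static const char *const names[3] = {
        "cyrus.header", "cyrus.index", "cyrus.expunge"
    };
    char path[4096];
    int i;

    for (i = 0; i < 3; i++) fds[i] = -1;

    for (i = 0; i < 3; i++) {
        snprintf(path, sizeof(path), "%s/%s", dir, names[i]);
        fds[i] = open(path, O_RDONLY);
        if (fds[i] < 0) {
            if (i == 2 && errno == ENOENT) break;  /* expunge is optional */
            goto fail;
        }
        if (read_lock(fds[i]) < 0) goto fail;
    }
    return 0;

fail:
    unlock_mailbox_files(fds);
    return -1;
}
```

A caller would stream the files to the backup server between lock_mailbox_files() and unlock_mailbox_files(); holding the header lock for the whole time is what stops the mailbox being deleted and recreated mid-copy.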
Re: v2.3.10 build fails, pcreposix problems
On Wed, Nov 14, 2007 at 12:16:29PM -0500, Larry Rosenbaum wrote:
> Upgrading PCRE to v7.4 fixed the problem. (Previous version was v6.5)

Excellent - because I was a little surprised otherwise - then again I only tested on Debian Etch because that's all I have handy.

Ken - I think the solution is to put: #include directly before #include in sieve/comparator.h

I tested that that works fine on my systems, and looks like it will also work on systems with older PCRE that don't do the include themselves.

RE: your other question - I guess it would be reasonably easy to add a --disable-pcre to configure so that it never gets tested for or included, even if it is installed.

Bron.
Re: LARGE single-system Cyrus installs?
On Thu, Nov 15, 2007 at 01:29:54PM -0500, Wesley Craig wrote:
> On 14 Nov 2007, at 23:15, Vincent Fox wrote:
> > We have all Cyrus lumped in one ZFS pool, with separate filesystems for
> > imap, mail, sieve, etc. However, I do have an unused disk in each array
> > such that I could setup a simple ZFS mirror pair for /var/cyrus/imap so
> > that the databases are in their own pools. Or even I suppose a UFS
> > filesystem with directio and all that jazz set.
>
> About 30% of all I/O is to mailboxes.db, most of which is read. I
> haven't personally deployed a split-meta configuration, but I
> understand the meta files are similarly heavy I/O concentrators.

Which is a good argument for checkpointing it (gah, hate that term - it's so non-specific. I've spent some time working on terminology maps for this stuff, and "repack" is the current winner, mainly due to being shorter than the runner up "consolidate")

What was I saying again? Oh - yes. Current skiplist metric is that the mailboxes.db has to be twice the size of the last checkpointed size plus 16k before it re-checkpoints. Given that a checkpoint takes approximately 2 seconds on our systems, and it means that you're not seeking all over the place any more, it would almost certainly be a win.

That said, we don't have a single machine where the memory pressure is tight enough to ever push mailboxes.db out of the cache, so it's not ever going to be hitting the disk for reads anyway!

Bron.
Re: LARGE single-system Cyrus installs?
On Mon, Nov 19, 2007 at 08:50:16AM +, Ian G Batten wrote:
>
> On 17 Nov 07, at 0909, Rob Mueller wrote:
>>
>> This shouldn't really be a problem. Yes the whole file is locked for the
>> duration of the write, however there should be only 1 fsync per
>> "transaction", which is what would introduce any latency. The actual writes
>> to the db file itself should be basically instant as the OS should just
>> cache them.
>
> One thing that's worth noting for ZFS-ites is that on ZFS, you can have
> multiple writer threads in a file simultaneously, which UFS can only do for
> directio under certain conditions I can't recall. That's a win for
> overlapping transactions into a file-based database. We're not hitting
> mailboxes.db remotely rapidly enough for this to be an issue, but I can
> imagine it being so for big shops.
>
> In production releases of ZFS fsync() essentially triggers sync() (fixed in
> Solaris Next). So if you anticipate a lot of writes (and hence fsync()s)
> to mailboxes.db then you don't want mailboxes.db in the same ZFS filesystem
> as things with lots of un-sync'd writes going on. I've broken up
> /var/imap for ease of taking and rolling back snapshots, but it has the
> handy side-effect of isolating deliver.db and mailboxes.db from all the
> metadata partitions.

Skiplist requires two fsync calls per transaction (single untransactioned actions are also one transaction), and it also locks the entire file for the duration of said transaction, so you can't have two writes happening at once. I haven't built Cyrus on our Solaris box, so I don't know if it uses fcntl there, it certainly does on the Linux systems, but it can fall back to flock if fcntl isn't available.

> In my darker moments, by the way, I'm tempted to put deliver.db into tmpfs.
> For planned reboot I could copy it somewhere stable, and I could
> periodically dump it out to disk. But if I lost it, the consequences
> aren't serious, and it's most of the write load through that particular
> filesystem.

Sounds pretty reasonable to me.

>>
>> Still, you have a point that mailboxes.db is a global point of contention,
>> and it is accessed a lot, so blocking all processes on it for a write could be
>> an issue.
>
>>
>> Which makes me even more glad that we've split up our servers into lots of
>> small cyrus instances, even less points of contention...

Yeah, it's nice. It's a pain that the entire mailboxes.db blocks on writes, but it sure keeps the skiplist format simple. I'd be interested to see if there are cases where a transaction is kept open longer than it needs to be though.

Bron.
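To make the "whole-file lock plus two fsyncs" pattern from the message above concrete, here is a minimal sketch of an append-only transaction. It is not the skiplist code: the commit marker bytes, the function names and the exact placement of the two fsync calls (one after the record data, one after the commit marker) are assumptions chosen to illustrate the general shape.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Take (F_WRLCK) or release (F_UNLCK) a blocking lock on the whole file. */
static int whole_file_lock(int fd, short type)
{
    struct flock fl;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;         /* start 0, len 0: the entire file */
    return fcntl(fd, F_SETLKW, &fl);
}

/*
 * Append one record as a transaction: lock, write the data, fsync,
 * write a commit marker, fsync again, unlock. Nobody else can write
 * while the lock is held, so writers are fully serialised.
 * Returns 0 on success, -1 on error.
 */
int append_transaction(int fd, const void *rec, size_t len)
{
    static const char commit[4] = { 'C', 'M', 'I', 'T' };  /* made-up marker */
    int r = -1;

    if (whole_file_lock(fd, F_WRLCK) < 0) return -1;

    if (lseek(fd, 0, SEEK_END) < 0) goto done;
    if (write(fd, rec, len) != (ssize_t)len) goto done;
    if (fsync(fd) < 0) goto done;           /* first fsync: data is on disk */
    if (write(fd, commit, sizeof(commit)) != (ssize_t)sizeof(commit)) goto done;
    if (fsync(fd) < 0) goto done;           /* second fsync: commit is on disk */
    r = 0;

done:
    whole_file_lock(fd, F_UNLCK);
    return r;
}
```

The point of the sketch is the cost model: every transaction pays for two synchronous flushes while holding the only lock, which is why the filesystem's fsync behaviour matters so much for mailboxes.db.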
Re: LARGE single-system Cyrus installs?
On Tue, 20 Nov 2007 15:40:58 +1100, "Andrew McNamara" <[EMAIL PROTECTED]> said:
> >> In production releases of ZFS fsync() essentially triggers sync() (fixed in
> >> Solaris Next).
> [...]
> >Skiplist requires two fsync calls per transaction (single
> >untransactioned actions are also one transaction), and it
> >also locks the entire file for the duration of said
> >transaction, so you can't have two writes happening at
> >once. I haven't built Cyrus on our Solaris box, so I don't
> >know if it uses fcntl there, it certainly does on the Linux
> >systems, but it can fall back to flock if fcntl isn't
> >available.
>
> Note that ext3 effectively does the same thing as ZFS on fsync() - because
> the journal layer is block based and does not know which block belongs
> to which file, the entire journal must be applied to the filesystem to
> achieve the expected fsync() semantics (at least, with data=ordered,
> it does).

Lucky we run reiserfs then, I guess...

Bron.
--
Bron Gondwana
[EMAIL PROTECTED]
Re: LARGE single-system Cyrus installs?
On Mon, 19 Nov 2007 22:51:43 -0800, "Vincent Fox" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> > Lucky we run reiserfs then, I guess...
> >
>
> I suppose this is inappropriate topic-drift, but I wouldn't be
> too sanguine about Reiser. Considering the driving force behind
> it is in a murder trial last I heard, I sure hope the good bits of that
> filesystem get turned over to someone who gives it a future.

There are a bunch of people who know a fair bit about it and have been happy to help debug issues, including quite recently. Besides, it's pretty stable now and isn't bitrotting too badly.

That said, we're hanging out for btrfs to be stable - It would be nice, and it's sort of inherited a bit from zfs and a bit from reiserfs in its ways of doing things.

Bron ( running local Maildirs on it right now, synced with offlineimap to FM. I wouldn't dream of running it in production yet - it dies horribly if you ever fill it more than about 70% )
--
Bron Gondwana
[EMAIL PROTECTED]
Re: 2.3.10 Upgrade Question
On Tue, 20 Nov 2007 12:05:43 +0100 (CET), "Simon Matter" <[EMAIL PROTECTED]> said:
> > In a NON-replicated setup, do the changes to the GUID have an
> > impact? Can I just put 2.3.10 on with a quick restart of the
> > mailsystem, or is there More To It?
> >
> > I have 1.7TB of mail, about 40K mailboxes, about 10 million pieces of
> > mail. So I don't want to do an upgrade which will kick off some huge
> > rebuild-fest without planning.
>
> I have a server with ~ half the size and upgrade has worked without doing
> anything special.

The index files are pretty small, and they rebuild fast :) They get streamed into new copies every single expunge anyway.

Also, it will only upgrade each index as it is opened for the first time, so the load isn't one giant hit.

Bron.
--
Bron Gondwana
[EMAIL PROTECTED]
Re: 2.3.10 Upgrade Question
On Tue, 20 Nov 2007 11:54:24 +, "Ian G Batten" <[EMAIL PROTECTED]> said:
>
> On 20 Nov 07, at 1146, Bron Gondwana wrote:
>
> >
> > The index files are pretty small, and they rebuild fast :) They
> > get streamed into new copies every single expunge anyway.
>
> What's involved in the rebuild? I have users with tens of thousands
> of messages in a single mailbox, so delaying that open while it reads
> and indexes a few gigs is bad. Does the rebuild need to open every
> message? Read headers? Bodies? What?

No - thankfully everything you need is in the index record itself. If you want the new sha1 GUIDs you need to reconstruct with the -G flag, but that's not required for the upgrade. The upgrade just pads the GUID field with zeros.

It's really quick, we had folders with 200k messages in them, and the upgrade was only a couple of seconds on a pretty busy server. It's still only 96 bytes per record or something, so that's pretty much streaming what, 18MB or something - hardly giant given that it's all sequential reads and writes, and the CPU isn't doing much work.

Bron.
--
Bron Gondwana
[EMAIL PROTECTED]
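The size estimate in the reply above is easy to check. The sketch below just multiplies the two figures quoted in the message (200k records, roughly 96 bytes each); the function name is invented for the example, and the index header is ignored because its size is not given in the thread.

```c
#include <stdint.h>

/* Bytes streamed when an index of `records` fixed-size records is
 * rewritten once, ignoring the file header. */
uint64_t index_stream_bytes(uint64_t records, uint64_t record_size)
{
    return records * record_size;
}
```

index_stream_bytes(200000, 96) comes to 19,200,000 bytes, a little over 18 MiB, which matches the rough figure in the message.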
FastMail.FM Patchset Updated
As usual you can get the patches here:

http://cyrus.brong.fastmail.fm/

I've been busy with Cyrus _again_ - so much for my theory that I was taking a break. OK - here's what's new.

* http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-bugfixes-2.3.10.diff
  http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-robustify-2.3.10.diff

  Skiplist issues - there were two things that could be found in recovery that actually bit us during the whole "restart every single store with the new skiplist code" project the other day. ADD where the record already existed and DELETE where it didn't. The latter also had an obnoxious bug where it would instead delete _the_alphabetically_NEXT_record_ silently. Ooops.

  I rolled these two into my bugfix and robustify patches, not realising Ken had already applied the previous copies upstream. Ken - do you want me to break this out as a separate patch on top of the others?

* http://cyrus.brong.fastmail.fm/patches/cyrus-sync-renamedmailbox-2.3.10.diff

  DelayedDelete of entire users was causing excess copying. This fixes it, but the solution is less than ideal and causes excess messages about folders not existing during an account create. Annoying. I'd like a better fix, but this is enough for now. Found this one after fixing...

* http://cyrus.brong.fastmail.fm/patches/cyrus-deletemode-userfix-2.3.10.diff

  This is for upstream. I made a bogus design decision in the DelayedDelete code that Ken accepted, and it was causing bailouts and all sorts of yuckiness. Made the conditions for allowing a folder rename into the DELETED. namespace a lot more explicit and correct rather than DELETED.user.foo.TIMESTAMP being considered a user's mailbox! The user cleanup script no longer causes massive bailouts on sync.

* http://cyrus.brong.fastmail.fm/patches/cyrus-expunged-nocache-2.3.10.diff

  A little thing to shut up the issue that used to cause segfaults and now just causes logging instead. Cache offsets in the .expunge file can be bogus for deeper architectural reasons. Rather than fix the underlying reasons I just ignore them completely when running cyr_expire. At least that way we're not reading bogus cache records.

* http://cyrus.brong.fastmail.fm/patches/cyrus-fastrename-2.3.9.diff

  UPDATED. It turns out it really doesn't matter what YOU can see when you're checking if you can use FastRename. It matters if there are subfolders at all. Change to passing isadmin true and not passing the username to mboxlist_count_inferiors(). Also need to check if the target path has inferiors to avoid log messages and partial move failures that have to back out. Much nicer this way.

  This means fastrename on replicas isn't totally broken any more (before, it would never see the subfolders because the replication user didn't have ACLs on them and isadmin was being set to false explicitly)

Ken - I'd love to see the deletemode-userfix and skiplist stuff go upstream. I know you're not happy with fastrename yet, and fair enough - it's an extra risk and if a shutdown happens in the middle of the operation things can get very confused! The other two patches are not really long-term good for the Cyrus codebase so I'd prefer to fix the underlying issues instead :)

Bron.
Re: FastMail.FM Patchset Updated
On Wed, 21 Nov 2007 07:40:21 +0100 (CET), "Simon Matter" <[EMAIL PROTECTED]> said:
> Hi Bron,
>
> Did you consider this one
> http://bugzilla.andrew.cmu.edu/show_bug.cgi?id=3006 in the patch above?
> From a quick look it seems both patches conflict, is #3006 obsolete now?

You're right, they probably do conflict and Ken has applied 3006. I should have checked on that. Hmm...

Actually, my patch doesn't even fix #3006. It does sort of suck to be working from a stable past point rather than the latest CVS actually when I do these patches - though it's nice from a stability point of view.

OK - Ken, would you like me to re-build this patch on top of current CVS, or would you like to do it yourself? My patch has the replication bugfixes which are still worth doing, but Simon's logic is actually correct for the test while mine isn't (though I renamed the function he's using...)

Bron.
--
Bron Gondwana
[EMAIL PROTECTED]
Re: reconstruct -u or -U
On Wed, Nov 21, 2007 at 03:06:39PM -0800, Andrew Morgan wrote:
> The changelog says:
>
>   Added -u and -U options to reconstruct -- courtesy of David Carter.
>
> But I can't find those options listed in the manpage or the built-in help
> of reconstruct. What do those options do?

The changelog needs to be updated. They got renamed as -g and -G. They either wipe or calculate the message GUID (globally unique identifier, we hope!) which is the sha1 of the message contents now.

Bron.
Re: Murder in replicated mode
On Wed, Nov 21, 2007 at 09:41:15PM -0300, Diego Woitasen wrote:
> Hi!
> I'm trying to setup murder in replicated mode. My schema is:
>
> -Two servers + one shared storage
> -Redhat Cluster Suite (RHEL 5.1) with GFS2 working.
> -Cyrus 2.3.10 in both servers working.
> -spool and sieve directories on GFS
> -config dir on local filesystems.
>
> I have configured Cyrus in replicated mode but when I create an
> account in the master server, it isn't replicated unless Cyrus
> is restarted on either node. It isn't an authentication problem,
> when I create an account with cyradmin mupdate does nothing.

From memory you may have to actually deliver a message to one of the user's folders before it will replicate. At least that's what the docs suggested. Of course we always LMTP deliver a message right at creation time, so we never bothered to check this.

Bron.
Re: FastMail.FM Patchset Updated
On Wed, Nov 21, 2007 at 06:37:17AM -0500, Ken Murchison wrote:
> Bron Gondwana wrote:
>> On Wed, 21 Nov 2007 07:40:21 +0100 (CET), "Simon Matter"
>> <[EMAIL PROTECTED]> said:
>>> Hi Bron,
>>>
>>> Did you consider this one
>>> http://bugzilla.andrew.cmu.edu/show_bug.cgi?id=3006 in the patch above?
>>> From a quick look it seems both patches conflict, is #3006 obsolete now?
>> You're right, they probably do conflict and Ken has applied 3006. I should
>> have checked on that. Hmm...
>> Actually, my patch doesn't even fix #3006. It does sort of suck to be working
>> from a stable past point rather than the latest CVS actually when I do these
>> patches - though it's nice from a stability point of view.
>> OK - Ken, would you like me to re-build this patch on top of current CVS, or
>> would you like to do it yourself? My patch has the replication bugfixes which
>> are still worth doing, but Simon's logic is actually correct for the test
>> while mine isn't (though I renamed the function he's using...)
>
> If you have the time. I'm off banging on other things at the moment.

Attached are tidied up versions of both my earlier patches on top of what you've already got in CVS. They apply fine on top of current CVS trunk:

[EMAIL PROTECTED]:/work/cvs/cyrus-imapd# patch -p1 < /work/cyrus-imapd-2.3.10/patches/cyrus-skiplist-newfixes-2.3.10.diff
patching file lib/cyrusdb_skiplist.c
Hunk #1 succeeded at 2113 (offset -1 lines).
Hunk #2 succeeded at 2140 (offset -1 lines).
Hunk #3 succeeded at 2166 (offset -1 lines).
Hunk #4 succeeded at 2206 (offset -1 lines).
[EMAIL PROTECTED]:/work/cvs/cyrus-imapd# patch -p1 < /work/cyrus-imapd-2.3.10/patches/cyrus-deletemode-userfix-2.3.10.diff
patching file imap/mboxname.c
patching file imap/mboxlist.c
patching file imap/mboxlist.h
patching file imap/mboxname.h
patching file imap/imapd.c
Hunk #1 succeeded at 5006 (offset 2 lines).
Hunk #2 succeeded at 5098 (offset 2 lines).

Bron.
Re: FastMail.FM Patchset Updated (new patches)
On Wed, Nov 21, 2007 at 06:37:17AM -0500, Ken Murchison wrote:
> Bron Gondwana wrote:
>> On Wed, 21 Nov 2007 07:40:21 +0100 (CET), "Simon Matter"
>> <[EMAIL PROTECTED]> said:
>>> Hi Bron,
>>>
>>> Did you consider this one
>>> http://bugzilla.andrew.cmu.edu/show_bug.cgi?id=3006 in the patch above?
>>> From a quick look it seems both patches conflict, is #3006 obsolete now?
>> You're right, they probably do conflict and Ken has applied 3006. I should
>> have checked on that. Hmm...
>> Actually, my patch doesn't even fix #3006. It does sort of suck to be working
>> from a stable past point rather than the latest CVS actually when I do these
>> patches - though it's nice from a stability point of view.
>> OK - Ken, would you like me to re-build this patch on top of current CVS, or
>> would you like to do it yourself? My patch has the replication bugfixes which
>> are still worth doing, but Simon's logic is actually correct for the test
>> while mine isn't (though I renamed the function he's using...)
>
> If you have the time. I'm off banging on other things at the moment.

I guess I should attach them!

Bron.

Fix Delayed Delete replication

Candidate for upstream - now respun on top of Simon Matter's allowusermoves fixes from CVS

Deleting user.foo was broken on replicas thanks to me choosing a bad method for making it work. Also, it was leaving foo.TIMESTAMP.sub files because it considered that to be a valid "user". Oops.

This patch takes a different approach (on top of the delayed delete code already in Cyrus 2.3.10):

1) mboxname_isusermailbox() no longer returns true for DELETED.user.foo.TIMESTAMP
2) The "can I rename this" code checks for the target name being either another username or something in the DELETED. namespace.
2a) mboxname_isdeletedmailbox() rather than the somewhat (IMHO) misplaced mboxlist_in_deletedhierarchy().
3) if "forceuser" is passed (only done by sync_server), then all ACL checks are completely bypassed, so the rename will always succeed on the replica.

Index: cyrus-imapd-2.3.10/imap/mboxname.c
===================================================================
--- cyrus-imapd-2.3.10.orig/imap/mboxname.c 2007-11-22 01:22:08.0 -0500
+++ cyrus-imapd-2.3.10/imap/mboxname.c 2007-11-22 01:29:17.0 -0500
@@ -600,35 +600,47 @@
 {
     const char *p;
     const char *start = name;
-    const char *deletedprefix = config_getstring(IMAPOPT_DELETEDPREFIX);
-    size_t len = strlen(deletedprefix);
-    int isdel = 0;

     /* step past the domain part */
     if (config_virtdomains && (p = strchr(start, '!')))
         start = p + 1;

-    /* step past any deletedprefix */
-    if (mboxlist_delayed_delete_isenabled() && strlen(start) > len+1 &&
-        !strncmp(start, deletedprefix, len) && start[len] == '.') {
-        start += len + 1;
-        isdel = 1; /* there's an add'l sep + hextimestamp on isdel folders */
-    }
-
     /* starts with "user." AND
      * we don't care if it's an inbox OR
-     * there's no dots after the username OR
-     * it's deleted and there's only one more dot
+     * there's no dots after the username
      */
-    if (!strncmp(start, "user.", 5) &&
-        (!isinbox || !strchr(start+5, '.') ||
-         (isdel && (p = strchr(start+5, '.')) && !strchr(p+1, '.'))))
-        return (char*) start+5; /* could have trailing bits if isinbox+isdel */
+    if (!strncmp(start, "user.", 5) && (!isinbox || !strchr(start+5, '.')))
+        return (char*) start+5;
     else
         return NULL;
 }

 /*
+ * If (internal) mailbox 'name' is a DELETED mailbox
+ * returns boolean
+ */
+int mboxname_isdeletedmailbox(const char *name)
+{
+    static const char *deletedprefix = NULL;
+    static int deletedprefix_len = 0;
+    int domainlen = 0;
+    char *p;
+
+    if (!mboxlist_delayed_delete_isenabled()) return(0);
+
+    if (!deletedprefix) {
+        deletedprefix = config_getstring(IMAPOPT_DELETEDPREFIX);
+        deletedprefix_len = strlen(deletedprefix);
+    }
+
+    if (config_virtdomains && (p = strchr(name, '!')))
+        domainlen = p - name + 1;
+
+    return ((!strncmp(name + domainlen, deletedprefix, deletedprefix_len) &&
+             name[domainlen + deletedprefix_len] == '.') ? 1 : 0);
+}
+
+/*
  * Translate (internal) inboxname into corresponding userid.
  */
 char *mboxname_inbox_touserid(const char *inboxname)
Index: cyrus-imapd-2.3.10/imap/mboxlist.c
===================================================================
--- cyrus-imapd-2.3.10.orig/imap/mboxlist.c 2007-11-22 01:29:03.0 -0500
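The prefix test that mboxname_isdeletedmailbox() performs in the patch above can be tried out on its own. The following is a standalone restatement for experimenting, not the Cyrus function: the prefix and the virtual-domains switch are passed in as parameters (standing in for the config lookups), the delayed-delete check is left out, and the function name is hypothetical.

```c
#include <string.h>

/*
 * Returns 1 if the internal mailbox name, after an optional "domain!"
 * part, starts with "<prefix>." and 0 otherwise.
 */
int is_deleted_mailbox(const char *name, const char *prefix, int virtdomains)
{
    size_t prefix_len = strlen(prefix);
    const char *p;

    /* step past the domain part, as the patch does with domainlen */
    if (virtdomains && (p = strchr(name, '!')))
        name = p + 1;

    /* name[prefix_len] is only read after the whole prefix matched,
     * so it is at worst the terminating NUL */
    return !strncmp(name, prefix, prefix_len) && name[prefix_len] == '.';
}
```

With the prefix set to "DELETED", a name such as DELETED.user.foo.4745A2C0 matches while user.foo does not, which is the distinction the rename check relies on.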
Re: Murder in replicated mode
On Thu, Nov 22, 2007 at 11:01:30PM -0300, Diego Woitasen wrote:
> On Thu, Nov 22, 2007 at 12:50:47PM +1100, Bron Gondwana wrote:
> > On Wed, Nov 21, 2007 at 09:41:15PM -0300, Diego Woitasen wrote:
> > > Hi!
> > > I'm trying to setup murder in replicated mode. My schema is:
> > >
> > > -Two servers + one shared storage
> > > -Redhat Cluster Suite (RHEL 5.1) with GFS2 working.
> > > -Cyrus 2.3.10 in both servers working.
> > > -spool and sieve directories on GFS
> > > -config dir on local filesystems.
> > >
> > > I have configured Cyrus in replicated mode but when I create an
> > > account in the master server, it isn't replicated unless Cyrus
> > > is restarted on either node. It isn't an authentication problem,
> > > when I create an account with cyradmin mupdate does nothing.
> >
> > From memory you may have to actually deliver a message to one of
> > the user's folders before it will replicate. At least that's what
> > the docs suggested. Of course we always LMTP deliver a message
> > right at creation time, so we never bothered to check this.
> >
> > Bron.
>
> I tried with that, but doesn't work. I delivered a message in both
> servers and nothing. Again, the mailbox was replicated when I restart
> Cyrus.
>
> Other idea?
>
> May be I should start to read the mupdate code ... :)

Is rolling replication actually running? As in - do existing mailboxes continue to be replicated?

Bron.
Re: Restricting admin logins
On Thu, Nov 29, 2007 at 03:54:29PM +0100, Alain Spineux wrote:
> On Nov 29, 2007 3:15 PM, Andy Fiddaman <[EMAIL PROTECTED]> wrote:
> >
> > At the moment we patch the Cyrus IMAP server source so that administrators
> > (admins in the config file) can only log in from certain IP addresses.
> >
> > I was wondering if there is a better way to do this or whether some means
> > of achieving this is planned for future releases?
>
> Yes have 3 imapd.conf, all common option in one imapd_common.conf
> and @include this file in the two other with different admins options
> Then start two different port and some firewall rules to achieve your need.

Hey, that's a pretty funky idea :)

We use a nginx proxy with an authentication daemon which rejects all login attempts as admin. Our imap machines are firewalled so that the only ways you can talk to them are imap or pop via the nginx proxy or send incoming emails to our mxes which will inject them via lmtp to the spam scanning machines which do the final delivery.

I do like the different configs for a simpler network layout in a smaller system though. Very clever!

Bron.
Re: Cyrus upgrade from 2.1.18 to 2.2.13 moved email messages
On Sun, Dec 02, 2007 at 08:51:51PM +0100, Steinar Bang wrote:
> > Steinar Bang <[EMAIL PROTECTED]>:
> > > Steinar Bang <[EMAIL PROTECTED]>:
> > Sebastian Hagedorn <[EMAIL PROTECTED]>:
> > What previously was mail/s/user/sb/ is now mail/u/s/user/sb/
>
> Here's what I think happened.
>
> I've had this setting since upgrading from 1.5.19 to 2.1.11 in 2002:
>
> > hashimapspool: true
>
> I.e. it's not new.
>
> So when I ran this command, meant for an upgrade from 1.5.* to 2.*,
> things were messed up:
>
> > $ /usr/lib/cyrus/upgrade/rehash basic
>
> The rehash script expected a 1.5 structure to work with, and when I fed
> it a 2.1 structure, it moved the wrong directories to the wrong place.
>
> So the question is what I can do to fix it...?
>
> If I move
> mail/u/user/s/sb /
> to
> mail/s/user/sb/
> would that fix things...?

Yes. rehash isn't the nicest script in the whole world. I have a
much nicer one that's part of the userhash patch at FastMail.
Unfortunately it doesn't (yet) completely clean up after itself, so
I need to do a bit more work before I'm happy to recommend it as a
complete replacement.

Bron.
Re: DBERROR
On Thu, Dec 06, 2007 at 05:56:29PM +0100, Alain Spineux wrote:
> On Dec 6, 2007 5:03 PM, Jeff Blaine <[EMAIL PROTECTED]> wrote:
> > bash-2.05# ls -l *db
> > -rw--- 1 cyrusmail 144 Dec 5 16:56 annotations.db
> > -rw--- 1 cyrusmail 144 Dec 5 16:56 deliver.db
> > -rw--- 1 cyrusmail 144 Dec 5 16:56 mailboxes.db
>
> 144 bytes! Not a lot!

It is precisely the size of an empty skiplist file.
Re: Cyrus on Solaris at universities?
On Thu, Dec 13, 2007 at 11:46:19AM +0200, Janne Peltonen wrote:
> On Thu, Dec 13, 2007 at 08:26:03AM +0100, Rudy Gevaert wrote:
> > Vincent Fox wrote:
> > > Just wondering what other universities are running Cyrus on Solaris?
> > >
> > > We know of:
> > > CMU
> > > UCSB
> >
> > We have run it, but switched to GNU/Linux about a year and a half ago.
>
> Same here, except the switch was half a year ago (and long overdue).
>
> That said, the main reason we switched to Linux wasn't that there was
> anything wrong with Solaris - but we only have one staff member these
> days who really knows his way around Solaris, whereas there are many
> people with good to excellent Linux competence.

The other reason I'm really happy with Linux is that I don't think I
could post a casual "we've got this weird issue with a recent release
that didn't exist before" to an unrelated topic on a Solaris list and
have the Linus equivalent respond within half an hour explaining what
has probably caused it, giving a couple of patches to try, and pulling
the other main "owners" of that block of code into the discussion.

And then others on the list guiding me through converting those
snippets of code into a supportable, maintainable patch that adds a
/proc toggle to alter the behaviour of the kernel for what we need.
It will probably be in .25 if it survives the -mm process.

Bron ( nowhere near as much as the amount of code I've written for
       Cyrus this year though! )
Re: Cyrus on Solaris at universities?
On Thu, Dec 13, 2007 at 04:22:45PM -0800, Vincent Fox wrote:
> Bron Gondwana wrote:
> > (Linux comments)
>
> 'Twas not my intent to start a this-OS versus that-OS comparison.
> Valid though that is, it's a different thread.
>
> Like most sites, we have various OSes in operation here; it just
> happens that the Cyrus backends are Solaris. The test project
> here started with spare Sun hardware after all. And you know,
> after working with ZFS for a while, dealing with fsck on a large
> mail volume is something I'm very glad to leave behind.

Oh yeah - I'll be glad when that's gone as well. That's the biggest
downside to using Linux - and if I had the time and resources I'd be
tempted to try Solaris-on-Intel installs on a couple of machines and
see if it was workable. I know some people are quite happy on FreeBSD
as well - Cyrus really is quite portable C. (I think John Capo uses
FreeBSD?)

Still, after using Linux (especially Debian) package management and
userland, maintaining a Solaris machine really does feel like a step
back into the dark ages in some respects. Having thousands of well
packaged applications at your fingertips is pretty handy, and it's all
well integrated.

I do have one ZFS machine, and I don't use it to anywhere near its
capabilities - it's just backups.

Bron ( advocacy 'r' us! )
Re: Plugging into the imap system
On Sat, Dec 22, 2007 at 04:51:46PM -0300, Diego M. Vadell wrote:
> Hi Gabriele,
> If you are using linux, maybe you can use inotify-tools to notify you of
> any change in cyrus' spool.

Be aware that the files on disk are created _before_ the index record,
so you need to wait or poll until the index record has had a chance to
be created. Even if you capture the "append" event on the cyrus.index
file itself, you still won't see the new record until the 'exists'
record in the index header is updated.

Bron.
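The wait-or-poll advice above could be sketched roughly as follows. This is only an illustration, not Cyrus code: `wait_until`, `wait_for_index` and `read_exists` are hypothetical names, and reading the 'exists' count out of the cyrus.index header is left to a caller-supplied function because the sketch does not parse the index format.

```python
import time
from typing import Callable


def wait_until(predicate: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.1) -> bool:
    """Poll `predicate` until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def wait_for_index(read_exists: Callable[[], int],
                   expected: int,
                   timeout: float = 5.0) -> bool:
    """Wait until the index header reports at least `expected` messages.

    `read_exists` is a hypothetical callback returning the current
    'exists' count; how it reads cyrus.index is up to the caller.
    """
    return wait_until(lambda: read_exists() >= expected, timeout=timeout)
```

A file-creation event from inotify would then trigger `wait_for_index(...)` rather than being treated as "the message is visible now".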
Re: Plaintext only for loopback?
On Sun, Jan 13, 2008 at 01:59:48AM -0500, Chris Pepper wrote:
> Hello,
>
> I want to allow plaintext auth only for SquirrelMail (running on the
> Cyrus IMAPd server), and require encrypted authentication over all
> physical network connections. I see several options governing plaintext
> auth in the documentation for imapd.conf:

Run two imapd instances from cyrus.conf: one on a high port that you
firewall from everywhere but the squirrelmail server, and the other on
the standard port with a config that denies plaintext. Then just point
squirrelmail at the high port in its config.

You just need to specify "-C /etc/imapd-sm.conf" or something for the
squirrelmail one. Personally I would generate both from a template
stored in version control.

Bron.
Re: Plaintext only for loopback?
On Sun, Jan 13, 2008 at 07:09:25PM -0800, Phil Pennock wrote:
> It's been a little while since I've done this, so I'm not absolutely
> sure of the details, but if memory serves ...
>
> Run two different IMAP services from cyrus.conf:
>
> SERVICES {
>   imap      cmd="imapd" listen="imap.example.org:imap" prefork=1
>   imaplocal cmd="imapd" listen="localhost:imap" prefork=2
> }
>
> In imapd.conf you can prefix a configuration directive with the service
> name, where the service name is exactly what you specified in SERVICES;
>
> imaplocal_allowplaintext: 1
>
> All possibly wrong, as I say; worth a try though.

Yeah, that's probably actually a better idea than mine in an
environment where you don't have pervasive version control and
templated config file generation in place. Without those, it would be
too easy for the two files to get out of sync when someone hand edits
one of them - with all the debugging nightmare and strange effects
that would follow.

Bron.
Re: segfault of imapd and pop3d
On Tue, Jan 15, 2008 at 09:18:27AM +0100, Michael Menge wrote:
> Hi,
>
> I found some segmentation faults of my "imapd -s" and "pop3d -s" processes
> in my logs. Has anybody seen this before? I'm running cyrus 2.3.8 on
> SLES10 x86_64.
>
> Jan 14 09:37:44 mailserv08 kernel: imapd[1898]: segfault at 2b9042669978 rip 004805d4 rsp 7fffdf6b51c0 error 4
> Jan 14 10:11:44 mailserv08 kernel: imapd[32213]: segfault at 2b1c29b16978 rip 004805d4 rsp 781f9d00 error 4
> Jan 14 10:13:47 mailserv08 kernel: imapd[30415]: segfault at 2b6440346978 rip 004805d4 rsp 7fffe19d4360 error 4
> Jan 14 10:17:51 mailserv08 kernel: imapd[30420]: segfault at 2adb7dd67978 rip 004805d4 rsp 7fffa3fb3ab0 error 4
> Jan 14 10:53:18 mailserv08 kernel: imapd[2072]: segfault at 2ae8ce23e978 rip 004805d4 rsp 7fff5368d190 error 4
> Jan 14 11:07:14 mailserv08 kernel: pop3d[3402]: segfault at 2b29a2f6a978 rip 0043fe4c rsp 7fff7e8fc510 error 4
> Jan 14 11:33:41 mailserv08 kernel: imapd[3797]: segfault at 2b5a2e971978 rip 004805d4 rsp 733aaeb0 error 4
> Jan 14 12:00:20 mailserv08 kernel: pop3d[5015]: segfault at 2ad8925f3978 rip 0043fe4c rsp 7fff8f72b340 error 4
>
> The last corresponding entry in the cyrus logs is always "accepted connection"
>
> Jan 14 11:33:41 mailserv08 imaps[3797]: accepted connection
> Jan 14 12:00:20 mailserv08 pop3s[5015]: accepted connection

Sounds like a corrupted mailbox to me - there's heaps of stuff where
cyrus mmaps a file and blindly goes adding offsets read from it to the
mmap base and accessing the memory there. Recipe for segfaults if I
ever saw one.

Do you know which user is logging in when this happens? They're rare
enough that I assume it's not every user on the box!

Bron.
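To illustrate the failure mode described above (an offset read from a file being trusted without checking it), here is a small sketch of the defensive alternative. It is not how Cyrus reads its files: the function names are made up for this example, and the big-endian 32-bit layout is just an assumption to keep the sketch concrete.

```python
import struct


class CorruptFileError(Exception):
    """Raised when an offset read from the file points outside it."""


def read_u32_at(buf: bytes, offset: int) -> int:
    """Read a big-endian 32-bit integer at `offset`, refusing to run off the end."""
    if offset < 0 or offset + 4 > len(buf):
        raise CorruptFileError(f"offset {offset} outside {len(buf)}-byte file")
    return struct.unpack_from(">I", buf, offset)[0]


def follow_pointer(buf: bytes, pointer_pos: int) -> int:
    """Read an offset stored at `pointer_pos`, then the value it points to.

    Both reads are bounds-checked, so a garbage offset from a corrupted
    file raises CorruptFileError instead of touching unrelated memory.
    """
    target = read_u32_at(buf, pointer_pos)
    return read_u32_at(buf, target)
```

In C the unchecked version of `follow_pointer` is exactly the "add the offset to the mmap base and dereference" pattern, which is where the segfault would come from.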
Re: segfault of imapd and pop3d
On Tue, 15 Jan 2008 16:33:50 +0100, "Michael Menge" <[EMAIL PROTECTED]> said:
> Quoting Bron Gondwana <[EMAIL PROTECTED]>:
>
> Syslog is logging Cyrus at debug level, and the last line for that
> process is "accepted connection". Normally this line is followed by a
> STARTTLS line and a LOGIN line.
> proc/pid shows the host they are connected with. These IPs are
> different for most processes and I see other successful logins from
> these hosts.
>
> So I don't think it's a corrupt mailbox. Maybe the tls_sessions.db is
> corrupt.
> Is there a way to check whether a skiplist file is corrupted?

Personally I like

sudo -u cyrus cyr_dbtool $file skiplist show > /dev/null

This does a "foreach" over the entire file, which should find any
issues. If there was an exposed way to run "myconsistent" on the file
that would be nicer, but the cyrus DB interface doesn't expose all the
internals and I've been loath to fiddle with it since there are a few
different DB modules and I'd have to change all of them.

Of course if you can shut the cyrus instance down you could probably
just delete that file.

Bron.

P.S. I've attached another external tool that can find some
interesting things. It needs the attached module to be in your perl
lib somewhere too.

--
Bron Gondwana
[EMAIL PROTECTED]

skiplist_detail.pl
Description: Perl program

Skiplist.pm
Description: Perl program
Re: Move to new server
On Tue, 19 Feb 2008 14:13:38 +0100, "Paul van der Vlis" <[EMAIL PROTECTED]> said:
> Adam Tauno Williams wrote:
> >> I want to move all mail to a new server. Old server has Cyrus 2.1.18
> >> (Debian Sarge), new server has Cyrus 2.2.13 (Debian Etch).
> >> In the past, I just copied all files in
> >> /var/spool/cyrus/
> >> /var/lib/cyrus
> >> But, is this a good way?
> >
> > It probably works.
>
> That's true, but maybe I'd keep old database formats?
>
> >> Alternative is imapcopy. But I see you need a list of all users and
> >> passwords. That's a lot of work to make (650 users).
> >
> > Or just connect as a user with administrative access. We did a
> > migration with imapcopy, no need to know all the users' passwords.
> >
> >> Isn't it possible to use the admin-user to copy everything?
> >
> > Yep.
>
> Nice to hear, thanks!
>
> Can I use something like this in ImapCopy.cfg?
>
> #
> # List of users and passwords
> #
> # SourceUser SourcePassword DestinationUser DestinationPw
> Copy "cyrus" "cyruspw" "cyrus" "cyruspw"

Does this copy all seen information as well? Seen is per-user in
Cyrus, so you won't see it if the admin user does the copying.

(and I hate losing seen information!)

Bron ( yes, I do have an alternative to suggest )

--
Bron Gondwana
[EMAIL PROTECTED]
Re: Endgame: Cyrus big install at UC Davis
On Tue, 19 Feb 2008 12:50:09 -0800, "Vincent Fox" <[EMAIL PROTECTED]> said:
> So for those of you who recall back that far.
>
> UC Davis switched to Cyrus and as soon as fall quarter started
> and students started hitting our servers hard, they collapsed.
> Load would go up to what SEEMED to be (for a 32-core T2000)
> a moderate value of 5+ and then performance would fall off a cliff.
> People would be getting timed out, overall it was REALLY bad
> here for several days, lots of pressure
>
> We are running Solaris10 u4 and using a ZFS pool for the mail store.

Have you read this?

http://blogs.sun.com/roch/entry/when_to_and_not_to

I was forwarded there via:

http://rlwinm.blogspot.com/2007/10/my-parity-iz-pastede-on-yay.html

which I got to via a slashdot comment pointed out on a mailing list,
etc, etc.

Anyway, it looks quite interesting. Thankfully our only use of ZFS at
the moment is the backup machine, where each user's backup consists of
just two files (a sqlite DB and a .tar.gz). I've already posted to
this list at length about how it works (very well, thank you very
much) and the nice side effect that it detects file corruption
automatically by recalculating the GUID by doing a sha1 on the file as
it goes into the tar, so it can find any underlying issues in the
cyrus spools.

Actually, every so often I'm tempted to turn the sqlite database back
into a flat file and gzip that as well - it would cost a bit more to
read, but the IO would _all_ be streaming then, and we have plenty of
memory. We can still just append the changes.

Bron.

--
Bron Gondwana
[EMAIL PROTECTED]
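The corruption check mentioned above (recompute the sha1 of a message file as it is streamed, and compare it with the recorded GUID) can be sketched like this. It is not the FastMail backup code: the function names are invented for the example, and treating the recorded GUID as a 40-character hex string is an assumption of the sketch.

```python
import hashlib
from typing import BinaryIO


def sha1_of_stream(stream: BinaryIO, chunk_size: int = 64 * 1024) -> str:
    """Return the hex SHA1 of everything readable from `stream`."""
    digest = hashlib.sha1()
    # Read in fixed-size chunks so large messages never sit in memory whole.
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_recorded_guid(stream: BinaryIO, recorded_guid: str) -> bool:
    """Compare a freshly computed SHA1 against a stored GUID (hex, assumed)."""
    return sha1_of_stream(stream) == recorded_guid.lower()
```

A mismatch would mean either the file on disk changed after delivery or the recorded GUID is stale, which is the kind of underlying spool problem the check is meant to surface.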
Re: GUID change to SHA1
On Thu, Feb 21, 2008 at 05:11:48PM +0100, Martin Schweizer wrote:
> Hello
>
> I use FreeBSD 6.3 and cyrus 2.3.11. Below is the manual for the change.
>
> Upgrading from 2.3.9
>
> * The method used for generating Globally Unique IDentifiers used
>   for replication has been changed to be the SHA1 hash of the
>   messages. If you wish to upgrade the existing GUIDs in particular
>   mailbox(es) or the entire server, perform the following steps in
>   the listed order. Note that it is NOT REQUIRED that existing GUIDs
>   be upgraded.
>     1. Zero GUIDs on the replica (reconstruct -g)
>     2. Regenerate GUIDs on the master (reconstruct -G)
>     3. Regenerate GUIDs on the replica (reconstruct -G)
>
> Which is the master and which is the replica? Server 1 or Server 2?
>
> [...]
> Server 1:
> syncclient cmd="/usr/local/cyrus/bin/sync_client -r"

Master

> Server 2:
> syncserver cmd="/usr/local/cyrus/bin/sync_server" listen="csync" prefork=0

Replica

Enjoy,

Bron.
Re: Deleted cyrus.* files
On Thu, Feb 21, 2008 at 11:26:49AM +0100, David Flegl wrote:
> Hi,
>
> > try reconstruct from command line.
> > 1. login as cyrus.
> > 2. /usr/local/cyrus/bin/reconstruct -r user/bad.user
>
> Thanks for the reply. I've tried it, but with no effect. Reconstruct said:
> $>/usr/local/cyrus/bin/reconstruct -r user/[EMAIL PROTECTED]
> domain.cz!user.bad^user: Mailbox has an invalid format
>
> and when I tried this (without domain):
> $>/usr/local/cyrus/bin/reconstruct -r user/bad.user
> $>
> the command gave no response. And no log information.

Try this:

/usr/local/cyrus/bin/reconstruct -rf domain.cz\!user/bad.user

Reconstruct uses the "internal" mailbox format, which is
domain.name\!user/username rather than user/[EMAIL PROTECTED]

Regards,

Bron.
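As a small illustration of the name mapping described above (the external user/name@domain form versus the internal domain!user/name form that reconstruct wants), here is a sketch. The helper names are invented for this example, and it only covers the simple case shown in the reply; it does not try to reproduce every naming rule Cyrus applies.

```python
import shlex


def to_internal_name(mailbox: str) -> str:
    """Convert 'user/name@domain' to 'domain!user/name'.

    Names without a domain part are returned unchanged.
    """
    path, sep, domain = mailbox.rpartition("@")
    if not sep:
        return mailbox
    return f"{domain}!{path}"


def shell_argument(mailbox: str) -> str:
    """Quote the internal name so the shell does not interpret the '!'."""
    return shlex.quote(to_internal_name(mailbox))
```

For example, `to_internal_name("user/bad.user@domain.cz")` gives `domain.cz!user/bad.user`, and `shell_argument` wraps it in single quotes, which has the same effect as the backslash before the `!` in the command above.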
Re: Abusing the sync protocol for fun and profit.
On Thu, 21 Feb 2008 09:20:34 -0600, "Dan White" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> > Attached are three perl modules,
> >
> > Cyrus/SyncClient.pm
> > Cyrus/ImapReplica.pm
> > Mail/IMAPTalk.pm
> >
> > I'm including this copy of Mail::IMAPTalk because without it, the clever
> > 'literal' stuff doesn't work properly. I'll prod Rob to clean it up and
> > re-package it and push it to CPAN so I can depend on that version and
> > have things all be happier.
>
> Thanks Bron,
>
> This works great for me. I'm able to synchronize between my old
> 2.1.17 server, with a perdition proxy frontend, to my newer
> 2.3.10 server.

Excellent, that's what I like to hear :)

> I had a hiccup in the SyncClient.pm module during DIGEST-MD5
> authentication.
>
> I changed to PLAIN, using the following changes, to get it working:

Wow, that wouldn't work for us at all. I did have to put -p 1 on the
syncserver command line in cyrus.conf before it would let me
authenticate at all, and nothing but DIGEST-MD5 worked for me.

Also, Authen::SASL::Cyrus worked fine, but then the connection was
encrypted and I had to try and pipe all the IO through it as well,
which I couldn't be bothered making work nicely.

> [diff]

Thanks for that. I probably should make it try both in order or
something funky like that. Maybe an "auth_digestmd5" and an
"auth_plain" function which are tried in that order.

Bron.

--
Bron Gondwana
[EMAIL PROTECTED]
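The "try both in order" idea at the end of that reply might look something like the following. This is a generic sketch in Python rather than a patch to the Perl module: `authenticate` and `AuthError` are hypothetical names, and the per-mechanism functions are passed in by the caller instead of being real SASL implementations.

```python
from typing import Callable, Iterable, Tuple


class AuthError(Exception):
    """Raised by a mechanism that fails, or when every mechanism fails."""


def authenticate(mechanisms: Iterable[Tuple[str, Callable[[], None]]]) -> str:
    """Try each (name, function) pair in order; return the name that worked."""
    failures = []
    for name, attempt in mechanisms:
        try:
            attempt()
            return name
        except AuthError as exc:
            # Remember why this mechanism failed and fall through to the next.
            failures.append(f"{name}: {exc}")
    raise AuthError("all mechanisms failed: " + "; ".join(failures))
```

A caller would pass something like `[("DIGEST-MD5", auth_digestmd5), ("PLAIN", auth_plain)]`, so a server that only accepts PLAIN still works without editing the module.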
Re: Again: GUID change to SHA1
On Mon, Feb 25, 2008 at 05:52:34PM +0100, Martin Schweizer wrote:
> Hello
>
> 2008/2/21, Bron Gondwana <[EMAIL PROTECTED]>:
> > On Thu, Feb 21, 2008 at 05:11:48PM +0100, Martin Schweizer wrote:
> > > Hello
> > >
> > > I use FreeBSD 6.3 and cyrus 2.3.11. Below is the manual for the change.
> > >
> > > Upgrading from 2.3.9
> > >
> > > * The method used for generating Globally Unique IDentifiers used
> > >   for replication has been changed to be the SHA1 hash of the
> > >   messages. If you wish to upgrade the existing GUIDs in particular
> > >   mailbox(es) or the entire server, perform the following steps in
> > >   the listed order. Note that it is NOT REQUIRED that existing
> > >   GUIDs be upgraded.
> > >     1. Zero GUIDs on the replica (reconstruct -g)
> > >     2. Regenerate GUIDs on the master (reconstruct -G)
> > >     3. Regenerate GUIDs on the replica (reconstruct -G)
> > >
> > > Which is the master and which is the replica? Server 1 or Server 2?
> > >
> > > [...]
> > > Server 1:
> > > syncclient cmd="/usr/local/cyrus/bin/sync_client -r"
> >
> > Master
> >
> > > Server 2:
> > > syncserver cmd="/usr/local/cyrus/bin/sync_server" listen="csync" prefork=0
> >
> > Replica
>
> Thanks for the hint. I was not sure about the terms.
>
> On the master, after the changes (using reconstruct...), I get the
> following output:
>
> grep sync /var/log/debug.log
>
> Feb 22 09:09:03 acsvfbsd02 sync_client[73967]: DIGEST-MD5 client step 1
> Feb 22 09:09:03 acsvfbsd02 sync_client[73967]: DIGEST-MD5 client step 2
> Feb 22 09:09:03 acsvfbsd02 sync_client[73967]: DIGEST-MD5 client step 3
> Feb 22 09:15:09 acsvfbsd02 sync_client[73967]: seen_db: user astomas opened /var/imap/user/a/astomas.seen
> Feb 22 09:19:04 acsvfbsd02 sync_client[74023]: DIGEST-MD5 client step 1
> Feb 22 09:19:04 acsvfbsd02 sync_client[74023]: DIGEST-MD5 client step 2
> Feb 22 09:19:04 acsvfbsd02 sync_client[74023]: DIGEST-MD5 client step 3
>
> Is that correct? I also got this before I made the changes.

Sorry about the delay in replying. That just looks like debugging info.
We don't get it, so I wonder if you have a debugging level turned on
that we don't.

The important question is: are your messages being replicated?

Bron.
Re: Fwd: Again: GUID change to SHA1
On Wed, 5 Mar 2008 07:01:13 +0100, "Martin Schweizer" <[EMAIL PROTECTED]> said:
> (Sorry, again, but I have not had an answer so far)

Oops, I thought I had replied...

> > Sorry about the delay in replying. That just looks like debugging info.
> > We don't get it, so I wonder if you have a debugging level turned on
> > that we don't.
>
> Here is my imapd.conf from the "client":
> [...]
> sasl_log_level: 7

I'd say that's what is causing it. Any reason you have this turned up
so far?

> > The important question is: are your messages being replicated?
>
> Yes, the messages are replicated all the time.

Cool - I'd say you're fine then. Nothing to worry about!

Bron.

--
Bron Gondwana
[EMAIL PROTECTED]
Re: delayed expunge/delete with replication?
On Mon, Mar 10, 2008 at 07:53:52AM -0700, David R Bosso wrote:
> Hello,
>
> I've got a few questions about how delayed expunge works with replication
> in 2.3.11.
>
> 1. If I want delayed expunge & delete on both the replica and master
> (replicated delayed expunge & delete), do I need them turned on in both
> master and replica imapd.conf, or is it sufficient that they be on in just
> the master?
>
> 2. Is it sufficient to run cyr_expire (-D -X) on the master - will it be
> replicated, or does it need to be run on the replica also?

Both need to be done on the replica as well. The replication protocol
doesn't know about expunged messages at all. This is a source of
occasional frustration to me, because it means if you do anything
remotely funky (like a restore) on the master, then the replica will
have the same record in both the .index and .expunge files and it all
goes to pot.

But finding the time to rewrite the protocol to support both .index
and .expunge contents being streamed to the replica sounds an awful
lot like real work, so we haven't done anything about it.

Bron ( but if you're using delete-rename for folders, turn that off on
       the replica. The replica will just get a "rename" event from
       the master and it's all good )
Re: Migrate all to skiplist?
On Wed, Mar 12, 2008 at 12:02:30PM -0400, Shelley Waltz wrote:
>
> I am migrating my 200 users from a RHAS3 cyrus-imapd-2.2.3 install to a
> RHAS5 cyrus-imapd-2.3.7 install.
>
> I have been happy with the current setup with the exception of issues with
> the Berkeley DB for mailboxes and deliver. Recovering a corrupt
> mailboxes.db has been extremely slow, taking on the order of 6hrs to
> recover from the flat file dump. On occasion I have had to delete the
> deliver.db and restart.
>
> Reading through many posts, is there any reason to not use skiplist for
> all the databases? Although I have 200 users, at any given time, only
> half are actively using their account. Our traffic is light for the most
> part.
>
> The 2.3.7 defaults as listed in "man imapd.conf" seem to indicate
> that berkeley-nosync is used for duplicate and tlscache and berkeley for
> ptscache(?) Flatfile for subscription and quotalegacy for quota.
>
> This will be a single server with a single replica. Are there any issues
> with not using skiplist for all under this setup?

I wouldn't use skiplists in 2.3.7 on general principles. The code was
chock full of bugs for ages (ask the Kolab guys about their experience
with it).

Almost all the fixes got into 2.3.11, though I would also recommend
applying:

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-safelock-2.3.11.diff
http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-state-2.3.11.diff

and

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-transactions-2.3.10.diff

on top of 2.3.11. With these, I have been unable to crash skiplists
with all my nasty tests any more.

Bron ( somewhat an expert on that module of the Cyrus code now - I've
       spent a lot of time reading it! )
Re: Is there any way to log/see protocol level commands for a mailbox which is not under user.* ?
On Thu, Mar 13, 2008 at 11:57:31AM +0100, Ciprian Marius Vizitiu wrote:
> Hi listers,
>
> It's written in the manuals that by creating a folder under
> /var/lib/imap/log/username Cyrus will log at protocol level details for
> "username". Question: how can I do the same for a mailbox which is above
> user.* level? Of course I could enable logging for all users =:-o and then
> try to correlate those logs but I thought I should ask.
>
> Background: on a perfectly functioning Cyrus IMAP some of my users are...
> abusing one common IMAP folder in subtle ways so I just want to be able to
> catch the offenders that's why I'm only interested in the "COPY" and/or
> "APPEND" commands.

My "Auditlog" patches would probably be pretty handy for that! You
would get log entries for every append and copy, associated via
sessionid to a login event.

You'll need both:

cyrus-sessionid-2.3.11.diff and cyrus-auditlog-2.3.11.diff

from http://cyrus.brong.fastmail.fm/ (and you'll have to build cyrus
from source or a custom package for your packaging system as
appropriate).

You also need to add 'auditlog: yes' to the imapd.conf.

Bron.
Re: reconstruct doing nothing
On Sat, 22 Mar 2008 11:29:36 +0100, "Alain Spineux" <[EMAIL PROTECTED]> said:
> On Fri, Mar 21, 2008 at 8:59 PM, Rudy Gevaert <[EMAIL PROTECTED]> wrote:
> > Gabor Gombas wrote:
> > > On Fri, Mar 21, 2008 at 04:57:18PM +0100, Bart Coninckx wrote:
> > >
> > >> Gabor, is this patch relevant when I do get a result onscreen from
> > >> "reconstruct"?
> > >
> > > Without the patch, "reconstruct -r" processes only the exact mailbox
> > > given on the command line but does not descend into subfolders, like
> > > when the "-r" switch was not given at all. At least that's the case with
> > > my configuration.
> >
> > Some time ago I noticed the same, but some time after that it did
> > recurse. Anyway, doing reconstruct -rfx user/first.lastname/[EMAIL PROTECTED]
> > reconstructs the sub folders too.
>
> I think '*' and '%' are used as wildcards in the list of known
> mailboxes, I mean the mailboxes.db
>
> Then
>
> reconstruct user/first.lastname/[EMAIL PROTECTED]   # without -r
>
> displays all known sub folders of mailbox [EMAIL PROTECTED] except
> the Inbox itself.
>
> So to repair an inbox and all its folders, 2 commands are required!
>
> reconstruct user/[EMAIL PROTECTED]
> reconstruct user/first.lastname/[EMAIL PROTECTED]
>
> I suppose
>
> reconstruct -r user/[EMAIL PROTECTED]   # without * but with -r
>
> SHOULD do the same. But it is not working for me. The -f doesn't help
> any more if the mailbox is already known!
>
> If I copy my Sent folder into a new Foo folder (creating a new
> mailbox without telling cyrus), then use
>
> reconstruct -f user/[EMAIL PROTECTED]   # without -r (works the same with -r)
>
> it displays
>
> discovered domain.com!first.last.Foo
>
> Conclusion
>
> - -r looks to be useless
> - -f discovers as yet unknown folders, recursively too, but only inside
>   new folders, not if already known; use * for a full discovery in two
>   passes: user/[EMAIL PROTECTED] and user/first.lastname/[EMAIL PROTECTED]
> - '*' and '%' allow walking around the mailbox tree, but only inside
>   already known folders
>
> This was tested on a 2.3.11

Try this:

reconstruct -r 'domain.com!user/first.lastname'

(yay internal representations leaking)

--
Bron Gondwana
[EMAIL PROTECTED]
Re: ctl_mboxlist virtual memory exhausted !
On Tue, 25 Mar 2008 09:31:40 +0100, "Brasseur Valery" <[EMAIL PROTECTED]> said:
> Hi,
>
> I am running 2.3.11, with 2 million users (4M mailboxes ;-)
>
> When trying to do a ctl_mboxlist -m, after some time (a few seconds!) I
> got a "virtual memory exhausted", and I can see that the process is
> allocating more than 3GB of memory!

Ouch. That hurts.

> Have any of you encountered this?
> Any way to work around it?

We split our Cyrus instance up into 300GB data partitions. We
currently have 56 stores (112 partitions thanks to replication).
Obviously you need infrastructure to manage this, and some form of
frontend proxy to direct user logins to the correct store (we use
nginx). Further, any users who need to share mailboxes must be on the
same store.

Still, things are a lot faster when your average mailboxes.db is only
4.5MB in size (having just checked the one for the store my mailbox is
on).

> I also got a lot of skiplist corruption when the file size is around
> 700MB for mailboxes.db, and the mupdate process hits 100% CPU when
> that happens!!!
> Any ideas are welcome!

Are you using:

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-safelock-2.3.11.diff
http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-state-2.3.11.diff
http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-transactions-2.3.10.diff

Finally, are you running a 32 bit operating system? With a 700MB
mailboxes.db being mmaped, you might be pushing close to the available
process memory space. Running a 64 bit kernel would probably help a
lot there (you will of course need to have 64 bit hardware!)

I would seriously recommend against having 2 million users all on one
machine on disaster recovery principles - it takes far too long to
copy that much data onto modern drives, so if you lose your drive unit
then getting users back up and running looks like about a week of
sitting there watching data copy. Yes, I have done that before. That's
why we now run partitions that can rebuild from scratch in 6 hours.

Bron.

--
Bron Gondwana
[EMAIL PROTECTED]
Re: Sieve forwarding loop destroys e-mail
On Mon, Mar 31, 2008 at 04:21:20PM +0200, Alain Spineux wrote:
> On Mon, Mar 31, 2008 at 2:40 PM, Joseph Brennan <[EMAIL PROTECTED]> wrote:
> >
> > Jo Rhett <[EMAIL PROTECTED]> wrote:
> >
> > > I would ask that you spend some time determining how the
> > > program could determine it is a bad rule, and provide a patch to fix this
> > > behavior. (in short -- it's harder than you think)
> >
> > A mail delivery system that loses mail is buggy. I don't need to look
> > at the code to know that.
> >
> > You can tell me no one has time to fix it, and in an open source project
> > I can respect that. But it is a bug.
>
> Look at this:
>
> If my script is
>
> redirect [EMAIL PROTECTED]
>
> I expect my mailbox to stay empty, because this is what redirect is
> supposed to do!
> If I find an email in my mailbox, this is a BUG, because the script I wrote
> should never let an email come in!

I know, I know - pick me. How about this one?

discard;

It turns out that a mail delivery system that has been configured in a
way that loses mail has a bug _in_the_person_who_configured_it_. Now
it may be that the language makes it easy to shoot yourself in the
foot, but that's different from being buggy.

Bron.
Re: Sieve forwarding loop destroys e-mail
On Mon, 31 Mar 2008 15:51:17 -0700 (PDT), "Andrew Morgan" <[EMAIL PROTECTED]> said:
> On Tue, 1 Apr 2008, Bron Gondwana wrote:
>
> > On Mon, Mar 31, 2008 at 04:21:20PM +0200, Alain Spineux wrote:
> >> On Mon, Mar 31, 2008 at 2:40 PM, Joseph Brennan <[EMAIL PROTECTED]> wrote:
> >>>
> >>> Jo Rhett <[EMAIL PROTECTED]> wrote:
> >>>
> >>> > I would ask that you spend some time determining how the
> >>> > program could determine it is a bad rule, and provide a patch to fix
> >>> > this behavior. (in short -- it's harder than you think)
> >>>
> >>> A mail delivery system that loses mail is buggy. I don't need to look
> >>> at the code to know that.
> >>>
> >>> You can tell me no one has time to fix it, and in an open source project
> >>> I can respect that. But it is a bug.
> >>
> >> Look at this:
> >>
> >> If my script is
> >>
> >> redirect [EMAIL PROTECTED]
> >>
> >> I expect my mailbox to stay empty, because this is what redirect is
> >> supposed to do!
> >> If I found and email in my mailbox this is a BUG, because the script I
> >> wrote should never let an email come in!
> >
> > I know, I know - pick me. How about this one?
> >
> > discard;
> >
> > It turns out that a mail delivery system that has been configured in a
> > way that loses mail has a bug _in_the_person_who_configured_it_. Now
> > it may be that the language makes it easy to shoot yourself in the foot,
> > but that's different from being buggy.
>
> Just for reference - we provide a web interface (custom, we wrote it) that
> provides the features most people want to configure in their sieve rules
> such as email forwarding, filtering based on From/To address, vacation
> messages, and spam blocking. Of course, they have no idea it is actually
> sieve behind the scenes. They just point and click the web interface.
>
> This web interface has sanity checks to prevent people from doing silly
> things like forwarding mail to themselves or the other common email
> aliases on their accounts.
>
> We also offer direct sieveshell access for users that ask if they can do
> more than the web interface offers. If these "smart" users shoot
> themselves in the foot, oh well.

Sounds remarkably like what we have, except we don't provide a timsieved that listens to the world - people have to paste their sieve scripts into a web interface that does syntax tests before uploading them.

That's mainly for proxying reasons: we don't have anything set up that can proxy the sieve protocol, and we don't allow direct connections to our backend servers.

Yeah, sieve is a weird language in some ways, but it mostly gets the job done and it's the "cyrus way". We could probably get much the same by delivering to plus addresses from our perl lmtp proxy, but why re-design the wheel?

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
Re: Sync_client error Hit upload limit 0
On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
> Dear List
>
> I get the following type of error (see below) during replication that
> appeared after upgrading from 2.3.8 to 2.3.11.
>
> This occurs occasionally and yet the emails are synced and it occurs for
> various user accounts.
>
> I have noticed that this error only occurs when an account is getting a
> burst of emails or when several users are getting what seems to be the
> same spam email.
>
> Is there a timing error?
>
> I have a rolling replication set to a delay of 5 seconds - should I change
> the interval?

Wow, there shouldn't be a limit of 0.

    unsigned max_count = config_getint(IMAPOPT_SYNC_BATCH_SIZE);

    if (max_count <= 0) max_count = INT_MAX;

Well, that's somewhat bogus anyway, since it's unsigned. May as well be == 0. But still - I can't see how it could become zero!

    syslog(LOG_NOTICE,
           "Hit upload limit %d at UID %lu for %s, sending",
           max_count, index_list->last_uid, mailbox->name);

BAH - upload_messages_from() is broken.

Will reply shortly with a patch,

Bron.
Re: Sync_client error Hit upload limit 0
On Wed, Apr 02, 2008 at 09:51:50PM +1100, Bron Gondwana wrote:
> On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
> > I get the following type of error (see below) during replication that
> > appeared after upgrading from 2.3.8 to 2.3.11.
>
> BAH - upload_messages_from() is broken.
>
> Will reply shortly with a patch,

Ken - CCing you on this one since you'll want this for CVS. I have compile tested this - haven't rolled it out anywhere, but it's pretty trivial.

Regards,

Bron.

Index: cyrus-imapd-2.3.11/imap/sync_client.c
===================================================================
--- cyrus-imapd-2.3.11.orig/imap/sync_client.c 2008-04-02 10:56:52.0 +
+++ cyrus-imapd-2.3.11/imap/sync_client.c 2008-04-02 10:57:56.0 +
@@ -1358,7 +1358,7 @@ static int upload_messages_list(struct m
     struct sync_index_list *index_list;
     unsigned max_count = config_getint(IMAPOPT_SYNC_BATCH_SIZE);
 
-    if (max_count <= 0) max_count = INT_MAX;
+    if (!max_count) max_count = INT_MAX;
 
     if (chdir(mailbox->path)) {
         syslog(LOG_ERR, "Couldn't chdir to %s: %s",
@@ -1432,6 +1432,8 @@ static int upload_messages_from(struct m
     struct sync_index_list *index_list;
     unsigned max_count = config_getint(IMAPOPT_SYNC_BATCH_SIZE);
 
+    if (!max_count) max_count = INT_MAX;
+
     if (chdir(mailbox->path)) {
         syslog(LOG_ERR, "Couldn't chdir to %s: %s",
                mailbox->path, strerror(errno));
Re: Sync_client error Hit upload limit 0
On Wed, Apr 02, 2008 at 07:19:07AM -0400, Ken Murchison wrote:
> Bron Gondwana wrote:
>> On Wed, Apr 02, 2008 at 09:51:50PM +1100, Bron Gondwana wrote:
>>> On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
>>>> I get the following type of error (see below) during replication that
>>>> appeared after upgrading from 2.3.8 to 2.3.11.
>>> BAH - upload_messages_from() is broken.
>>>
>>> Will reply shortly with a patch,
>> Ken - CCing you on this one since you'll want this for CVS.
>> I have compile tested this - haven't rolled it out anywhere,
>> but it's pretty trivial.
>
> I understand the missing check for max_count's value, but is there a reason
> why you're not checking that its negative?

unsigned
Re: Sync_client error Hit upload limit 0
On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
> Apr 2 10:31:14 brooks sync_client[23049]: Hit upload limit 0 at UID
> 101775 for user.XXX, sending

Btw - you can get the same effect as applying the patch by putting the following in your imapd.conf:

sync_batch_size: 4294967295

Yeah, that magic number is (2^32 - 1), which is the largest 32-bit unsigned value rather than INT_MAX (2^31 - 1, the value the patch actually uses). Either way the effect is the same as the patch: no practical limit.

(or you can set it somewhere middling. We set it to 1000)

Regards,

Bron.
Re: Sync_client error Hit upload limit 0
On Wed, Apr 02, 2008 at 07:46:33AM -0400, Ken Murchison wrote:
> Bron Gondwana wrote:
>> On Wed, Apr 02, 2008 at 07:19:07AM -0400, Ken Murchison wrote:
>>> Bron Gondwana wrote:
>>>> On Wed, Apr 02, 2008 at 09:51:50PM +1100, Bron Gondwana wrote:
>>>>> On Wed, Apr 02, 2008 at 11:28:07AM +1030, Stephen Carr wrote:
>>>>>> I get the following type of error (see below) during replication that
>>>>>> appeared after upgrading from 2.3.8 to 2.3.11.
>>>>> BAH - upload_messages_from() is broken.
>>>>>
>>>>> Will reply shortly with a patch,
>>>> Ken - CCing you on this one since you'll want this for CVS.
>>>> I have compile tested this - haven't rolled it out anywhere,
>>>> but it's pretty trivial.
>>> I understand the missing check for max_count's value, but is there a
>>> reason why you're not checking that its negative?
>> unsigned
>
> Ah, right. Since config_getint() returns signed, we should probably make
> max_count signed

Sounds reasonable - then I'm happy with <= 0 :)

Bron.
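The signed/unsigned point in this thread can be sketched in isolation. This is only an illustration, not Cyrus code: `batch_limit_unsigned` and `batch_limit_signed` are hypothetical helpers, and the `configured` argument stands in for whatever `config_getint(IMAPOPT_SYNC_BATCH_SIZE)` returns.

```c
#include <limits.h>

/* Hypothetical helper mirroring the patched check with an unsigned
 * max_count. A negative configured value wraps to a huge positive
 * number, so only 0 can be treated as "no limit set". */
unsigned batch_limit_unsigned(int configured)
{
    unsigned max_count = (unsigned)configured;

    if (!max_count) max_count = INT_MAX;
    return max_count;
}

/* Hypothetical signed variant, as suggested in the reply above:
 * with a signed max_count the "<= 0" test is meaningful again. */
int batch_limit_signed(int configured)
{
    int max_count = configured;

    if (max_count <= 0) max_count = INT_MAX;
    return max_count;
}
```

With the unsigned version a configured value of -1 becomes UINT_MAX, which happens to behave as "unlimited" too; the signed version makes that intent explicit.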
Re: Sync_client error Hit upload limit 0
On Thu, 3 Apr 2008 13:11:22 +1030 (CST), "Stephen Carr" <[EMAIL PROTECTED]> said:
> Dear Ken
>
> I killed master and restated it but forgot to kill sync_client so it may
> have been running the unpatched version.

Sounds viable. If you're starting sync_client from outside your cyrus instance then you need to manage it with your init script as well. We actually do this with a wrapper tool, but the underlying concept is there.

> I have manually restarted sync_client and after 30+ minutes had no
> "errors" and in that period I know one user got 8 emails at almost the
> same time.
>
> It also seems the sync_client in 2.3.11 is more stable than in 2.3.8.
>
> I have a cronjob to restart the sync_client if it "died" which usually
> happened 2 or 3 times a day, the new version has not had to be restarted
> except to get the patched version to run.

Yeah, I'm not surprised. A bunch of us have spent a fair bit of time on tracking down the bugs in replication and squashing them one-by-one! We (FastMail) only see bailouts when there's something really bogus like a corrupt cache file (or when a replica gets shut down with the sync still running of course).

Glad you seem sorted :)

Bron ( been off Cyrus patching for a while now... was time to take a break and work on something else )
--
Bron Gondwana [EMAIL PROTECTED]
Patch to avoid mailboxesdb corruption on concurrent renames
I've put a header on the patch describing the bug.

Basically, the result code from mailbox_open_locked() wasn't being tested sufficiently, and hence the new mailbox name would be created in mailboxes.db even though the files were no longer available to be copied - causing sync bailouts and IOERRORs and all sorts of fun.

What causes this? Clients that send a RENAME or DELETE and then, when you do something else in the GUI, open a _separate_ connection which tries to do something else to the source folder.

I think it is our only remaining source of bailouts! It requires a reconstruct to fix when it happens.

Patch attached, along with the perl script I used to confirm that the bug existed and that this patch fixed it.

Regards,

Bron ( P.S. isn't it about time for a 2.3.12? I'm getting sick of posting skiplist patches to people running the latest and having issues! )

#!/usr/bin/perl -w

use IO::Socket::INET;

$| = 1;

our $ADDR = '127.0.0.1:143';
our $USER = '[EMAIL PROTECTED]';
our $PASS = 'FIXME';

my $source = prepare_source();

# fork three children that all try to rename the same source folder
my %h;
foreach my $n (1..3) {
    my $pid = fork();
    unless ($pid) {
        my $res = do_rename($source, $n);
        print "RES: $n => $res\n";
        exit();
    }
    $h{$pid} = 1;
}

foreach my $pid (keys %h) {
    waitpid($pid, 0);
}

dump_folders();

sub prepare_source {
    my $sourcename = "INBOX.source$$";
    my $fh = IO::Socket::INET->new($ADDR);
    $fh->getline();
    $fh->print(". login $USER $PASS\r\n");
    $fh->getline();
    $fh->print(". delete $sourcename\r\n"); # just in case!
    $fh->getline();
    $fh->print(". create $sourcename\r\n");
    $fh->getline();
    print "$sourcename: ";
    foreach my $n (1..1) {
        my $msg = gen_msg($n);
        my $len = length($msg);
        $fh->print(". append $sourcename {$len}\r\n");
        $fh->print("$msg\r\n");
        print "." unless $n % 50;
    }
    print " done\n";
    $fh->print(" . bye\r\n");
    $fh->getline();
    return $sourcename;
}

sub do_rename {
    my $sourcename = shift;
    my $n = shift;
    my $fh = IO::Socket::INET->new($ADDR);
    $fh->getline();
    $fh->print(". login $USER $PASS\r\n");
    $fh->getline();
    $fh->print(". delete $sourcename-dest$n\r\n"); # just in case!
    $fh->getline();
    $fh->print(". rename $sourcename $sourcename-dest$n\r\n");
    my $res = $fh->getline();
    $fh->print(" . bye\r\n");
    $fh->getline();
    return $res;
}

sub gen_msg {
    my $n = shift;
    return qq{Subject: Message $n
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>

This is a test message made long by shoving a whole pile of lines on it!
} . ("$n\n" x 10);
}

sub dump_folders {
    my $fh = IO::Socket::INET->new($ADDR);
    $fh->getline();
    $fh->print(". login $USER $PASS\r\n");
    $fh->getline();
    $fh->print("TAG1 list \"INBOX.*\" \"*\"\r\n");
    $fh->getline();
    while (my $res = $fh->getline()) {
        last if $res =~ m{^TAG1 };
        print $res;
    }
    $fh->print(". bye\r\n");
    $fh->getline();
}

Handle concurrent mailbox renames safely

*CANDIDATE FOR UPSTREAM*

About the last cause of sync bailouts at FastMail has been the case where a user starts renaming a folder, and then does something else (including deletes with renames here, since we have the deleterename option enabled).

This was caused by the mailbox_rename_copy function not checking the return code of mailbox_open_locked properly (probably due to it being a huge function with multiple different styles of using the if (!r) error checking paradigm in different places, but my "can be bothered" on major refactors is low at the moment).

The side effect of this was that if the mailbox existed going in (copy was still underway) then the old mailbox would get deleted (noop, or corruption if you don't have all the skiplist patches applied!) and the new mailbox name created but with no files in place!

This patch checks the return code of mailbox_open_locked and aborts with an IO error if the mailbox has already been locked.

Index: cyrus-imapd-2.3.11/imap/mboxlist.c
===================================================================
--- cyrus-imapd-2.3.11.orig/imap/mboxlist.c 2008-04-04 00:59:36.0 +
+++ cyrus-imapd-2.3.11/imap/mboxlist.c 2008-04-04 02:47:46.0 +
@@ -1339,11 +1339,15 @@ int mboxlist_renamemailbox(char *oldname
     if(!r) {
         r = mailbox_open_locked(oldname, oldpath, oldmpath, oldacl, auth_state,
                                 &oldmailbox, 0);
-        oldopen = 1;
+        if (r) {
+            goto done;
+        } else {
+            oldopen = 1;
+        }
     }
 
     /* 6. Copy mailbox */
-    if (!r && !(mbtype & MBTYPE_REMOTE)) {
+    if (!(mbtype & MBTYPE_REMOTE)) {
         /* Rename the actual mailbox */
         r = mailbox_rename_copy(&oldmailbox, newname, newpartition,
                                 NULL, NULL, &newmailbox,
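The shape of the fix in the diff above can be sketched on its own. This is a toy illustration, not the real mboxlist.c: `open_locked_stub` and `rename_sketch` are hypothetical stand-ins, and the only point is that the "old mailbox is open" flag gets set after the open has succeeded, with an early exit otherwise.

```c
/* Hypothetical stand-in for mailbox_open_locked(): nonzero means the
 * open failed, e.g. because another connection holds the lock. */
int open_locked_stub(int already_locked)
{
    return already_locked ? -1 : 0;
}

/* Sketch of the patched control flow. Returns 0 on success, nonzero
 * on error. *oldopen_out reports whether the old mailbox counts as
 * open; *copied reports whether the (stubbed) copy step ran. */
int rename_sketch(int already_locked, int *oldopen_out, int *copied)
{
    int r;
    int oldopen = 0;

    *copied = 0;

    r = open_locked_stub(already_locked);
    if (r) goto done;   /* bail out before touching anything else */
    oldopen = 1;        /* the unpatched code set this regardless of r */

    *copied = 1;        /* stands in for mailbox_rename_copy() */

done:
    *oldopen_out = oldopen;
    return r;
}
```

A failed open now leaves both flags at 0, so the cleanup path has nothing to close and nothing was created under the new name.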
Re: Patch to avoid mailboxesdb corruption on concurrent renames
On Fri, Apr 04, 2008 at 07:10:27AM -0400, Ken Murchison wrote:
> Bron Gondwana wrote:
>
>> Bron ( P.S. isn't it about time for a 2.3.12? I'm getting sick
>> of posting skiplist patches to people running the
>> latest and having issues! )
>
> Yes, it probably is. Perhaps I'll make a pre-release today while I troll
> bugzilla for any showstoppers.

That would be nice. Also, I'll check it against our patch list and see if there's anything bugfixy in there that I haven't pushed to you yet.

By the way - would you prefer me to push things through Bugzilla? So far I've just been posting patches to the mailing list, but I'm happy to use whatever process is easiest for you.

Bron.
Re: cyr_expire -E ?
On Sat, Apr 19, 2008 at 09:19:40AM +0100, David Carter wrote:
> On Fri, 18 Apr 2008, David R Bosso wrote:
>
> > I don't specify a -X, I just want to prune the duplicate db. What am I
> > doing wrong?
>
> -X expunge-days
>     Expunge previously deleted messages older than expunge-days
>     (when using the "delayed" expunge mode). The default is 0
>     (zero) days, which will expunge all previously deleted messages.
>
> Try -X . cyr_expire is a bit overloaded.

That's something from upstream rather than from the FastMail patches. I can see that it's unwanted behaviour, but by the same token I accept the logic behind it:

a) don't break current installations (you can't require a -X parameter)

b) have as similar behaviour to the current as possible (never deleting mail by default would fill up people's drives with deleted spams, besides which the performance hit would suck over time, having those huge cyrus.expunge files sitting around)

c) there is no (c)

d) oh yeah, adding a "cyr_expire_expunge_default_keep_forever = yes" flag to cyrus.index is ugly and extra complex and not that much different from -X $INTMAX anyway.

Or something like that,

Bron.
Re: cyr_expire -E ?
On Sun, 20 Apr 2008 07:26:54 -0400, "Steve Huston" <[EMAIL PROTECTED]> said:
> On 4/19/08 8:46 AM, Blake Hudson wrote:
> > I haven't looked at the source, but couldn't another flag be added that
> > would mimic the old behavior of only pruning the duplicate db? I would
> > assume that would be cleaner/faster than the proposed -X $INTMAX method
> > that would have to compare a bunch of timestamps...
>
> Doesn't that exist in the form of "expunge_mode: immediate" in
> /etc/imapd.conf? I never had an -X flag in my cyr_expire commandline
> until recently when I set expunge_mode to delayed and added the flag
> myself.

cyr_expire_expunge_mode: immediate

hmm - that looks viable to me! Feel like testing it? Basically it tells cyr_expire that you have immediate expunge, so don't bother expunging, but everything else will do delayed delete.

Don't come crying when your mailboxes melt under the load of huge spam databases whenever you delete a message though... actually, it's not so bad since cyrus.expunge doesn't get sorted every time, the records just get appended.

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
FastMail.FM patchset updated against 2.3.12
On Mon, Apr 21, 2008 at 07:35:01AM -0400, Ken Murchison wrote:
> I am pleased to announce the release of Cyrus IMAPd 2.3.12. This
> release should be considered production quality.

http://cyrus.brong.fastmail.fm/

Notable changes:

* cyrus-findall-txn-2.3.12.diff

  Creates a mboxlist_findall_txn function which takes a transaction, allowing you to use it within a transaction. This is a little "shutting the barn door" now that we have skiplist fixes for the same issue, but I think it's still valuable from a correctness point of view.

* cyrus-fastrename-2.3.12.diff

  Update the fastrename patch to use mboxlist_findall_txn when searching for inferiors and throughout all the rename and delete paths for mailboxes.

* cyrus-folder-limit-2.3.12.diff

  Basically just updates the folder limit patch to use the new API from the previous two patches, passing the transaction down.

Other than this, it's basically the same old patches refreshed against 2.3.12, and of course everything that's been accepted upstream is removed from our series now.

Testing status: I've built and run this on our testbed machine, including a rename test harness, but it hasn't been run on production machines yet.

Bron.
Re: Cyrus IMAPd 2.3.12 Released
On Wed, 23 Apr 2008 10:33:27 +0200, "Thomas Robers - TuTech Innovation GmbH" <[EMAIL PROTECTED]> said:
> Ken Murchison schrieb:
> > I am pleased to announce the release of Cyrus IMAPd 2.3.12. This
> > release should be considered production quality.
>
> I'm getting a segmentation fault when I try to start the master process.
> Compiling went fine on Debian etch (2.6.18-5-xen-amd64) with gcc version
> (GCC) 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)
>
> [...]
>
> open("/var/imap/imapd.conf", O_RDONLY) = 3
> fstat(3, {st_mode=S_IFREG|0640, st_size=1366, ...}) = 0
> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
> = 0x2b6c1cebe000
> read(3, "# (21.02.2008)\n## directori"..., 4096) = 1366
> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
> +++ killed by SIGSEGV +++
> Process 26096 detached

This would be even more useful:

    % gdb /opt/imap/libexec/master
    > run -d
    [ wait for it to segfault ]
    > bt

> Compiling and running version 2.3.11 with the same options on the same
> machine works fine.

Are you able to post your /var/imap/imapd.conf file? I'm guessing there's something in there which is causing the segfault.

Regards,

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
Re: Cyrus IMAPd 2.3.12 Released
On Wed, Apr 23, 2008 at 05:58:27PM +0400, Dmitriy Kirhlarov wrote:
> Sebastian Hagedorn wrote:
> > --On 23. April 2008 15:37:19 +0400 Dmitriy Kirhlarov <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Attached patch add to log information about moving messages between
> >> folders. I am using this information from logs for relaunch dspam.
> >> Any chances for add this patch to project tree?
> >
> > FWIW, logging this at LOG_ERR level certainly isn't the right way to do
> > that ... I'd say it should be INFO at best, if not DEBUG.
>
> And with this correction, patch can be included to cyrus imapd repo?

Have you looked at:

http://cyrus.brong.fastmail.fm/patches/cyrus-auditlog-2.3.11.diff

It's a very detailed logging system which logs all create, delete, append, copy, expunge, unlink, etc events - anything which changes a mailbox or message (but not metadata events like flag changes at the moment). It also logs noteworthy sieve events.

It logs everything at LOG_NOTICE.

If there are other users for it, I'm happy to put some effort into making auditlog acceptable for upstream, and possibly generalising it to allow logging of different classes of events.

We use it to populate a database which is linked with events from the various login systems and the email tracking information from our Postfix and lmtp setups - it's not finished yet, but the plan is to be able to track the entire lifecycle of every message we receive (probably only for a few weeks to save space!) so users can see what's happening with their emails.

Regards,

Bron.
Re: Cyrus IMAPd 2.3.12 Released
On Wed, Apr 23, 2008 at 03:37:19PM +0400, Dmitriy Kirhlarov wrote:
> If dspam miss, user can manually move message from|to "spam" folder. This
> fact fixed in cyrus log file. simple script parsing log and relaunch dspam.

> +syslog(LOG_ERR, "DSPAM-Hack index_copysetup(): %s -> %s, hdr %s",
> mailbox->name,
> +copyargs->name, index_getheader(mailbox, msgno,
> "X-DSPAM-Signature"));

Wow - that's pretty tricky. I see you're actually logging a specific header as well. Funky. We don't have anything like that in our auditlog patch.

I'd still suggest that this should be a generic mechanism rather than a hard-coded header if it's going in upstream. Something like

auditlog_headers: X-DSPAM-Signature X-Something-Else

which would cause a log line:

auditlog: copy oldmailbox= mailbox= olduid=<1234> uid=<7> guid=<478920478932fabed74398943243> x_dspam_signature= x_something_else=

Of course there would need to be quoting support since these headers could contain <>. Oh the humanity. My favourite is URI encoding because it's really quite simple to parse, but I'm sure everyone has their favourite.

What do you think?

Bron.
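The URI-style quoting suggested above could look roughly like the following. This is a sketch under assumptions: `quote_value` is a hypothetical helper (no such function is claimed to exist in Cyrus or the auditlog patch), and the choice of which characters to leave unescaped is only one reasonable option.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: percent-encode 'in' into 'out' so the result
 * can sit between '<' and '>' in a key=<value> log field.
 * Letters, digits and "-_.~" are copied as-is; every other byte
 * becomes %XX. Returns 0 on success, -1 if 'out' is too small. */
int quote_value(const char *in, char *out, size_t outlen)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t o = 0;

    for (; *in; in++) {
        unsigned char c = (unsigned char)*in;

        if (isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~') {
            if (o + 1 >= outlen) return -1;   /* keep room for the NUL */
            out[o++] = (char)c;
        } else {
            if (o + 3 >= outlen) return -1;
            out[o++] = '%';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 15];
        }
    }
    if (o >= outlen) return -1;
    out[o] = '\0';
    return 0;
}
```

A log parser then only has to split on `=<` and `>` and reverse the %XX escapes, since a quoted value can never contain a raw angle bracket or space.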
Re: Cyrus - GFS slow start and poor performace
On Wed, 14 May 2008 10:59:33 +0200, "Maurizio Lo Bosco" <[EMAIL PROTECTED]> said:
> Hi all,
> I know that the usage of the GFS has been discussed for long time on this
> mailing list but I would like to know if it is normal to have a very slow
> start (15 minutes) with just 4300 users (the cyrus db is composed of 20940
> lines).
> It happens only with the GFS and the skiplist database; using the flat it
> takes few seconds to start.
> The system is composed of 2 IBM x3550 with redhat enterprise linux 5.1
> attached to a SAN IBM DS4700 with an dual fibre channel (4Gb/s multipath
> active-backup).
>
> The dump of the database takes 7 minutes but the disk usage is definitely
> low (less than 5%)
> RedHat is saying that there is no way to optimise the performance on the
> GFS locking architecture and they will now take a look to the cyrus code.
>
> Do you have any tips?

Skiplist mailboxes.db gets a "recovery" run on it at startup. The recovery visits all the records in the file. That said, it does it all in order.

Can you post the syslog output of cyrus as it starts (slowly)? I wonder if it's also doing a checkpoint, which visits all the mailboxes.db records as well, but does them in...

oh, indeed. Recovery also writes back pointers all over the place. It does LOTS of writes to random locations within the file. If GFS is doing something insane like writing back the entire file to the server for every single update (generally 4 bytes at a time) then this could be a big problem!

That said, the file is locked with an fcntl (flock if fcntl isn't available) lock over the entire file + append space. This is an exclusive lock, and it's held for the entire recovery run. If GFS's locking can say "just do the updates and save copying back until the fsync at the end" then that should speed it up.

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
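To make the access pattern described above concrete, here is a toy sketch of following and rewriting 4-byte pointers inside a mapped buffer. It is not the real skiplist on-disk format: the layout (each record begins with a 4-byte "next" offset, offset 0 ends the chain), the big-endian byte order and the names `get_be32`, `put_be32` and `walk_chain` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 4-byte big-endian value at byte offset 'off'. */
uint32_t get_be32(const unsigned char *base, size_t off)
{
    return ((uint32_t)base[off] << 24) | ((uint32_t)base[off + 1] << 16) |
           ((uint32_t)base[off + 2] << 8) | (uint32_t)base[off + 3];
}

/* Overwrite 4 bytes at 'off': the kind of small scattered write that
 * a recovery pass issues when it fixes up a pointer. */
void put_be32(unsigned char *base, size_t off, uint32_t v)
{
    base[off]     = (unsigned char)(v >> 24);
    base[off + 1] = (unsigned char)(v >> 16);
    base[off + 2] = (unsigned char)(v >> 8);
    base[off + 3] = (unsigned char)v;
}

/* Follow "next" offsets from 'start' until offset 0. Returns the
 * number of records visited, or -1 if a pointer leaves the buffer
 * or the chain appears to loop. */
long walk_chain(const unsigned char *base, size_t len, uint32_t start)
{
    long count = 0;
    uint32_t off = start;

    while (off != 0) {
        if (len < 4 || off > len - 4) return -1;
        if (++count > (long)(len / 4)) return -1;  /* loop guard */
        off = get_be32(base, off);                 /* base + offset read */
    }
    return count;
}
```

Each `put_be32` touches only four bytes, which is cheap on a local disk but expensive if the filesystem ships far more than those four bytes per update.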
Re: Cyrus - GFS slow start and poor performace
[reply number 2 - addressing bits I missed in the first reply...]

On Wed, 14 May 2008 10:59:33 +0200, "Maurizio Lo Bosco" <[EMAIL PROTECTED]> said:
> The dump of the database takes 7 minutes but the disk usage is definitely
> low (less than 5%)

A dump of the database visits all records in alphabetical order. This can result in somewhat random looking seeks around the file due to the layout of a skiplist, but it will happen within the mmap.

> RedHat is saying that there is no way to optimise the performance on the
> GFS locking architecture and they will now take a look to the cyrus code.

You may want to pass on to the RedHat engineers that Cyrus uses an mmap of the entire file to read all records, and uses seeks and direct writes to the same fd (or a different fd depending on compilation settings) to write.

Skiplist appends entire records to the file, but also seeks back and updates pointers (4 byte records) within the file with each update. That's writing.

Reading - it reads each record, gets a pointer to the location of the next one, and reads from the memory location that corresponds to db->map_base + pointer_offset.

Depending on your requirements, it may make sense to place your mailboxes.db on local disk (it's pretty small) and regularly copy/rsync it onto your GFS partition. Worst case you lose a couple of mailboxes.db records in a crash. Depends what you can afford to lose. You could probably stat the file every second and copy it on any change pretty cheaply and risk losing at most the last second's changes (it doesn't change often).

Regards,

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
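The "stat it every second and copy on change" idea can be sketched as a small change detector. This is only an illustration: `file_changed` is a hypothetical helper, and comparing mtime plus size is an assumed heuristic rather than anything Cyrus provides.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical helper: compare the file's current mtime and size with
 * the values remembered in *last. Returns 1 if they differ (and
 * updates *last), 0 if unchanged, -1 if the file cannot be stat()ed. */
int file_changed(const char *path, struct stat *last)
{
    struct stat now;

    if (stat(path, &now) != 0) return -1;

    if (now.st_mtime != last->st_mtime || now.st_size != last->st_size) {
        *last = now;    /* remember what we saw for the next poll */
        return 1;
    }
    return 0;
}
```

A wrapper loop would zero a `struct stat`, call this once a second, and start the copy or rsync whenever it returns 1. A real tool would probably also want to avoid copying the file in the middle of an update, for example by taking the same lock the writer uses; that part is left out here.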
Re: sync_server crashing..
On Fri, 16 May 2008 19:18:04 + (GMT), "Andy Fiddaman" <[EMAIL PROTECTED]> said:
>
> Hi,
>
> I'm hoping that someone who is familiar with the replication code can
> help me with a problem I'm seeing with Cyrus 2.3.12.
>
> I have a two server replicated setup and sync_server is occasionally
> crashing. Once it's crashed once it keeps on crashing until I completely
> reset replication by snapshotting the master, removing the sync logs
> and rsyncing the snapshot to the replica.
>
> The crash is happening in sync_cacheitem_size()
>
> Core was generated by `sync_server'.
> Program terminated with signal 11, Segmentation fault.
>
> [...]
>
> This last itemlen pushes the pointer out of the allocated memory and
> causes the crash.
>
> Any ideas on whether these entries look right and where I should look
> next to debug it?

You have a corrupted cache file. Various things could have caused this, it isn't easy to know what it was.

Your fix works because once you rsync, there is no mention of the folder with the problem in the sync log any more, however next time anything happens on that folder you get the crash again.

1) figure out what folder it is
2) reconstruct it
3) profit???

Enjoy,

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
Re: sync_server crashing..
On Sun, 18 May 2008 21:44:45 + (GMT), "Andy Fiddaman" <[EMAIL PROTECTED]> said:
> On Sun, 18 May 2008, Bron Gondwana wrote:
>
> ; You have a corrupted cache file. Various things could have caused this,
> ; it isn't easy to know what it was.
>
> Thanks.
>
> I can't find the corrupted mailbox so I've run reconstruct on everything,
> rsyncd the master to replica again and I'll see if the problem recurs
> (you can tell it isn't a massive mailstore!)
>
> I tried to find the mailbox with the problem by writing a quick program to
> scan through each cache file and it didn't detect any errors. I also ran
> mbexamine on every mailbox with no problems so I don't know where the
> corruption, if any, was.
>
> Keeping my fingers crossed anyway,

Hmm.. by corrupted cache file it could actually be the cache base pointers from the cyrus.index that are corrupted. One cause was delayed expunge and reconstruct, but David Carter wrote some patches which got into 2.3.12 to fix that, so new reconstructs will be fine.

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
Re: are these error messages severe? and how to fix them?
On Mon, 19 May 2008 21:07:55 +0200 (CEST), "Simon Matter" <[EMAIL PROTECTED]> said:
> > On Mon, May 19, 2008 at 6:31 PM, <[EMAIL PROTECTED]> wrote:
> >> greetings all.
> >>
> >> This morning a user called me saying that he was using reading his email
> >> (via squirrelmail) in one computer, then he logged out, and some time
> >> later went to another computer open squirrelmail, and his mail was gone
> >>
> >> I checked directly in the mailstore and he had only a couple of
> >> messages, but he assures me that he did not delete anything
> >>
> >> After some search on the log files, I found something like this:
> >>
> >> May 19 09:56:05 ccaix imaps[8619]: skiplist: recovered
> >> /var/lib/imap/user/C/user^name.seen (3 records, 7316 bytes) in 0 seconds
>
> I think skiplist files are always "recovered" when they are opened. So
> that is not a sign of anything wrong.

Yeah, all that means is that the timestamp of the skiplist file is earlier than the timestamp of the last time cyrus was started. A "recovery" just goes through the file and makes sure that all the pointers are correct. That message is harmless.

Bron.
--
Bron Gondwana [EMAIL PROTECTED]
Re: Cyrus POP access restriction to users
On Tue, May 20, 2008 at 08:57:14PM +0530, Ashay Chitnis wrote: > Hi, > > I wanted to know if we can restrict some users to access POP and allow some > users to access POP. I do not want to have firewall based restriction. I am > using cyrus-imapd-2.3.7-4. The same users should be allowed through Webmail > without any issue. The users are LDAP users. Can anyone help me on this? We do exactly this at FastMail, but we use a different approach. All user connections are via an nginx proxy, and the authentication daemon used by nginx will return an error if the user tries to log in via POP. It will also send them an email explaining the policy and offering them an option to upgrade to an account level that does support POP... Bron. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
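For anyone who hasn't met the nginx side of this: the nginx mail proxy asks an HTTP auth service about every login, passing request headers such as Auth-User and Auth-Protocol, and acts on the Auth-Status header in the reply. Here is a rough Python sketch of the sort of policy check described above. It is not FastMail's daemon; the ACCOUNTS table, the pop_allowed field, the backend address and the error text are all invented for illustration, and password checking is left out entirely.

```python
# Hypothetical account table; a real service would look this up in LDAP
# or a database. Backend tuple is (host, pop3 port, imap port).
ACCOUNTS = {
    "alice": {"pop_allowed": True, "backend": ("10.0.0.5", 110, 143)},
    "bob": {"pop_allowed": False, "backend": ("10.0.0.5", 110, 143)},
}


def auth_reply(request_headers):
    """Build the response headers for one nginx mail auth request.

    Password verification is deliberately omitted from this sketch.
    """
    user = request_headers.get("Auth-User", "")
    protocol = request_headers.get("Auth-Protocol", "")
    account = ACCOUNTS.get(user)
    if account is None:
        return {"Auth-Status": "Invalid login or password"}
    # The policy itself: refuse POP3 for accounts that do not have it.
    if protocol == "pop3" and not account["pop_allowed"]:
        return {"Auth-Status": "POP access is not enabled for this account"}
    host, pop_port, imap_port = account["backend"]
    port = pop_port if protocol == "pop3" else imap_port
    # Anything other than "OK" in Auth-Status is treated as a failure.
    return {"Auth-Status": "OK", "Auth-Server": host, "Auth-Port": str(port)}
```

Because webmail talks to the backend over IMAP, the same user can be refused on POP3 and still be let through for everything else.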
Re: Protection against POP or IMAP Denial of Service (DOS)
On Wed, May 21, 2008 at 12:32:33AM +0200, Stéphane BERTHELOT wrote: > But being recently attacked many times especially on POP3 service I am > looking for some advice or maybe making a feature request for some more > protection against DOS. Gosh, I seem to be spending a lot of time pimping nginx here! We get protection against this sort of DOS for free (as well as load balancing and etc) by having frontend servers running nginx as a proxy. Nginx is compiled (on our 2.6.x kernels) with epoll support, so it can handle bazillions of connections with the 8 processes it's configured to use. It also handles SSL (so the backend IMAP machines don't need to) and deals with the connection up until the point where the user is authenticated, at which stage it performs a login on the backend server and links the connection through. Compared to Perdition which was one-process-per-connection, this has scaled amazingly well. One medium spec machine can easily handle (checks) about 7000 connections at the moment, and it scales to a lot more than that during the US day. That's with HTTP proxy, authenticated SMTP injection, ftp server, lots of other things - and the frontend machine is still barely using one of the 4 processor cores in it. You could easily put nginx on your IMAP server directly if you didn't want to dedicate a second machine to the job, and it would handle the DOS risk for you. I like this approach from a UNIX design perspective. One service that is designed for coping with DOS attacks and talking to the outside world, and a separate service that is designed purely for actually providing the service, rather than complicating it with DOS accounting and tracking mechanisms. Bron. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Protection against POP or IMAP Denial of Service (DOS)
On Wed, 21 May 2008 07:13:10 +0200, "Christiaan den Besten" <[EMAIL PROTECTED]> said:
> Bron,
>
> What does the authentication for nginx for you, since it can't query
> for example a ldap directly ( at least, not the last time I checked )?
> The epoll will scale, but wondering what is the most 'light' method to
> do the actual authentication ..

Perl, it's the swiss cheese^H^H^H^H^H^Harmy knife of tools.

Specifically, we have this funky little thing that's increasingly inaccurately named "saslperld". It's just a forking Net::Server derivative that listens on unix sockets. It currently talks the following protocols:

* lookup
* mux
* nginx
* perdimap
* perdpop
* vfs

Ok - so we don't use either of the perdition ones any more, they should probably get removed in the cleanup I'm planning to do later this week (while working on one time password, openid, other goodies).

"lookup" is a simple key value protocol allowing usernames to be resolved to our internal userids. It's used by log analysis tools.

"mux" is the saslauthd protocol. Some sort of packed struct format from memory.

"nginx" is the nginx http authentication protocol.

"vfs" is also very badly named. It's the protocol that I originally wrote for handling our vfs interfaces (DAV & FTP) but has since expanded to be used by our web interface and every other bit of code that wants to check user authentication details, because the protocol is so easy to use from our perl libraries.

The overhead of unix sockets really is very low, and being separate processes means any epoll thingy (looking at DJabberd soon hopefully) can chat to it asynchronously without having to do its own thread pool.

It also listens on a UDP port for broadcast cache expiry events and caches user details to reduce database traffic for protocols with frequent short-lived logins.

Bron.
-- Bron Gondwana [EMAIL PROTECTED]
Re: Status of Cyrus replication
On Thu, May 22, 2008 at 02:17:02PM -0500, Blake Hudson wrote:
> Hey all, last time I checked replication was undergoing major overhauls
> and incompatibility between minor versions of 2.3.x was pretty great.
> There were also a few bugs that could potentially cause trouble down the
> road. I've had the need to create setups with failover servers and have
> continued using rsync on an interval (~30 to 60 min) for this purpose.
> Unfortunately this causes quite a lot of IO load on the servers and I
> was hoping that a rolling replication setup would help resolve this.

Yeah, it would! Are you using rsync 3.0? It doesn't help with the IO load, but at least it's a bit more incremental about things.

Also, you can get huge performance wins with a tiny bit of custom code, something like this hunk of untested perl:

    # DH is an already-open directory handle for the mailbox directory.
    # Assign to $_ explicitly: a bare readdir() in a while condition
    # only sets $_ on newer perls.
    while (defined($_ = readdir(DH))) {
        if (m/^cyrus\./) {
            # rsync this file, could have changed arbitrarily
        }
        elsif (m/^\d+\.$/) {
            # this is a cyrus message file, if it exists on the replica
            # then no need to try and sync
        }
        elsif (! m/^\./) {
            # this is a subfolder, sync it.
        }
    }

Basically, you don't need to stat the message files, which are the bulk of your data.

... but that's still a lot of custom protocol development and stuff. Annoying.

> What's the status of Cyrus replication in the latest releases of 2.3.x -
> specifically with virtual domains enabled?

It's getting pretty good actually. Most of our replication errors for the last couple of weeks have been traced back to a bug in our automated user-move code, which meant it failed to add a "USER $foo" to the sync log after moving users to new servers - so moved users who had no activity were not replicated.

> It also seems like there have been some problems with the latest
> releases of 2.3 and I'm hesitant to upgrade my 99% working 2.3.1
> install. Any lingering issues or reason not to upgrade?

There were some bad times in there.
The only outstanding bug I'm aware of in 2.3.12 is the blank lines in config file segfault - you'll either see that straight away or not at all! > For those who have the need to create a "hot spare" server and are not > using Cyrus replication, what method are you guys using to accomplish > this goal? Our backup system (not quite the same!) uses a perl module which reads the folder records from mailboxes.db and then uses fcntl locks on the cyrus.* files in each folder to block out cyrus while it streams the cyrus.* files. These are then backed up, and also parsed to see what message files are indexed - this is compared against what has already been fetched, and any new messages are also fetched and stored. It's blindingly quick through intimate knowledge of Cyrus's internals. In the best case, no matter how big the folder, it costs only two stats (cyrus.header and cyrus.index, we don't bother backing up cyrus.cache since it's all derived information). If either of them has changed we stream the contents of them both. Only then if there are new message files do we cause any IO on the data partition, and that is direct filename opens. No readdirs ever. Bron. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
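The "two stats per folder" idea above can be sketched in a few lines of Python. This is only an illustration of the change-detection step, not the actual backup module: which stat fields the real code compares isn't stated, so inode/size/mtime is an assumption, the function names are made up, and the fcntl locking and streaming are left out.

```python
import os


def stat_signature(path):
    """Return (inode, size, mtime_ns) for a file, or None if it is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def folder_signature(folder):
    """Exactly two stat calls per folder, however many messages it holds."""
    return tuple(
        stat_signature(os.path.join(folder, name))
        for name in ("cyrus.header", "cyrus.index")
    )


def folder_changed(folder, previous_signature):
    """True if either file differs from what the last backup run recorded."""
    return folder_signature(folder) != previous_signature
```

A backup run would store folder_signature() alongside the backed-up data and only lock, stream and parse the cyrus.* files when folder_changed() says so. No readdir and no per-message stat is needed on the unchanged path.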
Re: seen db
On Tue, Jun 10, 2008 at 12:02:44PM +0200, Rudy Gevaert wrote:
> Hi,
>
> I'm seeing this in my logs
>
> mail5r/syncserver[19755]: seen_db: user [EMAIL PROTECTED] opened
> /mail/mail5r/var/imap/domain/u/ugent.be/user/n/nick^andries.seen
> mail5r/master[12683]: process 19755 exited, signaled to death by 11
> mail5r/master[12683]: service syncserver pid 19755 in BUSY state:
> terminated abnormally
>
> Deleting the seen file on the replica, or reconstructing doesn't help.
> I need to delete the mailbox on the replica and resync it.
>
> It's only for certain users, so I don't think it has to do with my
> upgrade from sarge to etch. (I brought down my lun on sarge machine,
> and brought it up on the etch machine). I'm running 2.3.12p2 on sarge
> and etch.
>
> An other downside is that my replication hangs on that user.
> sync_client bails out, and restarts but with that user... So he keeps
> retrying.
>
> I would appreciate further help in debugging the problem.

Are you running a 64 bit kernel?

(just wondering - we have hit pretty much the same issue - and were wondering about dodgy kernel issues being a problem - it's only one machine that seems to have corrupted seen files, only on replicas)

We've been running 2.3.12 for about a week, and it's only last night that we had anything funny show up at all.

Interestingly, it's probably the first time cyr_expire ran on 2.3.12 just before that - and also the first time our check-replication script was running, which loads a lot of seen files on BOTH ends.

Bron.
Re: seen db
On Tue, 10 Jun 2008 15:29:01 +0200, "Rudy Gevaert" <[EMAIL PROTECTED]> said: > Bron Gondwana wrote: > > > Are you running a 64 bit kernel? > > Yes, but the system is 32bit (I run 64bit kernel + 32 emulation support) Interesting, so do we (on etch as well) > > (just wondering - we have hit pretty much the same issue - and were > > wondering about dodgy kernel issues being a proble - it's only one > > machine that seems to have corrupted seen files, only on replicas) > > > > We've been running 2.3.12 for about a week, and it's only last night > > that we had anything funny show up at all. > > > > Interestingly, it's probably the first time cyr_expire ran on 2.3.12 > > just before that - and also the first time our check-replication > > script was running, which loads a lot of seen files on BOTH ends. > > Here cyr_expire has been running on 2.3.12 for a couple of weeks. But > here the first time too with the 64bit kernel. There you go. We've had the 64bit kernel approximately forever, but only just upgraded from 2.6.20 series to 2.6.25. > I can try with a 32bit kernel tomorrow. > > In attachment a strace to show where it segfaults Almost certainly boring, since it's file corruption. The file itself would be significantly more interesting. My guess - you'll be finding little blocks of (small n)*4 bytes which happen to be NULL. It's when they intersect with the pointers table that things get interesting. Oh - can you tell me. Did the file checkpoint sometime not too long before it got corrupted? I've got a small set of theories, but I'm reading the skiplist source code (again!) to see if they make sense... Bron. -- Bron Gondwana [EMAIL PROTECTED] Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: seen db
On Wed, Jun 11, 2008 at 10:52:31AM +0200, Rudy Gevaert wrote: > Bron Gondwana wrote: >> There you go. We've had the 64bit kernel approximately forever, but only >> just upgraded from 2.6.20 series to 2.6.25. >> >>> I can try with a 32bit kernel tomorrow. > > Unfortunate with the 32bit kernel 2.6.24-2 it sync_server still segfaults. Try a 2.6.20 kernel, just for an interesting datapoint. We changed back to 2.6.20 (64 bit still) and haven't seen a corrupted seen file since. >> Oh - can you tell me. Did the file checkpoint sometime not too long before >> it >> got corrupted? > > The cases I saw it did. Ditto here. Interesting. They also had quite long records, but I don't know how common that is. Lots of little bits of seen spread around the space. >> I've got a small set of theories, but I'm reading the skiplist source code >> (again!) to see if they make sense... >> >> Bron. > > I'm also wondering if what would happen if I brought up a master. Surely > the imap processes would also segfault. Right? If it was on those corrupted files, yes. On that machine - quite probably. If you can afford the hardware it may be worth testing. (hmm, I can possibly dedicate a 64 bit capable machine to testing this. If it's a kernel bug I'd love to reproduce it) > Here I can delete the mailbox on the replica and sync again. As a > reconstruct doesn't help. We find reconstructing helps now - but that's with the 2.6.20 kernel. There were multiple things going wrong before. We originally suspected the external drive unit was playing up, but I'm thinking kernel now. Bron. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: seen db
On Wed, 11 Jun 2008 15:07:02 +0200, "Rudy Gevaert" <[EMAIL PROTECTED]> said: > Bron Gondwana wrote: > > > Try a 2.6.20 kernel, just for an interesting datapoint. We changed > > back to 2.6.20 (64 bit still) and haven't seen a corrupted seen file > > since. > > I hope to try that still today. > > I'm now running on 2.6.24-2, 32bit. I have cleaned up the users that > were having a corrupted mailbox on replica. Surprisingly I can count > them on both hands. > > So now I'm again running with rolling replication and I'm doing a > sync_client session for each user. When that is finnished I'll try to > downgrade the kernel. > > Btw, I tested my sarge-> etch upgrade in a xen virtual machine, 64bit > kernel + 32 bit userspace. But this was 2.6.18. > > I'm still wondering if I should run 2.6.20 in 32bit or 64bit... It's been fine for us as 64bit for a while now. Though note - 64bit will allow lots more process space, which allows broken cache files to REALLY SCREW WITH YOU. Bah. We have 4Gb core dumps being written into our cores directory - and let me tell you, while something is dumping core it uses some trick which totally nukes all other IO on the same device. It gets ioniced up there really happy. Ouch. The cause - mailbox_cache_size hits a bogus "length" field and returns like 1.7Gb as the size of the record. This then causes an xrealloc to "size * 2", or 3.4Gb. At least in the case of one mailbox that's been causing us fun. In a second I'll gdb that awfully large core and figure out which mailbox is the culprit. One reconstruct later > >>> Oh - can you tell me. Did the file checkpoint sometime not too long > >>> before it > >>> got corrupted? > >> The cases I saw it did. > > > > Ditto here. Interesting. They also had quite long records, but > > I don't know how common that is. Lots of little bits of seen > > spread around the space. > > I'm not sure how I would see that? I'm not familiar with the internals > of skiplist. 
I find they show up pretty well as ^@^@^@^@^@^@^@^@ in less. The skiplist format doesn't have many all zero blocks otherwise. Lots of other special characters show up for binary bits.

Sadly, I can pretty much read a hexdump of a skiplist. Sad because that's a lot of braincells that could be doing something useful like absorbing alcohol.

I've written a little patch for the mailbox_cache_size issue that returns 0 if the result ever looks like it's going negative or more than 100 million bytes. Then sync_support is patched to treat a zero cache size as "say we failed to reserve this message". It will do for now...

Bron ( also found a theoretical bug in the skiplist code and patched it today, but I might fix the whole function before I submit it upstream. I say theoretical because I don't see that the codepath gets exercised unless you already have a corrupt file, so meh )

-- Bron Gondwana [EMAIL PROTECTED]
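The shape of that mailbox_cache_size workaround, sketched in Python rather than C. This is not the actual patch: the function names are hypothetical, and only the "negative or more than 100 million bytes means 0" rule and the "zero means failed to reserve" rule come from the description above.

```python
# "more than 100 million bytes" is the threshold quoted in the post.
MAX_SANE_CACHE_RECORD = 100_000_000


def sane_cache_size(raw_size):
    """Return raw_size if it is plausible, else 0 to signal a corrupt record."""
    if raw_size < 0 or raw_size > MAX_SANE_CACHE_RECORD:
        return 0
    return raw_size


def reserve_message(raw_size):
    """Caller side: a zero size is reported as 'failed to reserve'."""
    size = sane_cache_size(raw_size)
    if size == 0:
        # Refuse here instead of trying to allocate gigabytes for a
        # record whose length field is garbage.
        return False
    return True
```

The point of the check is to turn a bogus 1.7Gb "length" into an ordinary per-message failure before anything tries to allocate double that.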
Re: Is skiplist dependant on byte order?
> > far as I can tell, the annotation_db, duplicate_db, and tlscache_db
> > are empty and can simply be removed. Are there any others on a murder
> > front end that I've missed? Where do they reside?

Yeah, we nuke all those on restart. duplicate_db is the most interesting of that lot - but not a giant concern. It will cause vacation messages to be repeated, and duplicate messageids to be delivered if it's gone - that's about all. For a once-off I wouldn't be at all concerned.

mailboxes.db really is the big one. Anything else with berkeley named in it that's either in your imapd.conf or defaulted that way in lib/imapoptions.

Bron.

-- Bron Gondwana [EMAIL PROTECTED]
Linux kernel bug AMD64 - affects skiplists
I promised I'd have something to say about skiplists soon! (hi Rudy - hope you had a good time off, leaving me here to figure this out _all_by_myself_ ;) )

There's a bug in the linux kernel for amd64 builds only that breaks some skiplist files. Specifically, checkpointing a seen file with a long (greater than page size) list of seen data will cause corruption where it crosses the page break. The last 16-24 bytes of the page will be NULLed out.

You can read more about it in all its gory detail here:

http://lkml.org/lkml/2008/6/17/9

Thanks Linus for the prompt (at least partial) fix.

If you are running one of those kernels now, I recommend you either change the kernel version, or apply the patch Linus posted. I was going to suggest a little "magic" patch, but I've been unable to actually make it work in testing, so I won't do it!

Bron.
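If you want to go looking for files that were hit, the symptom is easy to scan for. Here's a rough Python sketch that flags page boundaries where the preceding bytes are all NUL. The 4096-byte page size is an assumption (check your platform), the function name is made up, and this is only a heuristic: a hit means "look at this file by hand", not proof of corruption.

```python
PAGE_SIZE = 4096  # assumed page size, not taken from the post


def suspicious_page_ends(data, min_len=16, page_size=PAGE_SIZE):
    """Return page-boundary offsets whose preceding min_len bytes are all NUL.

    data is the raw file content as bytes; min_len=16 matches the low end
    of the 16-24 byte range described above.
    """
    zeros = bytes(min_len)
    hits = []
    for boundary in range(page_size, len(data) + 1, page_size):
        if data[boundary - min_len:boundary] == zeros:
            hits.append(boundary)
    return hits
```

Feed it the contents of a seen file, e.g. open(path, "rb").read(), and any offsets it returns are where to point your hexdump.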
Re: Linux kernel bug AMD64 - affects skiplists
On Wed, 18 Jun 2008 23:04:49 +0200, "<::Teresa_II::>" <[EMAIL PROTECTED]> said: > У ср, 2008-06-18 у 14:00 +1000, Bron Gondwana > пише: > > I promised I'd have something to say about skiplists soon! > > My cyrus runs on amd64 too, so does > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=42a886af728c089df8da1b0017b0e7e6c81b5335 > > fix the problem ? Yes, it does. I haven't rolled it out to any production machines yet (just reverted back to the 2.6.20 series kernel that we were using before) - but I built a test kernel with it and confirmed the fix. Bron. -- Bron Gondwana [EMAIL PROTECTED] Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Skiplist errors on Cyrus 2.3.12
On Fri, Jul 11, 2008 at 11:37:52AM +0200, Reinhard Zierke wrote: > Hi, > > Since I upgraded my Solaris 10 Postfix/Cyrus mail server to Cyrus 2.3.12, > I habe some problems when removing mailboxes. I try to delete > several users' mailboxes within one call of cyradm or in a homemade Perl > script that uses Cyrus::IMAP::Admin. The first delete in one command > invocation always work fine, but from the second user on I get a terse > error message "cyrus: c:" and the syslog shows something like: > > Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 637875 > local6.error] Fatal error: Internal error: assertion failed: > cyrusdb_skiplist.c: 622: db->lock_status == UNLOCKED > Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 558109 > local6.error] skiplist: closed while still locked > > Also, I see tons of skiplist 'already open' messages in my syslog like: > > Jul 11 09:59:33 mailhost.informatik.uni-hamburg.de imaps[9792]: [ID 412576 > local6.notice] skiplist: /var/imap/user/z/zierke.seen is already open 1 time, > returning object > > What's wrong? And can I safely go back to Cyrus 2.3.11 binaries without > botching up my Cyrus databases? That would be the skiplist sanity checks finding a latent bug in another part of Cyrus. Are you able to send me the exact sequence of commands your script runs? Is there anything else between the deletes? I'll go have a look at the source code. Bron. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Skiplist errors on Cyrus 2.3.12
On Fri, Jul 11, 2008 at 11:37:52AM +0200, Reinhard Zierke wrote: > Hi, > > Since I upgraded my Solaris 10 Postfix/Cyrus mail server to Cyrus 2.3.12, > I habe some problems when removing mailboxes. I try to delete > several users' mailboxes within one call of cyradm or in a homemade Perl > script that uses Cyrus::IMAP::Admin. The first delete in one command > invocation always work fine, but from the second user on I get a terse > error message "cyrus: c:" and the syslog shows something like: > > Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 637875 > local6.error] Fatal error: Internal error: assertion failed: > cyrusdb_skiplist.c: 622: db->lock_status == UNLOCKED > Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 558109 > local6.error] skiplist: closed while still locked > > Also, I see tons of skiplist 'already open' messages in my syslog like: > > Jul 11 09:59:33 mailhost.informatik.uni-hamburg.de imaps[9792]: [ID 412576 > local6.notice] skiplist: /var/imap/user/z/zierke.seen is already open 1 time, > returning object > > What's wrong? And can I safely go back to Cyrus 2.3.11 binaries without > botching up my Cyrus databases? Oh yeah, a copy of your imapd.conf and whether you apply any patches would be nice to know too. Handy for reconstructing the problem! I apply two patches to skiplists on 2.3.12 (yeah, I know, after all the stuff I've had pushed upstream too): hmm... and I realise I haven't updated the website for a while. Doing that now... ok: http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-safeunlock-2.3.12.diff http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-readlocktracking-2.3.12.diff I'd be interested to know if the issue still exists with these. They tidy up the logic for locks even more. I needed it to make the fast_rename and folder_limit stuff work again. Bron. 
Re: Auto-deletion of messages in Junk-folder after a certain time
On Mon, Jul 14, 2008 at 01:54:01PM +0200, Marten Lehmann wrote:
> Hello,
>
> we have a virtual domain configuration and I want to remove all messages
> within the folder
>
> user/@/Junk/*

Being the filthy perl programmer that I am, I would just make an admin IMAP connection to the server, LIST all mailboxes, regex match the ones I wanted, select them and process them.

> I don't want to mark old messages as deleted and expunge them, because
> then maybe I'm expunging messages, that haven't been flagged as deleted
> by me but the owner of the mailbox and aren't ment to be expunged at
> this moment.

We do this by setting our own special flag (in addition to the regular \Deleted flag) and then SEARCH for those messages only and UID EXPUNGE them.

But if you're deleting ALL messages, then it doesn't really matter does it? Unless you're doing some sort of age based thing, in which case like I said - UID EXPUNGE. The flag just lets us persist the action across dropped connections.

Bron.
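A sketch of the flag-then-expunge approach in Python, for an age-based cleanup of one mailbox. Assumptions, loudly: the keyword name $AutoPurge is invented (the real flag isn't named above), conn is expected to behave like an authenticated imaplib.IMAP4 connection, the server has to support UID EXPUNGE from the UIDPLUS extension, and mailbox names containing quotes or backslashes are not handled.

```python
from datetime import date, timedelta

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
PURGE_FLAG = "$AutoPurge"  # hypothetical keyword, chosen for this sketch


def imap_date(day):
    """Format a date the way IMAP SEARCH wants it, independent of locale."""
    return "%d-%s-%d" % (day.day, MONTHS[day.month - 1], day.year)


def purge_old(conn, mailbox, max_age_days, today=None):
    """Flag, then UID EXPUNGE, messages older than max_age_days.

    Returns the number of messages expunged from this mailbox.
    """
    today = today or date.today()
    cutoff = imap_date(today - timedelta(days=max_age_days))
    conn.select('"%s"' % mailbox)
    # Step 1: mark old messages with \Deleted plus our own keyword.
    typ, data = conn.uid("SEARCH", None, "BEFORE", cutoff)
    old = data[0].split() if data and data[0] else []
    if old:
        conn.uid("STORE", b",".join(old).decode(), "+FLAGS",
                 "(\\Deleted %s)" % PURGE_FLAG)
    # Step 2: expunge only what carries our keyword. This also picks up
    # messages flagged by an earlier run whose connection dropped.
    typ, data = conn.uid("SEARCH", None, "KEYWORD", PURGE_FLAG)
    marked = data[0].split() if data and data[0] else []
    if marked:
        conn.uid("EXPUNGE", b",".join(marked).decode())
    return len(marked)
```

Messages the owner has marked \Deleted but not yet expunged never get the keyword, so they are left alone. To use it, log in with imaplib.IMAP4_SSL as an admin user and call purge_old() once per matching mailbox from your LIST output.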
Re: patches since 2.3.12-p2?
On Fri, Jul 18, 2008 at 11:57:28AM +0200, Per olof Ljungmark wrote:
> and
> skiplist: unlock while not locked

This is almost certainly a bug. I added this along with a bunch of other skiplist changes to find places where the database interface wasn't being used correctly, because it means bugs of some sort.

There's another skiplist bug I've been trying to track down (multiple deletes on the same connection failing), but haven't been able to reproduce it yet.

Unfortunately, the cyrus database interface sort of sucks from a consistency perspective, it's dangerous to call any function that needs database access if you have a database transaction open, because the code doesn't know about the transaction and blindly goes ahead and starts a new transaction, which doesn't work. The code now throws an error immediately rather than causing corruption. Much better :)

Bron ( ok, so far I've only seen this happen in my own bogus patches, but it's still better to be safe! )
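To make the failure mode concrete, here's a toy Python model of "fail fast on a nested transaction". It has nothing to do with the real cyrusdb_skiplist.c beyond the idea; the class and method names are invented, and the error strings just echo the log messages quoted in this thread.

```python
class TransactionError(RuntimeError):
    """Raised when the database handle is used inconsistently."""


class ToyDB:
    """Toy key-value store with a single-transaction guard. Not Cyrus code."""

    def __init__(self):
        self.data = {}
        self.locked = False

    def begin(self):
        if self.locked:
            # Fail immediately instead of quietly starting a second
            # transaction on top of the open one.
            raise TransactionError("begin while already locked")
        self.locked = True

    def commit(self):
        if not self.locked:
            raise TransactionError("unlock while not locked")
        self.locked = False

    def store(self, key, value):
        # A helper that opens its own transaction, unaware of the caller's.
        self.begin()
        self.data[key] = value
        self.commit()
```

Calling store() on its own works. Calling it while the caller already holds a transaction raises straight away, which is annoying but leaves the data untouched, and that is the trade being described above.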
Re: patches since 2.3.12-p2?
On Mon, Jul 21, 2008 at 12:17:14PM +0100, Ian G Batten wrote: > > On 19 Jul 08, at 1313, Bron Gondwana wrote: >> >> Unfortunately, the cyrus database interface sort of sucks from >> a consistency perspective, it's dangerous to call any function >> that needs database access if you have a database transaction >> open > > I understand some of the technical, philosophical and historical reasons > why this isn't the case, but every now and again I find myself wishing > that Cyrus had an SQL backend for the various databases (perhaps not > delivery, because losing it isn't the end of the world, but certainly for > mailboxes). > > In our case, we have really big Oracle and Postgres systems that could > proably handle the load imposed by out mailsystem metadata as well as > our mailsystem copes with it itself via skiplist, but we would could > then manage those databases with the same tools we use for the > production systems (hot backups, replication, etc). > > Losing the mailboxes database can spoil your whole day, and the lengths > we go to to keep it safe (snapshots of the filesystem, hourly runs of > ctl_mboxlist -d, etc, etc should really be necessary if it were in a > production SQL database. > > In my copious spare time, I might take a pass at the cope and see how > hard it looks. Muahahahahaha. Erhum. Actually, the interface itself isn't that bad. Managing transactions might give you headaches though. And connections would probably be per-imapd process, so be prepared to have 4k connections sitting mostly idle or lots of startup/shutdown of connections. Bron ( not having done any real C library SQL coding myself, I'd suggest probably some sort of generic DBI-style layer than a single database at a time if you go this route ) Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: help with cyr_expire please
On Sun, Jul 27, 2008 at 11:18:43AM +0200, Per olof Ljungmark wrote: > It seems I do not fully understand how cyr_expire works. Some background: > > I've set up a testing server with 2.3.12-p2 with delayed expunge and > delayed delete. In cyrus.conf: > delpruneandexpunge cmd="cyr_expire -D 7 -E 3 -X 7" at=0400 > > and it all worked as expected. > > Now I'm ready to implement this on our production (2.3.12-p2) machine, > added the proper statements to imapd.conf > expunge_mode: delayed > delete_mode: delayed > deletedprefix: DELETED > > Then i tested the following command: > su cyrus -c "/usr/local/cyrus/bin/cyr_expire -D 5 -E 5 -X 5 -p > user.spamdump -v" > > and to my horror it did not only delete expunged messages but a fair > share of messages without any flags set: > expunged 677607 out of 682456 messages from 30 mailboxes Did you check if any "live" messages were actually deleted, or was it just expunged messages cleaned up? I can imagine that being the case for a spamdump user that you clean up frequently. Bron. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: even more questions on replication and expire
On Tue, Jul 29, 2008 at 12:16:01PM +0200, Rudy Gevaert wrote: > Per olof Ljungmark wrote: > > In the course of setting up delayed expunge on our production server I > > came across the following; > > > > - With delayed_expunge on the master, messages that are expunged by a > > user will be retained -X days on the master but immideately deleted on > > the replica unless it has delayed_expunge too. > > > > So if I implement delayed_expunge on the replica, do I need cyr_expire > > to permanently remove messages after -X days or will sync_client do > > that? > > yes That's "yes" to "you need to run cyr_expire on the replica too". Bron. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: even more questions on replication and expire
On Tue, Jul 29, 2008 at 05:09:05PM +0200, Per olof Ljungmark wrote: > Bron Gondwana wrote: >> On Tue, Jul 29, 2008 at 12:16:01PM +0200, Rudy Gevaert wrote: >>> Per olof Ljungmark wrote: >>>> In the course of setting up delayed expunge on our production >>>> server I came across the following; >>>> >>>> - With delayed_expunge on the master, messages that are expunged by >>>> a user will be retained -X days on the master but immideately >>>> deleted on the replica unless it has delayed_expunge too. >>>> >>>> So if I implement delayed_expunge on the replica, do I need >>>> cyr_expire to permanently remove messages after -X days or will >>>> sync_client do that? >>> yes >> >> That's "yes" to "you need to run cyr_expire on the replica too". > > Thanks for the info. I can't help wonder if this was a firm design > decision? From a user perspective it should be easier if this followed > the synchronization I believe. > > Anyway, thanks, that was the last piece needed to finish off. I would much prefer that it was done via synchronisation as well. It's a pain from a consistency point of view. But there you go. Bron. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Cyrus vs Dovecot
On Wed, Aug 13, 2008 at 01:07:34PM -0400, Wesley Craig wrote:
> On 13 Aug 2008, at 10:31, kbajwa wrote:
> > I think you are missing a point which is most important, i.e., what
> > type of
> > support Cyrus vs Dovecot offers. In my experience:
> >
> > Cyrus = 0
> > Dovecot= 100
>
> As someone who answers many help requests for cyrus (and I'm very far
> from the only one), I can honestly say I've never seen a requests
> from you. Perhaps you've had a lot of occasion to ask for help with
> Dovecot. I'm happy to hear you've gotten that help. Community is a
> lot of what open source software is about. As for your experience
> with the cyrus imapd community, perhaps your sample size is too small.

Yeah, there are a few of us here answering help requests, and even helping debugging in some cases. I'd be interested to see where that '0' comes from too.

Still, I think Cyrus and Dovecot are the best two imap servers out there, so it's going to be a question of which integrates best with your usage pattern. For a small server, starting with no experience in either, I would probably choose Dovecot.

Now that I know Cyrus inside out, back to front, warts and all - well, I'd choose Cyrus because I know how to make it play nice. It's more of a "total system" in itself though, that you write support stuff around. Dovecot integrates more with other tools in a unix-daemon'y way.

Enjoy,

Bron ( now if someone came along with a compelling competitor for SASL... )
Re: Deleting messages "marked for deletion" older than X days
On Mon, Aug 18, 2008 at 05:09:53PM -0500, Kenneth Marshall wrote:
> In the manual page, the definition of the '-X' option seems to
> do what you want:
>
> -X expunge-days
>     Expunge previously deleted messages older than expunge-days
>     (when using the "delayed" expunge mode). The default is
>     0 (zero) days, which will expunge all previously deleted
>     messages.

Messages go through the following life cycle in "traditional IMAP", for example:

Created    (\Recent)          - LMTP DELIVER (UID = 9)
No flags   ()                 - A001 SELECT INBOX (clears the \Recent)
Viewed     (\Seen)            - A002 UID FETCH 9 RFC822
Deleted    (\Deleted \Seen)   - A003 UID STORE 9 +FLAGS (\Deleted)
Purged     (no message)       - A004 EXPUNGE

Now, what the -X option actually does is turn this into:

Created    (\Recent)          - LMTP DELIVER (UID = 9)
No flags   ()                 - A001 SELECT INBOX (clears the \Recent)
Viewed     (\Seen)            - A002 UID FETCH 9 RFC822
Deleted    (\Deleted \Seen)   - A003 UID STORE 9 +FLAGS (\Deleted)
Purged     (no index record)  - A004 EXPUNGE

but the file is still on disk; only the index record has been moved from the file cyrus.index to a new file, cyrus.expunge.

A week later:

Cleaned up (no file)          - cyr_expire -X 7

The cyrus.expunge record and the actual spool file itself get deleted at this point. Until then you can un-delete the record using the "unexpunge" command in cyrus 2.3.X.

---

I think what the original requestor was actually looking for is a tool that can run the "EXPUNGE" phase on a regular basis. As far as I'm aware there's nothing that ships with Cyrus that can do it.

If I were writing something for the job I would make an admin IMAP connection to Cyrus and just cycle through the folders calling 'EXPUNGE' on them. Cheap and nasty, but it would do the trick. You can do this in any language with a TCP library, though something with an IMAP interface library would be nicer. I'd use Perl and Mail::IMAPTalk, but that's just because that's what I already use!

Bron.
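A rough sketch of the "cycle through the folders calling EXPUNGE" idea follows. It uses Python's standard imaplib rather than Perl and Mail::IMAPTalk, purely for illustration. The function names, host and credentials are all made up, and it assumes the logged-in user already has the rights needed to select and expunge each mailbox it can list.

```python
import imaplib
import re

# One IMAP LIST response line looks like: (flags) "delimiter" mailbox-name
LIST_LINE = re.compile(rb'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.+)')


def parse_list_line(line):
    """Return (flags, mailbox_name) for one LIST response line, or None."""
    m = LIST_LINE.match(line)
    if not m:
        return None
    flags = m.group("flags").decode().split()
    name = m.group("name").decode().strip()
    # Strip surrounding quotes; names containing '"' or '\' are not handled.
    if name.startswith('"') and name.endswith('"'):
        name = name[1:-1]
    return flags, name


def expunge_all(conn):
    """SELECT every selectable mailbox visible on conn and issue EXPUNGE."""
    typ, lines = conn.list()
    done = []
    for line in lines:
        # imaplib returns [None] for an empty listing and tuples for literals.
        if not isinstance(line, bytes):
            continue
        parsed = parse_list_line(line)
        if parsed is None:
            continue
        flags, name = parsed
        if any(f.lower() == "\\noselect" for f in flags):
            continue
        typ, _ = conn.select('"%s"' % name)
        if typ != "OK":
            continue
        conn.expunge()
        conn.close()
        done.append(name)
    return done


def run(host, user, password):
    """Hypothetical entry point: connect, expunge everything, log out."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    try:
        return expunge_all(conn)
    finally:
        conn.logout()
```

Run from cron, something like run("imap.example.com", "admin", "secret") would give the regular EXPUNGE pass; with delayed expunge enabled, cyr_expire -X still does the final cleanup.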
Re: Is unixhierarchysep:1 required when using virtdomain?
On Tue, Aug 26, 2008 at 03:11:41PM +0200, tarjei wrote:
> I read somewhere that setting unixhierarchysep to true is required when
> using virtdomain, but this is not mentioned on the man page.

Wow, you could have fooled me.

> Is there something missing on the manpage, or have I just misunderstood
> something?

We have a few hundred thousand users who are domain split, and we don't use unixhierarchysep.

> Also, what problems will I face if I set it to false? One thing is
> client issues, but what about acls, etc?

Changing it on the fly sounds messy.

Bron.
Re: Is unixhierarchysep:1 required when using virtdomain?
On Wed, 27 Aug 2008 14:36:58 +0200, "Rudy Gevaert" <[EMAIL PROTECTED]> said:
> Bron Gondwana wrote:
> > On Tue, Aug 26, 2008 at 03:11:41PM +0200, tarjei wrote:
> >> I read somewhere that setting unixhierarchysep to true is required when
> >> using virtdomain, but this is not mentioned on the man page.
> >
> > Wow, you could have fooled me.
> >
> >> Is there something missing on the manpage, or have I just misunderstood
> >> something?
> >
> > We have a few hundred thousand users who are domain split, and
> > we don't use unixhierarchysep.
>
> But then you don't have '.' in their user names, right?

Nope. Never wanted a dot in a username. Usernames belong in [a-z][a-z0-9_]+ space. And strictly, trailing _ is pretty bogus too. :)

Bron.

--
Bron Gondwana [EMAIL PROTECTED]
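As a small illustration of that username space, here is a sketch of a validator built on the character class quoted above. The function name and the allow_trailing_underscore switch are invented for this example and are not part of Cyrus or of any FastMail tooling.

```python
import re

# The character class from the message, matched against the whole string.
USERNAME = re.compile(r"[a-z][a-z0-9_]+")


def is_plain_username(name, allow_trailing_underscore=True):
    """True if name fits [a-z][a-z0-9_]+, optionally rejecting a trailing '_'."""
    if USERNAME.fullmatch(name) is None:
        return False
    if not allow_trailing_underscore and name.endswith("_"):
        return False
    return True
```

Note that the pattern as written needs at least two characters and rejects any dot, which is why the '.' hierarchy separator never collides with such usernames.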
Re: Very annoying IMAP problem (cyrus + Outlook)
On Mon, Sep 01, 2008 at 03:37:19PM -0400, Wesley Craig wrote:
> On 01 Sep 2008, at 14:50, Denis BUCHER wrote:
> > What should I do next to solve my problem ?
>
> There are actually a couple of places cyrus might give the fatal
> error "word too long". The prot routines should be recording the
> interactions before passing the data up to the imap layer where the
> parsing error occurs. Are there any long lines in your telemetry?
> The bad line should directly precede "* BYE Fatal error: word too
> long" in the telemetry (if I'm reading imapd's fatal() routine
> correctly).
>
> AFAIK, there's nothing to be done other than adjusting the MAXWORD
> and/or MAXQUOTED limits. That means upgrading or recompiling the old
> version that you're on.

This is what we apply at FastMail:

-MAXQUOTED = 32768,
-MAXWORD = 32768,
+MAXQUOTED = 524288,
+MAXWORD = 524288,

Our smallest IMAP server has 6GB of memory, so we really don't need baby-sized buffers :)

Bron.
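To check a telemetry log for candidates before recompiling, something like the following sketch may help. It only splits each line on whitespace, which is a crude stand-in for how imapd actually tokenises atoms, quoted strings and literals, so treat any hit as a hint rather than proof. The function name is hypothetical, the default limit is the stock value shown in the diff above, and whether the real check trips at exactly that length is not verified here.

```python
def find_long_words(lines, limit=32768):
    """Yield (line_number, longest_token_length) for lines with a token over limit.

    Works on str or bytes lines; tokens are whitespace-separated, which only
    approximates the server's own parsing.
    """
    for number, line in enumerate(lines, start=1):
        longest = max((len(tok) for tok in line.split()), default=0)
        if longest > limit:
            yield number, longest
```

Opening the telemetry file in binary mode and passing the file object straight in avoids any decoding trouble with client data.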
Replication errors: missing subscription
Our Cyrus 2.3.12 + patches replication system has been running very reliably for months - to the point where the only issues our checkreplication script tends to find are either:

a) cases where someone has reconstructed and not run quota -f afterwards, causing quota mismatches. (this is mostly the fault of bits of our code that need updating!)

b) subscriptions missing on the replica.

I have a suspicion that most of these could be avoided by the simple expedient of switching from putting individual subscription records into the sync log to copying the entire user.sub file.

(I've also changed setseen_all to just overwrite the user.seen file rather than attempt some sort of merger. It's a replica, the master is right! This will break if you're using a different database type on the replica than the master of course - but that's why you shouldn't be sending binary formats over the wire in the first place. It's already going to break)

Bron.
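The difference between the two approaches can be shown with a toy model. This is not Cyrus's sync protocol or the user.sub format; subscriptions are modelled as a plain Python set, and all names are invented for the example. It shows why replaying individual records can drift when one goes missing, while copying the whole set always matches the master.

```python
def replay_records(replica, records):
    """Apply (action, mailbox) records to a copy of the replica's subscriptions."""
    result = set(replica)
    for action, mailbox in records:
        if action == "sub":
            result.add(mailbox)
        elif action == "unsub":
            result.discard(mailbox)
    return result


def copy_whole_file(master):
    """Replace the replica's subscriptions with the master's, wholesale."""
    return set(master)


master = {"user.bron.Sent", "user.bron.Lists"}
# Suppose the record for user.bron.Lists never made it into the log.
partial_log = [("sub", "user.bron.Sent")]

drifted = replay_records(set(), partial_log)
copied = copy_whole_file(master)
```

Here drifted is missing user.bron.Lists, while copied equals the master regardless of what the log lost.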
Re: Skiplist errors on Cyrus 2.3.12
On Fri, Jul 11, 2008 at 11:37:52AM +0200, Reinhard Zierke wrote:
> Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 637875 local6.error] Fatal error: Internal error: assertion failed: cyrusdb_skiplist.c: 622: db->lock_status == UNLOCKED
> Jul 11 10:32:41 mailhost.informatik.uni-hamburg.de cyradm[13944]: [ID 558109 local6.error] skiplist: closed while still locked

We think we've figured this one out now :) Finally.

John Capo came up with a basic patch that fixed it, and I've done a slightly more ambitious refactor. Rudy has tested my patch, and we're running it at FastMail as well. I've rebuilt our webpage with the new patch included.

NOTE: this patch obsoletes the old readlocktracking patch, and conflicts with it. This way is much cleaner.

Bron.

http://cyrus.brong.fastmail.fm/patches/cyrus-skiplist-locking-rework-2.3.12.diff