Re: LARGE single-system Cyrus installs?
On Sun, 11 Nov 2007, Bron Gondwana wrote:

>> 250,000 mailboxes, 1,000 concurrent users, 60 million emails, 500k
>> deliveries/day. For us, backups are the worst thing, followed by
>> reiserfs's use of the BKL, followed by the need to use a ton of disks
>> to keep up with the i/o.
>
> For us backups are hardly a blip on the radar :) The joy of writing
> your own custom backup system that knows more about Cyrus internals than
> just about anything else. It starts with some stat calls, and if any of
> the cyrus.header, cyrus.index or cyrus.expunge files have changed then
> it will lock them all then stream them all to the backup server.

Cyrus is pretty ideal for fast incremental updates to a backup system:
hence replication. You shouldn't need to lock anything with delayed
expunge, delayed delete and fast rename in place.

--
David Carter                             Email: [EMAIL PROTECTED]
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Replication: sync_client -r dies
On Mon, 12 Nov 2007, Bron Gondwana wrote:

>> It seems to me that the replication code ought to be a bit more robust
>> than this when a replica goes down or loses network connectivity. Is
>> the 2.3.10 code any better than 2.3.9 in the way this kind of situation
>> is handled?
>
> I believe David Carter has been working on some stuff for this which is
> lined up to go in soon.

The autorestart stuff is already in 2.3.10. It was Ken's work, based on a
suggestion on my part.

--
David Carter
Re: Replication: does it work in both directions?
On Sun, 11 Nov 2007, Rich Wales wrote:

> So, I would have replication set up going both directions between my two
> servers, but the sets of users handled in each direction would be
> disjoint. Each user would be assigned to one IMAP server (the master
> for their mailbox collection), and the other server would be their
> replica and act as their backup.

We do this. It is quite useful to be able to bounce users back and forth
between the two machines in a pair so that servers can be maintained
(patches, O/S upgrades, whatever) without any user visible downtime.

Three caveats:

1) It won't work with shared mailboxes.

2) I'm not running the same replication code as the rest of you (though
   replication in 2.3 is based on an old version of my code). I seem to
   remember Ken raising an objection when this was last discussed a year
   or two back now. The objection may just have been (1).

3) Sanity checks are good:

     USER dpc22
     NO IMAP_INVALID_USER Attempt to update master for dpc22

--
David Carter
Re: Multiple skiplist bugs found, patches attached
On Tue, 13 Nov 2007, Simon Matter wrote:

> I didn't have much trouble with skiplist over the years and it has been
> a blessing since moving away from BDB. But I did have a few issues with
> broken skiplist files so your patches are very welcome. I have included
> the patches in my private rpm packages to try how they work. Do you
> recommend both for general consumption?

It is certainly very easy to break mailboxes.db using cyr_dbtool. Kudos to
Bron for tracking down the problems.

--
David Carter
Re: Just in case it is of general interest: ZFS mirroring was the culprit in our case
On Tue, 13 Nov 2007, Pascal Gienger wrote:

> Our latency problems went away like a miracle when we detached one half
> of the mirror (so it is no more a mirror).
>
> Read-Rates are doubled (not per device, the total read rate!), latency
> is cut off. No more latency problems.
>
> When attaching the volume again, resilvering puts the system to a halt -
> reads and writes do block for seconds (!).

Definitely of interest to those of us keeping one eye on ZFS. Thanks.

Can someone else running ZFS confirm this behaviour?

--
David Carter
Re: LARGE single-system Cyrus installs?
On Tue, 13 Nov 2007, Bron Gondwana wrote:

> If you're planning to lift a consistent copy of a .index file, you need
> to lock it for the duration of reading it (read lock at least).

mailbox_lock_index() blocks flag updates (but this doesn't seem to be
something that imapd worries about when FETCHing data). You don't need to
worry about expunge or append events once the mailbox is open.

> But since I would like a consistent snapshot of the mailbox state, I
> lock the cyrus.header and then the cyrus.index and then (if it's there)
> the cyrus.expunge. That means no sneaky process could (for example)
> delete the mailbox and create another one with the same name while I was
> busy downloading the last file - giving me totally bogus data.

chdir() into the mailbox data directory: with delayed delete and fast
rename it shouldn't matter if the mailbox is replaced under your feet.
That's the way replication worked on my 2.1 systems, prior to split-meta.

(Locking isn't a big deal, but safe concurrent access is always nice).

--
David Carter
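For anyone who wants to experiment, here is a rough sketch of the "stat calls first" idea from earlier in this thread, combined with the suggestion of working relative to the mailbox directory rather than by full path. It is only an illustration: stat_signature() and changed_files() are made-up names, nothing here is taken from Cyrus or from either backup system, and it relies on dir_fd support in os.stat (Unix only).

```python
import os

# Metadata files named in the thread; cyrus.expunge may not exist.
META_FILES = ("cyrus.header", "cyrus.index", "cyrus.expunge")


def stat_signature(mailbox_dir):
    """Return {name: (mtime_ns, size, inode)} for the metadata files present.

    The directory is opened once and every stat is made relative to that
    descriptor, so a rename of the directory part way through does not
    redirect the later stats to a different mailbox.
    """
    dir_fd = os.open(mailbox_dir, os.O_RDONLY)
    try:
        signature = {}
        for name in META_FILES:
            try:
                st = os.stat(name, dir_fd=dir_fd)
            except FileNotFoundError:
                continue  # file not present in this mailbox
            signature[name] = (st.st_mtime_ns, st.st_size, st.st_ino)
        return signature
    finally:
        os.close(dir_fd)


def changed_files(previous, current):
    """Names whose signature differs (or which appeared or disappeared)."""
    names = set(previous) | set(current)
    return sorted(n for n in names if previous.get(n) != current.get(n))
```

A backup pass would keep the previous signature per mailbox and only stream a mailbox when changed_files() returns something. Whether that is safe without locks depends on delayed expunge, delayed delete and fast rename being in place, as discussed above.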
Re: Deleting top-level mailbox with 'delete_mode: delayed'
On Tue, 13 Nov 2007, Bron Gondwana wrote:

> I have "delete_mode: immediate" on the replica and "delete_mode:
> delayed" on the master.

sync_server doesn't pay any attention to delete_mode, so the option
shouldn't have any effect on the replica.

--
David Carter
Re: Replication error
On Wed, 5 Dec 2007, Gabor Gombas wrote:

> Hmm, can a regular user rename his own INBOX? I'm pretty sure no admin
> actions were performed.

RFC 3501, section 6.3.5:

   Renaming INBOX is permitted, and has special behavior. It moves
   all messages in INBOX to a new mailbox with the given name,
   leaving INBOX empty. If the server implementation supports
   inferior hierarchical names of INBOX, these are unaffected by a
   rename of INBOX.

--
David Carter
Re: Replication error
On Tue, 4 Dec 2007, Wesley Craig wrote:

> The internal Cyrus "mailbox ID" ought to be unique, but it's not. On
> the sub folder, remove the cyrus.header file and reconstruct. This will
> assign a new, unique mailbox ID. Any idea how they ended up with the
> same IDs?

Given that a user inbox is involved, my guess would be:

  Changes to the Cyrus IMAP Server since 2.3.9
  [...]
  * Fixed the special case of RENAMEing an Inbox, so that it doesn't
    keep the same mailbox uniqueid, thus allowing it to replicate
    properly (seen state is still preserved).

--
David Carter
Re: skiplist_unsafe
On Fri, 7 Dec 2007, Janne Peltonen wrote:

>> If you feel that your filesystem/buffercache will do a good job at
>> writing things out to disk, and you've got battery-backed cache on
>> your storage, you should be relatively well off.
>
> But if I were to turn skiplist_unsafe on, and the OS crashed - or, say,
> the cluster system forcibly unmounted my Cyrus spool and config
> filesystems - wouldn't that result in horribly unrecoverable databases
> all over the place? (I have everything in skiplist, except quota and
> subscriptions.)

It is easy enough to find out. Take an fsync() test rig such as Brad
Fitzpatrick's diskchecker.pl and comment out the fsync()s. If the disk
checker moans, then updates have been lost in buffer cache. Under Linux
this is only safe if the filesystem is mounted with the "sync" option,
even with data=journal.

Part of the point of fsync() is to make sure updates hit nonvolatile
storage in the correct order. A specific example: skiplist commit records
are written after an fsync(), immediately followed by another fsync()
before the write lock is released. If writes get reordered before they hit
disk, then there is a good chance that the database will become corrupt.

--
David Carter
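To make the ordering argument concrete, here is a minimal sketch of the "data, fsync, commit record, fsync" pattern described above. It is not the skiplist on-disk format or the Cyrus code: append_record(), the COMMIT marker and the flat append-only file are all invented for illustration, and locking is left out entirely.

```python
import os


def _write_all(fd, data):
    """Write every byte of data, coping with short writes."""
    view = memoryview(data)
    while len(view) > 0:
        written = os.write(fd, view)
        view = view[written:]


def append_record(path, payload, commit_marker=b"COMMIT\n"):
    """Append payload, then a commit marker, with an fsync after each.

    The first fsync makes sure the payload is on stable storage before
    the commit marker can possibly get there. The second makes sure the
    marker itself is durable before the caller carries on (in a real
    database, before the write lock is released).
    """
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        _write_all(fd, payload)
        os.fsync(fd)          # payload must hit disk first
        _write_all(fd, commit_marker)
        os.fsync(fd)          # commit marker durable before we return
    finally:
        os.close(fd)
```

If both fsync() calls are removed, the kernel is free to flush the commit marker before the payload, which is the reordering that leaves a file claiming a commit for data that never arrived.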
Re: cyr_expire and messages where Date: is in the future?
On Wed, 9 Jan 2008, Mike Eggleston wrote:

> I tried an experiment that worked using 'ipurge -d 1 -X
> user.$user.spam'. The delete flags were set properly. I guess I still
> need to run a cyr_expire to expunge the messages?
>
> I want to \Delete all messages in user.*.spam and user.*.backup. Can I
> use those patterns with ipurge?

"ipurge -f -d 1 -X user/*/spam" works for me (2.3cvs).

The messages get expunged and expired immediately, which is what I would
expect from:

  mailbox_expunge(&the_box, purge_check, &stats, EXPUNGE_FORCE);

cyr_expire is normally used to expire expunged messages if you are running
with delayed expunge. ipurge appears to bypass this.

--
David Carter
Re: cyr_expire and messages where Date: is in the future?
On Wed, 9 Jan 2008, David Carter wrote:

> "ipurge -f -d 1 -X user/*/spam" works for me.

"user/%/spam" if I didn't want to match user/dpc22/foo/bar/spam

--
David Carter
Re: cyr_expire and messages where Date: is in the future?
On Wed, 9 Jan 2008, Mike Eggleston wrote:

> Using spam assassin and sieve I have messages flagged as spam filed into
> user.*.spam folders. I also have a 'cyr_expire -E 1 -X 1' job running
> each morning and a 'cyradm mboxcfg expire user.*.spam expire 2' set on all
> spam folders. One of the oddities I'm seeing is where a spam message has a
> header of 'Date: Fri, 28 Mar 2008 00:16:17 -0800'. These messages are not
> expiring even though the messages may physically be months or a year old.

I believe that ipurge is the problem here, not cyr_expire.

ipurge uses the Date: header unless you use the -X flag. Consequently a
message with:

  Date: Fri, 28 Mar 2008 00:16:17 -0800

wouldn't be expunged until March.

--
David Carter
Re: cyr_expire and messages where Date: is in the future?
On Wed, 9 Jan 2008, Mike Eggleston wrote:

> Ok so 'user/%/spam' only matches user/$user/spam.
> Using 'user/*/spam' matches user/$user/.../spam, right?
> Must I use '/' in the pattern or can/do I use '.'?

I use unixhiersep, hence '/'. If you don't, use '.'.

> Is there some undocumented feature like a '-n' to ipurge so I can test
> what this command does without erasing all my user's messages in the
> 'test'?

Try it on a test system first :).

ipurge gives a fairly decent running commentary about the mailboxes it is
processing. You could always take the source code and comment out the
mailbox_expunge() if you want to test on a live system.

--
David Carter
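One way to check a pattern without touching any mail is to try it against a list of mailbox names offline. The sketch below implements the usual IMAP wildcard rules ('*' matches anything, '%' matches anything except the hierarchy separator). It is my own illustration of those rules, not the matching code that ipurge uses, and pattern_matches() is a made-up name.

```python
import re


def pattern_matches(pattern, mailbox, sep="/"):
    """True if mailbox matches an IMAP-style wildcard pattern.

    '*' matches any run of characters, including the separator.
    '%' matches any run of characters that does not contain the separator.
    Everything else must match literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "%":
            parts.append("[^" + re.escape(sep) + "]*")
        else:
            parts.append(re.escape(ch))
    return re.fullmatch("".join(parts), mailbox, flags=re.DOTALL) is not None
```

With unixhiersep, pattern_matches("user/%/spam", "user/dpc22/foo/bar/spam") is False while the same name matches "user/*/spam". Without unixhiersep, pass sep="." and write the pattern as "user.%.spam".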
Re: mbexamine blocks mailbox
On Mon, 21 Jan 2008, Michael Menge wrote:

> i used mbexamine user/testuser | less
> to check the mailbox of testuser. I forgot to quit less over
> the weekend and today i discovered two problems.
>
> 1. The mailbox was blocked. No new mails were delivered to this mailbox.

The mailbox was left in a locked state. (fcntl() or flock() locks on both
cyrus.header and cyrus.index).

> 2. After quitting less, the new mails had been delivered multiple
> times depending on the retries of lmtpd.

I imagine that you had a whole stack of lmtp processes waiting to acquire
the lock on the mailbox. I'm a little surprised that something didn't time
out, but I guess that the entire message would be sitting in the staging
directory before lmtpd attempts to lock the target mailbox.

> is this a (known) bug? Are there other cyrus tools which will have the
> same effect? unexpunge -l ?

Any command line tool which locks a mailbox and then generates output will
have the same behaviour. unexpunge would be the obvious example.

Whether this is a bug is debatable. Neither mbexamine nor unexpunge
actually needs to lock the mailbox when they are just dumping mailbox
state, but locking is normally the safest course of action. They aren't
supposed to be long running processes.

--
David Carter
Re: cyr_expire -E ?
On Fri, 18 Apr 2008, David R Bosso wrote:

> I don't specify a -X, I just want to prune the duplicate db. What am I
> doing wrong?

   -X expunge-days
        Expunge previously deleted messages older than expunge-days
        (when using the "delayed" expunge mode). The default is 0 (zero)
        days, which will expunge all previously deleted messages.

Try -X . cyr_expire is a bit overloaded.

--
David Carter
Re: Hyphens in folder names break LIST
On Tue, 20 May 2008, Matthew Hodgson wrote:

> If I create a hierarchy of folders such as:
>
> test
> test.SPAM
> test-foo
>
> and try to list the folder hierarchy with something like:
>
> 11 LIST "" "test%"
>
> I get broken output, where test is listed twice - the second time with a
> \Noselect flag:

The problem is that '-' sorts before '.' in ASCII. Try:

  improved_mboxlist_sort: 1

(You will need to dump and then restore the mboxlist).

--
David Carter
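The ordering problem is easy to reproduce outside Cyrus. The sketch below sorts the three folder names from the report in plain byte order, and then with a key that makes the hierarchy separator sort below every other character. The second key is only my illustration of the general idea of a hierarchy-aware sort; hierarchy_sort_key() is a made-up name and I am not claiming this is exactly what improved_mboxlist_sort does internally.

```python
def hierarchy_sort_key(name, sep="."):
    """Sort key in which the hierarchy separator sorts before anything else.

    Replacing the separator with \\x01 keeps a mailbox and all of its
    children together, ahead of siblings such as 'test-foo'.
    """
    return name.replace(sep, "\x01")


names = ["test.SPAM", "test-foo", "test"]

# Plain ASCII order: '-' (0x2d) sorts before '.' (0x2e), so test-foo
# lands between test and its child test.SPAM.
plain = sorted(names)

# Hierarchy-aware order: children stay next to their parent.
improved = sorted(names, key=hierarchy_sort_key)
```

In the plain ordering a walk over "test%" meets test, then test-foo, then test.SPAM, so the children of test are no longer adjacent to it, which fits the duplicate \Noselect entry in the report.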
Re: Replication verification
On Fri, 27 Jun 2008, Vladimir Klejch wrote:

> I'm searching for a possibility to use both servers in production as
> master-master.

Afraid that replication in Cyrus doesn't support full master-master, only
master/slave. UIDs in IMAP make full master-master rather involved. It is
possible to run a mix of master and replica mailstores on a single system.

> There are tools like make_md5 and make_sha1, but the manpages document
> only how to configure them, but not how to really use them for a
> replication check.

I download the md5 files to a single location and run a 50 line Perl
script to spot mismatches. You are welcome to a copy of that script.

To make sure that the replica is up to date I run sync_client in an extra
verbose mode (-v -v) and check for unexpected updates. Unfortunately that
code didn't make it into the vanilla Cyrus tree because of the
reorganisation required to run sync_server from master using prot streams
for communication. It wouldn't take a huge amount of effort to add "-v -v"
into standard Cyrus.

I believe that Fastmail have an external test suite which does spot checks
on the master and replica versions of each account. This is the opposite
approach, and makes sense if you have a convenient IMAP client library.

--
David Carter
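For readers who just want the shape of such a comparison, here is a small sketch. It is not the Perl script mentioned above, and it assumes an input format that the thread does not specify: one "<hex digest> <message identifier>" pair per line. Check what your make_md5 actually writes and adjust load_checksums() to suit. Both function names are made up.

```python
def load_checksums(lines):
    """Parse '<digest> <identifier>' lines into {identifier: digest}.

    The line format is an assumption made for this sketch.
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        digest, ident = line.split(None, 1)
        table[ident] = digest
    return table


def find_mismatches(master, replica):
    """Return (identifier, reason) pairs where the two sides disagree."""
    problems = []
    for ident in sorted(set(master) | set(replica)):
        m, r = master.get(ident), replica.get(ident)
        if m is None:
            problems.append((ident, "missing on master"))
        elif r is None:
            problems.append((ident, "missing on replica"))
        elif m != r:
            problems.append((ident, "checksum differs"))
    return problems
```

Usage would be along the lines of find_mismatches(load_checksums(open(master_file)), load_checksums(open(replica_file))), run once the replica has caught up, since anything delivered in between shows up as "missing on replica".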
Re: Very annoying IMAP problem (cyrus + Outlook)
On Fri, 22 Aug 2008, Denis BUCHER wrote:

> As far as I understand, the cause of the problem is : * "Outlook is
> sending either a token or quoted string that is longer than 8K bytes."

Your problem is a mailbox which contains several thousand messages.
Possibly several thousand messages which Outlook hasn't seen previously.

Outlook tries to construct a single IMAP command of the form:

  UID FETCH uid,uid,uid,...

where the list of UIDs is larger than 8 KBytes in size.

> But how could I correct this problem either in Outlook or cyrus ???

Split the problem mailbox into smaller mailboxes using something other
than Outlook.

Alternatively you could increase the word limit in Cyrus. MAXWORD is 32k
in recent Cyrus 2.3 versions.

--
David Carter
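To show why the UID list gets so big, and what a better behaved client could do about it, here is a sketch that collapses runs of UIDs into ranges (the "first:last" form of an IMAP sequence set) and splits the result so that no single set exceeds a length limit. This is not what Outlook does and it changes nothing on the server side. compress_uids() and chunk_sequence_sets() are made-up names, and the 8192 default simply mirrors the 8K figure quoted above.

```python
def compress_uids(uids):
    """Collapse UIDs into sequence-set items such as '3:7' or '12'."""
    ordered = sorted(set(uids))
    items = []
    i = 0
    while i < len(ordered):
        j = i
        # Extend the run while the next UID is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        if i == j:
            items.append(str(ordered[i]))
        else:
            items.append(f"{ordered[i]}:{ordered[j]}")
        i = j + 1
    return items


def chunk_sequence_sets(uids, max_len=8192):
    """Split the compressed UID set into strings of at most max_len chars."""
    chunks = []
    current = ""
    for item in compress_uids(uids):
        candidate = item if not current else current + "," + item
        if current and len(candidate) > max_len:
            chunks.append(current)
            current = item
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then go out as its own UID FETCH command. A mailbox with few gaps compresses to a handful of ranges, while several thousand scattered UIDs listed one by one is what pushes a single word past the limit.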
Re: Very annoying IMAP problem (cyrus + Outlook)
On Fri, 22 Aug 2008, Denis BUCHER wrote:

> But I think I understand, I have to create the "log" directory into :
> /var/spool/imap/user/dbucher
> which means :
> /var/spool/imap/user/dbucher/log

No, /var/imap/log/dbucher. You shouldn't need to restart Cyrus.

--
David Carter
Re: Very annoying IMAP problem (cyrus + Outlook)
On Fri, 22 Aug 2008, Ciprian Marius Vizitiu wrote:

> Any other possibility? I mean other than several thousand messages?

Several thousand messages is the obvious cause. We've seen this once (with
Cyrus 2.3) after running unexpunge on a very large mailbox. The mailbox
got a new UIDvalidity, so Outlook wanted to resynchronise the whole thing.

But /var/imap/log will give a definitive answer.

--
David Carter
Re: cyr_expire signaled to death by 11
On Thu, 25 Sep 2008, Per olof Ljungmark wrote:

> No idea why, comments anyone? A limitation of some sort when expunging a
> lot of messages?
>
> cyr_expire [96599]: Expunged 3005 messages from user.myuser.Trash
> cyr_expire[96599]: expunged 4907 out of 109443 messages from 143 mailboxes
> master[833]: process 96599 exited, signaled to death by 11

Probably a corrupt cyrus.cache file (at least that's the cause when I see
these). Try reconstruct on the mailbox.

--
David Carter
Re: cyrus replication : master to replica and replica to master
On Fri, 23 Oct 2009, Bron Gondwana wrote:

> I've seen heartbeat get split brain before. We gave up on it. We do
> all our fencing via humans now! Check the KVM, kick the box, manually
> run the failover script.

Some of my colleagues have had a lot of grief with Heartbeat going split
brain. It seems to really be designed for a pair of machines sitting next
to each other in a rack with a serial link for the heartbeat, rather than
servers installed in a pair of machine rooms three miles apart.

We do manual failover with our Cyrus mailstores: I would rather 1/8th of
my users had an outage of a couple of hours (and typically just a few
minutes) than end up with a split brain.

On the one occasion in five years that we did end up with a Cyrus split
brain (replication failed because of a memory DIMM error and then the
entire master failed a few minutes later) it was easy enough to fish
missing messages out of the dead system the following day and reinject
them using LMTP. Certainly easier than reengineering the entire Cyrus
mailstore to allow active/active replication.

On Wed, Oct 21, 2009 at 08:45:11PM +0200, David Touzeau wrote:

> I would like to know if it is possible to SET the replica as the master
> too in order to replicate new mail saved on the replica to the master
> and vice versa. In this case it should be turned to active/active.

We do this to a limited degree: the set of active users on a pair of
mailstores can be partitioned and bounced back and forth between the two
servers in a pair. This is mostly useful for load balancing between our
two machine rooms, or migrating all the users off a master so that we can
patch and reboot without any user visible downtime.

However this is using my own replication code rather than the branch which
was rewritten into Cyrus by Ken. I have additional safeguards to stop
sync_client from overwriting the master data in a pair (which has only
ever happened because of stupidity on my part when testing).

I've never used the standard replication code in Cyrus other than to
backport (sideport?) additional features such as CONDSTORE and GUID
support. Given the grief Fastmail had with the early Cyrus replication
code I think that I'm rather glad about this.

Every once in a while I think about moving to standard Cyrus replication.
Unfortunately there are a lot of warts that I really don't like. It is
much easier to just drop my own replication code onto new versions of
Cyrus (typically < 5 minutes work each time). That was one of my original
design objectives.

--
David Carter
Re: painful mupdate syncs between front-ends and database server
On Fri, 30 Oct 2009, Michael Bacon wrote:

> On all systems in the murder, we'll see instances where the mupdate
> process goes into a spin where, in truss, it's an endless repeat of
> fcntl, stat, fstat, fcntl, thousands of times over. These execute
> extremely quickly, but I do wonder if we're assuming that something that
> takes very little time takes an insignificant amount of time, when the
> time involved becomes significant on an 800k mailboxes database.

I agree that latency is probably your problem here.

I'm wondering if fsync() latency on the frontends might be a factor given
that you report little disk I/O on the mupdate master (IOPS are much more
important than Kps, but I'm sure that you already know that). The update
process will only be as fast as its weakest link, and you stated earlier:

> When we spec'ed out our servers, we didn't put much I/O capacity into
> the front-end servers -- just a pair of mirrored 10k disks doing the OS,
> the logging, the mailboxes.db, and all the webmail action going on in
> another solaris zone on the same hardware.

No mention of battery backed write cache there, which tends to be fairly
critical for anything involving fsync(). There is an easy way to find out:

   skiplist_unsafe: 0
        If enabled, this option forces the skiplist cyrusdb backend to
        not sync writes to the disk. Enabling this option is NOT
        RECOMMENDED.

You can ignore the scary warning (at least for test purposes) on murder
frontends, given that it is just a readonly replica of the mupdate master.

I hope that this isn't a complete red herring. It just struck me that it
would be a really easy test to make.

--
David Carter
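A complementary check that needs no Cyrus configuration change is to time fsync() directly on the filesystem that holds mailboxes.db. The sketch below is a generic measurement, nothing Cyrus specific, and measure_fsync_latency() is a made-up name. It writes a small block to a scratch file in the given directory, times each fsync(), and removes the file afterwards.

```python
import os
import tempfile
import time


def measure_fsync_latency(directory, rounds=20, payload=b"x" * 512):
    """Return a list of fsync() durations in seconds for small appends."""
    fd, path = tempfile.mkstemp(dir=directory)
    samples = []
    try:
        for _ in range(rounds):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)                      # the call being timed
            samples.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return samples
```

As a rough guide only, a bare spinning disk tends to show several milliseconds per fsync(), while storage with a working battery backed write cache should be well under that. Compare the numbers on a frontend with those on the mupdate master rather than trusting any absolute threshold.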
Re: weirdness after reconstructing
On Fri, 16 Apr 2004, Robin M. wrote:

>> In addition to this, I noticed that now everything is unread, which I can
>> deal with but if I mark it as read, close the email client and come back
>> its marked as unread again.
>
> Hi I am curious can you explain your mails being flagged as new more.

The original report sounds like database corruption or a mismatch between
the configured seen database type and what was on disk.

> I am having the same problem, but only with squirrelmail and Outlook
> express, Outlook and thunderbird are fine. I only notice this happen
> when I compose a message or reply to a message.

I don't know about squirrelmail, but the problem with Outlook Express is
that it makes two concurrent IMAP connections to a given mailbox. One of
these IMAP sessions is used to generate lists of messages within a mailbox
while the other is used to fetch individual messages. I infer the idea is
to improve concurrency on slow dialup links (OE's intended audience), in
case the message IMAP stream gets blocked up downloading a big message.

The problem is that the IMAP session which fetches messages updates the
Cyrus seen database, but these changes aren't picked up immediately by the
other IMAP session. The Cyrus imapd only calls index_check() on certain
IMAP operations (it follows the IMAP specification to the letter!).

A patch is floating around which causes imapd to call index_check() more
frequently (specifically on all IMAP FETCH and STORE operations):

  http://asg.web.cmu.edu/archive/message.php?mailbox=archive.info-cyrus&msg=19703

I used this patch for a number of months and it certainly fixes the
problem with OE. It does involve a bit more work for the IMAP server, but
I can't say that I can see any measurable difference on our systems.

The only problem with this patch is that it causes unsolicited EXPUNGE
events in response to (non UID) FETCH and STORE operations, which isn't
allowed. This breaks legitimate concurrent access to mail folders.

I have an updated patch which fixes this problem if anyone is interested.

--
David Carter                             Email: [EMAIL PROTECTED]
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: weirdness after reconstructing
On Sun, 18 Apr 2004, Craig Ringer wrote:

> On Sun, 2004-04-18 at 00:03, Henrique de Moraes Holschuh wrote:
>
>> Please post the updated patch somewhere, these things are always useful to
>> have around.

http://www-uxsup.csx.cam.ac.uk/~dpc22/cyrus/patches/

That's against 2.1.16. If someone would like to generate the equivalent
2.2.X patch I'm happy to put it up on the same page.

> If you could add a link here:
> http://acs-wiki.andrew.cmu.edu/twiki/bin/view/Cyrus/ExternalLinks
> it'd be much appreciated.

Done.

--
David Carter
Re: sieve parse error
On Tue, 11 May 2004, Andrew Morgan wrote:

> However, the interesting thing to note is that all sieve operations for
> this process after this error message generate bogus parse errors and
> email is not processed through sieve. Here are the syslog messages for
> that particular lmtpd process:

It's a small bug in the Sieve parser. A patch is available at:

  http://www-uxsup.csx.cam.ac.uk/~dpc22/cyrus/patches/2.1.16/sieve-parse-bug.patch

--
David Carter
Re: cyrus hanging (possible saslauthd problem)
On Wed, 26 May 2004, Colin Bruce wrote:

> I can login and read e-mail quite happily most of the time. However,
> sometimes it accepts my username and password and then says "opening
> inbox" for ever. Usually, while this is in progress a message will flash
> up about an untagged response. However, it is not visible long enough to
> see what the message is about. In any event it will never open the
> inbox.

While I don't think that it's directly relevant to the problem that you
are having here, I have seen this before with Linux servers.

If you restart syslog on a Redhat Linux box, existing processes which have
already opened syslog break down. Messages which would have gone to the
syslog go to stderr (if I remember correctly), which happens to be the
socket attached to the IMAP client. Much hilarity ensues.

--
David Carter
Re: High availability ... again
On Tue, 22 Jun 2004, Etienne Goyer wrote:

> Does somebody on the list use this solution or a similar one and could
> comment on the practicality of it? Perhaps Mr. Carter (if you read the
> list) could give us a status update for his particular project?

There's really not a whole lot to say. We've been using the code on our main 32k user mail system since about this time last year for data migration, fast incremental backup to a tape spooling system, and rolling replication for live updates. We also used the replication system to migrate from a UW based system to Cyrus.

We have 16 small Linux servers running as 8 pairs. All the systems are live Cyrus servers; half the accounts on each system are replica versions. One of the 16 had a hardware fault a couple of weeks back and no one has moaned at me after we switched to the replica, which is always a good sign.

From my perspective the advantage of application level replication over block level replication like DRBD is flexibility. Read/write access to both master and replica systems can be useful: we maintain databases of MD5 checksums for all the messages and cache entries on each server. It's also rather cute to run PINE against both master and replica versions of a given mailbox and watch the replica play follow my leader :).

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: High availability ... again
On Wed, 23 Jun 2004, Kevin Baker wrote:

> Is it something like this:
>
> - Server A
>   - active accounts 1-100
>   - replicate accounts 101-200 from Server B
> - Server B
>   - active accounts 101-200
>   - replicate accounts 1-100 from Server A
>
> If B goes down, A takes over the accounts it had replicated from B.

Yes, precisely that.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
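The paired layout confirmed in the message above can be sketched as a small lookup table. This is an illustration only: the server names, the account ranges and the `active_server` helper are hypothetical and are not part of Cyrus or of the replication code discussed in this thread.

```python
# Hypothetical sketch of a two-server master/replica pairing.
# Each account range has one master and one replica; all names are illustrative.
PAIRING = {
    range(1, 101): ("A", "B"),    # accounts 1-100: master on A, replica on B
    range(101, 201): ("B", "A"),  # accounts 101-200: master on B, replica on A
}


def active_server(account, down=frozenset()):
    """Return the server that should handle `account`, given the failed servers."""
    for accounts, (master, replica) in PAIRING.items():
        if account in accounts:
            if master not in down:
                return master
            if replica not in down:
                return replica
            raise RuntimeError("both copies unavailable")
    raise KeyError(account)
```

With both servers up, account 150 is served by B; if B is listed in `down`, the same lookup returns A, which is the takeover described in the question.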
Re: Paying for developers?
On Tue, 14 Sep 2004, Attila Nagy wrote:

> http://www-uxsup.csx.cam.ac.uk/~dpc22/cyrus/replication.html
> There are some problems with that:
> - the code isn't available on that webpage

No, but the code is available to people who want to play with it, on the understanding that they get no sympathy from me if they try to run it on a production server right now.

> - it changes the mailstore layout, so you cut yourself off if you use
>   that instead of the mainstream version

The incompatible change is actually just a single 96 bit value per message in the cyrus index file (a message UUID value, used to replicate the single instance store). If a future UUID format was agreed and space was reserved in current index files, the incompatibility would disappear. That might be a path to more widespread testing.

> - I guess it is for an older Cyrus, so you cannot easily upgrade

I passed a patch relative to 2.3 CVS on to Rob a few months back. The replication code is largely orthogonal to the existing code: it only took me a couple of hours to generate the patch from my existing 2.1.16 code.

> I cannot say anything about its architectural problems, if there are
> any at all.

I consider the code to be a prototype of the "obvious" way to do application level replication in Cyrus. It works fine for us, but would clearly require a careful audit before going into more widespread use. Support for a number of things is missing simply because we have no need for them right now: seen state handling for shared mail folders, quota roots other than user., and in 2.2+ mailbox annotation and virtual domains spring to mind. I don't think that any of these things would be particularly hard to do; it's just a Small Matter of Programming.

I would estimate that I've put in around 3 to 4 months' work on the current code and that we would be talking about (at least!) several more man-months of work between myself and the Cyrus developers to get something properly merged. That's a fairly substantial undertaking for all involved, particularly given that we all have other priorities.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Cyrus crashed on redundant platform - need better availability?
On Wed, 15 Sep 2004, Paul Dekkers wrote:

> On the other hand, if there is an application level redundancy on its
> way, it doesn't really matter on what platform the machine runs, so it
> would still make me happier and even with FreeBSD. And I would rather
> put my money there. Even if it means we'll have to wait for some months,

I wouldn't hold out hope of anything being available in "some months". I wrote my replication code two years ago, and submitted it to Rob and Ken about this time last year. Neither I nor they have put any significant work into the code since then. As I indicated in my previous message, we all have other priorities right now.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: IOERROR Cannot create anymore mailboxes
On Wed, 15 Sep 2004, Boyle, Bernadette wrote:

> I have a terrible problem at the moment at the university with new
> students not being able to receive mail because I cannot create any
> more accounts.

What operating system/filesystem?

> I currently have 32,765 directories in this account and it seems like
> perhaps I have reached the quota. Is there any way to stop this?

That number sounds suspiciously close to 32,768 (2**15), particularly if you add 2 for '.' and '..'. Sounds like you've hit a hard limit on the number of files allowed in a directory. Afraid that the only likely solution will be a different filesystem or operating system.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
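The arithmetic behind the "suspiciously close" remark is easy to check. The sketch below only adds up the numbers quoted in the message; reading the result as a signed 16-bit limit is an assumption, since the thread never names the filesystem.

```python
# Figures taken from the message above.
subdirectories = 32765   # directories reported in the question
dot_entries = 2          # '.' and '..'

total = subdirectories + dot_entries
limit = 2 ** 15          # 32768

print(total)             # 32767
print(limit - total)     # 1
```

The total comes to 2**15 - 1, the largest value a signed 16-bit counter can hold, which would be consistent with a per-directory limit of that size.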
Re: Funding Cyrus High Availability
On Fri, 17 Sep 2004, Paul Dekkers wrote:

> Isn't it possible to have equal roles? If all changes are put in some
> backlog, and a synchroniser process runs on both machines and pushes the
> backlog (as soon as there is any) to another machine... then you can
> have the same process on both (equal) servers... Of course there needs
> to be some more intelligence, but that's basically what I would expect.

We have 16 servers: half the accounts on each system are master copies and half are replicas. Each machine has a small database (a CDB lookup file) to tell it whether a given account is master or slave. The replication engine (which runs independently from the normal master spawned jobs) bails out rapidly if the replica copy of an account is updated: it would proceed to transform the master into a copy of the replica, but that's probably not what you wanted :).

I have a tool which allows me to switch the master and replica copy for any (inactive) account without having to shut anything down. This tool also lets me migrate data off onto a third system and immediately create a replica of that. This makes upgrading operating systems a much less fraught task.

> In my sketch above (really not sure if it works of course) where both
> have something like a backlog you can like "tail" that backlog and push
> the update as soon as possible to the second machine. You solve the
> thing you mention with delays while pushing updates to two servers at
> the same time.

Yes, that's exactly how my code works. Asynchronous replication (which Ken called lazy replication) is fairly easy to do in Cyrus. Synchronous replication, where you only get a response to an IMAP/POP/LMTP command when the data is safely committed to the replica, would involve a much more substantial rewrite of the Cyrus code. That's where block based replication schemes like DRBD have a big advantage: the state that they have to track is much less involved.

I'm currently running with a replication cycle of one second on my live servers for "rolling" replication (that's just a name I made up, it's not an official term), so on average we would lose half a second of update traffic for 1/16th of our user base if a single system failed. Further safeguards are possible by keeping copies of incoming mail for a short time on the MTA systems, but that's not really a Cyrus concern.

We also replicate to a tape backup spooling engine overnight. The replication engine is rather useful for fast incremental updates.

> If one server is down it should mean that all tasks can be performed at
> the other one. I'm curious how this would look if both servers are
> still running but cannot reach each other. If there is indeed a UUID:
> what if there are doubles... but I guess that has been taken into
> account.

UUIDs are just a convenient representation of message text, so that you can pass messages by reference rather than value. Duplicates don't matter (though I don't believe that they actually occur given my allocation scheme) so long as the message text is the same. I maintain databases of MD5 checksums for messages and cache text just to be on the safe side.

UUIDs were originally just Mailbox UniqueID + Message UID. Unfortunately, UniqueID isn't very unique: it's just a simple hash of the mailbox name. I ended up allocating UUIDs in large chunks from the master process on each machine. If a process runs out of UUIDs (which would take some going as they are allocated in chunks of 2**24), it falls back to call by value.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
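The chunked UUID allocation described at the end of the message above, with its fallback to call by value, might look roughly like this. It is a sketch under stated assumptions: the `UuidChunk` class, the `(prefix, index)` representation and `send_message` are invented for illustration, and the real 96 bit layout is not described in the thread.

```python
CHUNK_SIZE = 2 ** 24  # chunk size mentioned in the message


class UuidChunk:
    """A block of UUIDs handed to one worker process (hypothetical layout)."""

    def __init__(self, prefix):
        self.prefix = prefix   # assumed: a value issued by the master process
        self.next_index = 0

    def allocate(self):
        """Return (prefix, index), or None once the chunk is exhausted."""
        if self.next_index >= CHUNK_SIZE:
            return None
        uuid = (self.prefix, self.next_index)
        self.next_index += 1
        return uuid


def send_message(chunk, text):
    """Pass a message by reference when a UUID is available, else by value."""
    uuid = chunk.allocate()
    if uuid is None:
        return ("by-value", text)      # fallback: ship the full message text
    return ("by-reference", uuid)      # replica can match an existing copy
```

The point of the design is that running out of identifiers is never fatal: the worst case is a less efficient transfer of the full text.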
Re: Funding Cyrus High Availability
On Fri, 17 Sep 2004, Jure Pečar wrote:

> So how does this "cyrus in a raid view" sound? It should probably be
> called "raims" for redundant array of inexpensive mail servers anyway ;)

We call it RAIN: Redundant Array of Inexpensive Nodes. Really cheap Intel servers in our case :)

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Funding Cyrus High Availability
On Fri, 17 Sep 2004, Ken Murchison wrote:

> Actually what I was really asking is, are people looking for an
> active-active config or an active-passive config?

I'm not sure that IMAP is amenable to active-active: the prevalence of UIDs in the protocol means that it would be very hard to resolve the inconsistencies that would occur if a pair of machines ever lost touch. I would be happy to be proved wrong: active-active is clearly better from a system administrator perspective :).

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Funding Cyrus High Availability
On Sun, 19 Sep 2004, David Lang wrote:

> 5. Active/Active
>
> designate one of the boxes as primary and identify all items in the
> datastore that absolutely must not be subject to race conditions between
> the two boxes (message UUID for example). In addition to implementing
> the replication needed for #1 modify all functions that need to update
> these critical pieces of data to update them on the master and let the
> master update the other box.

We may be talking at cross purposes (and it's entirely likely that I've got the wrong end of the stick!), but I consider active-active to be the case where there is no primary: users can make changes to either system, and if the two systems lose touch with each other they have to resolve their differences when contact is reestablished.

UUIDs aren't a problem (each machine in a cluster owns its own fraction of the address space). Message UIDs are a big problem. I guess in the case of conflict, you could bump the UIDvalidity value on a mailbox and reassign UIDs for all the messages, using timestamps to determine the eventual ordering of messages. Now that I think about it, maybe that's not a totally absurd idea. It would involve a lot of work though.

> Pro: best use of available hardware as the load is split almost evenly
> between the boxes. best availability because if there is a failure half
> of the clients won't see it at all

Actually this is what I do right now by having two live mailstores. Half the mailboxes on each system are active, the remainder are passive.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
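The conflict-resolution idea floated in the message above (bump UIDvalidity, then reassign UIDs in timestamp order) can be sketched as follows. Nothing here comes from Cyrus: `merge_after_split` and the `(timestamp, uuid)` message tuples are hypothetical, and the sketch ignores flags, seen state and the race conditions an actual implementation would face.

```python
def merge_after_split(uidvalidity_a, uidvalidity_b, messages_a, messages_b):
    """Merge two diverged copies of one mailbox.

    Each message is a (timestamp, uuid) tuple. Returns a new UIDVALIDITY
    and a list of (new_uid, uuid) pairs ordered by timestamp.
    """
    # De-duplicate by UUID, keeping the earliest timestamp seen for each.
    merged = {}
    for timestamp, uuid in list(messages_a) + list(messages_b):
        if uuid not in merged or timestamp < merged[uuid]:
            merged[uuid] = timestamp

    # Order by timestamp, breaking ties on UUID so the result is deterministic.
    ordered = sorted(merged.items(), key=lambda item: (item[1], item[0]))

    # The new value must exceed both old ones so clients discard cached UIDs.
    new_uidvalidity = max(uidvalidity_a, uidvalidity_b) + 1
    new_uids = [(uid, uuid) for uid, (uuid, _) in enumerate(ordered, start=1)]
    return new_uidvalidity, new_uids
```

Because the UIDVALIDITY value changes, every client has to resynchronise the mailbox from scratch, which is part of why this is a last resort rather than a routine operation.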
Re: Funding Cyrus High Availability
On Sun, 19 Sep 2004, David Lang wrote: here is the problem. you have a new message created on both servers at the same time. how do you allocate the UID without any possibility of stepping on each other? With a new UIDvalidity you can choose any ordering you like. Of course one of the two servers has to make that choice, and the potential for race conditions here and elsewhere in an active-active solution is amusing. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH. --- Cyrus Home Page: http://asg.web.cmu.edu/cyrus Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Funding Cyrus High Availability
On Sun, 19 Sep 2004, David Lang wrote:

> assuming that the simplest method would cost ~$3000 to code I would make
> a wild guess that the ballpark figures would be
>
> 1. active/passive without automatic failover $3k
> 2. active/passive with automatic failover (limited to two nodes or
>    within a murder cluster) $4k
> 3. active/passive with updates pushed to the master $5k
> 4. #3 with auto failover (failover not limited to two nodes or a single
>    murder cluster) $7k
> 5. active/active (limited to a single geographic location) $10k
> 6. active/active/active (no limits) $30k
>
> in addition to automatically re-merge things after a split-brain has
> happened would probably be another $5k

I think that you are missing a zero (or at least a fairly substantial multiplier!) from 5.

1 -> 4 can be done without substantial changes to the Cyrus core code, and Ken would be able to use my code as a reference implementation, even if he wanted to recode everything from scratch. 5 and 6 would require a much more substantial redesign and I suspect quite a lot of trial and error, as this is unexplored territory for IMAP servers.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Funding Cyrus High Availability
On Mon, 20 Sep 2004, David Lang wrote:

> Thanks, this is exactly the type of feedback that I was hoping to get.
> so you are saying that #5 is more like $50k-100k and #6 goes up from
> there

If anyone could implement Active-Active for Cyrus from scratch in 100 to 150 hours it would be Ken, but I think that it's a tall order. Sorry.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Restored mailboxes causes replication to bail
On Thu, 23 Mar 2006, Roland Pope wrote:

> When I try to replicate a user which has such a 'Recovered' mailbox, the
> replication client bails out because it seems to treat the
> 'user.xxx.Recovered' mailbox as the user.xxx INBOX and tries to do a
> rename (which fails).

It sounds as if you have ended up with two mailboxes with the same UniqueID. This will confuse the replication code, which tracks mailboxes by UniqueID rather than by name in order to implement rename. If you delete the cyrus.header file and run reconstruct it should generate a new UniqueID.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: sync_client stop working suddenly
On Wed, 29 Mar 2006, Patrice wrote: what is exactly the 7 signal ? That will depend on the platform that you are using. Signal 7 is SIGBUS (a bus error) on Linux: http://en.wikipedia.org/wiki/SIGBUS. If both sync_client and imapd are affected on a single system, then this would suggest a problem with a library shared between the two or (just conceivably) a subtle hardware problem. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH. Cyrus Home Page: http://asg.web.cmu.edu/cyrus Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
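Since the meaning of a signal number depends on the platform, one quick way to check it on the machine in question is to ask the local C library's numbering through Python's standard `signal` module. The `describe_signal` helper is just an illustrative wrapper, and the result reflects the system the script runs on, not necessarily the server that logged the error.

```python
import signal


def describe_signal(number):
    """Return this platform's name for a signal number, or 'unknown'."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return "unknown"


print(describe_signal(7))
```

Running this on the affected server itself avoids guessing from documentation written for a different operating system.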
Re: 2.3 Replication and lost Hardlinks
On Fri, 31 Mar 2006, Roland Pope wrote:

> It would appear from my testing of the new 2.3 Replication code, that
> you lose any 'SingleInstanceStore' benefits on the replica as hardlinks
> on the master cannot be reproduced on the replica.

This is what message UUIDs are for. I've been replicating my single instance message stores quite happily for about 3 years now. I don't use Cyrus 2.3, but here is the relevant section from the install-replication document that Ken wrote:

  Universally Unique Identifiers (UUIDs)

  An optional, but recommended step is to enable UUIDs for messages. Use of UUIDs improves efficiency by eliminating the synchronization of messages which the "replica" has already received from the "master". Note that UUIDs can be safely enabled and disabled at any time.

  1. Define the sync_machineid option in imapd.conf. This option specifies the numeric identifier (1 - 255) of the "master" machine which is used in constructing the UUID for each message on the server. This identifier MUST be unique across all active "backend" servers in a Murder. Example:

       sync_machineid: 1

  2. For each IMAP, NNTP and LMTP service in cyrus.conf, enable the provide_uuid argument. Example:

       imap cmd="imapd" listen="imap" prefork=5 provide_uuid=1

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Replication and Virtual Domains
On Mon, 15 May 2006, Ken Murchison wrote:

> replication and virtdomains should work together, but currently don't.
> IIRC, David's code was written against Cyrus 2.1, which was before
> virtdomains.

Yes, that's correct. I never even thought about virtdomains (or altnamespace) when I ported the code. It's on my TODO list, but I can't give you a timetable.

altnamespace shouldn't be a problem (we use it). The replication code works entirely in the internal namespace.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Errors while doing Replication
On Tue, 13 Jun 2006, David Korpiewski wrote:

> I'm just about to do some major testing with cyrus and I've been having
> a terrible time with replication throwing up these types of errors
> randomly. I don't understand what this "promoting" and "DELETE" are
> trying to do. I definitely don't want to delete the user.cyrus under any
> condition, so why is it trying to delete it?! How can I fix these errors
> oh great gurus of replication?
>
> The reason that the user.cyrus is even showing is that in postfix, I
> tell it to "always_bcc" the cyrus user. This way I have a running queue
> of all of the messages that I have received on this server. This is so
> if we failover and I have to use the replica, I can bounce the email
> that may not have been processed by the sync process when I bring the
> master back up.
>
> Jun 13 16:33:02 lmc1 sync_client[16188]: SEEN davidk user.davidk
> Jun 13 16:33:02 lmc1 sync_client[16188]: MAILBOX user.cyrus
> Jun 13 16:33:02 lmc1 sync_client[16188]: DELETE received NO response:

This suggests that the Mailbox UniqueID on the replica does not match the mailbox UniqueID on the master. The replication system wants to delete the old version so that it can update the new mailbox. Check the cyrus.header files.

> Failed to delete user.cyrus: Operation is not supported on mailbox

Looking at mboxlist_deletemailbox() there is a specific test to stop the currently logged in user from deleting their own inbox. I infer that you are running replication as user cyrus. I would suggest that you "always_bcc" to some other mailbox, preferably not an admin user.

> Jun 13 16:33:02 lmc1 sync_client[16188]: Promoting: MAILBOX user.cyrus -> USER cyrus

This is just a consequence of the delete operation failing. The replication engine retries before giving up completely.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Cyrus 2.3.7 Replication Question
On Thu, 13 Jul 2006, Robert Mueller wrote:

> I think that should do it. There might be another option as well to make
> this easier. From the top:
>
> 1. Server A is master (sync_client) replicating to Server B (sync_server)
> 2. Server A dies/is stopped
> 3. Restart Server B after adding this to the imapd.conf
>
>      sync_log: 1
>
> 4. All IMAP/POP/LMTP connections are directed to Server B
>
> Now Server B should be logging all changes to its sync log, so you
> don't have to sync_client -u all users. Then to change back, follow from
> step 6-11 above.

It should be possible to leave the "sync_log: 1" enabled on all master and replica systems. The IMAP/POP/LMTP processes will only start to log events when a system becomes a master.

My original code allows a mixture of master and replica mailboxes on each Cyrus backend system (with a Perdition-like proxy sitting in front to direct logins to specific backend servers). This makes it possible to push inactive accounts back and forth without any downtime. It doesn't work with shared mailboxes, which is why Cyrus 2.3 only supports simple master-replica pairs.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Migration form UW with replication
On Thu, 13 Jul 2006, Michael Menge wrote:

> http://www-uxsup.csx.cam.ac.uk/~dpc22/cyrus/replication.html says that
> replication can be used to migrate from UW-Imapd to Cyrus. Is there any
> Documentation/Howto on how to use this feature?

That part wasn't merged into Cyrus 2.3. It was very specific to our environment.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: purge sync. replication folder
On Wed, 12 Jul 2006, Patrice wrote:

> I have seen that the sync. folder on my replica contains more than 1 GB
> of files. Can I delete all of its contents, or must I keep the current
> day with its cache file? Can I do it while the replica is running, or
> does it have to be stopped?

This is equivalent to the stage. area for mail delivery, with a subdirectory for each active sync_server process. Active sync./ directories will clear out automatically when the cache file reaches a certain size. Recent 2.3.x releases clear the directory much more aggressively because of a nasty memory leak which appeared when support for partitions was added.

Old sync./ directories which don't correspond to any running sync_server can be cleared out safely by hand on a running system.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: machine mismatch UUID
On Thu, 27 Jul 2006, Fabio Rossetti wrote: The old master used UUIDs ( sync_machineid:1) but using the same configuration on the ex-replica gave this in syslog: Jul 27 00:00:00 exreplica master[23136]: Machine mismatch: 1 != 2 Jul 27 00:00:00 exreplica master[23136]: Couldn't initialise UUID subsystem This means that the sync_machineid setting didn't match the machine=X line in /var/imap-hermes/master_uuid. This file should be created automatically the first time that you run Cyrus with sync_machineid set. Message UUIDs are seeded from sync_machineid and the time at which master starts. These values are recorded in /var/imap-hermes/master_uuid as a sanity check in case the clock on a system is behaving erratically. There is also a timestamp_generation field which can be used to sort things out if a system clock has been playing up, although I've never had to use it. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH. Cyrus Home Page: http://asg.web.cmu.edu/cyrus Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
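The sanity check described in the message above amounts to comparing the configured sync_machineid with the machine=X line recorded in the master_uuid file. The sketch below assumes the file holds simple key=value lines; only the machine=X line is confirmed by the message, so the parsing and the `machine_id_matches` helper are illustrative guesses, not the actual Cyrus code.

```python
def machine_id_matches(master_uuid_text, sync_machineid):
    """Compare the recorded machine id with the configured one.

    `master_uuid_text` is assumed to contain key=value lines, one of
    which is machine=X. Returns True only if that line exists and
    its value equals `sync_machineid`.
    """
    recorded = None
    for line in master_uuid_text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key == "machine":
            recorded = int(value)
    return recorded is not None and recorded == sync_machineid
```

In the situation from the question, a file recorded on one machine id being read by a server configured with a different sync_machineid would make this check fail, which matches the "Machine mismatch" symptom.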
Re: Replication woes with a specific mailbox...
On Fri, 28 Jul 2006, Pascal Gienger wrote: sync_client seems to try to RENAME the Inbox to the subfolder "Uni" which is complete nonsense. Do the mailboxes have the same UniqueID (see cyrus.header files)? The replication engine expects UniqueID to be unique. Cyrus makes a bit of a hash of renaming user inboxes (user.XXX -> user.XXX.Uni). Removing the cyrus.header file and running reconstruct should fix the problem. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH. Cyrus Home Page: http://asg.web.cmu.edu/cyrus Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Another sync-client issue
On Sun, 27 Aug 2006, Bron Gondwana wrote: This is just a "copy-and-paste"o that I noticed while looking for the other issue in the sync code tonight. I think the fix is pretty self evident since exactly the same comment exists elsewhere with the correct error code after it and the value that's there now duplicates a test just above and is an unreachable path. Agreed. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH. Cyrus Home Page: http://asg.web.cmu.edu/cyrus Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: sync_client bails out after 3 MAILBOXES need upgrading to USER in one run
On Sun, 27 Aug 2006, Bron Gondwana wrote:

> I've attached my trivial solution (against CVS of last week some time),
> but I'm thinking a better (as in, less wasteful) solution might be to
> not return an error at all for a failed mailbox, but instead keep
> walking the entire tree, and then generate a "USER" event for every
> mailbox that hasn't been marked yet.

My original code (which we are still running: I'm not in any hurry to upgrade to 2.3) sorts mailbox actions by user. If a single mailbox action associated with a user fails, the rest are discarded and a USER event is generated. If the USER event fails it locks the given user out of the mboxlist and tries again. This is close to what you describe above.

From memory the 3 retries thing was introduced to cope with transient problems on shared mailboxes, caused by mailboxes moving around under the replication engine's feet. No promotion is possible in this case.

> Ken and David - is there a reason why you chose to pass a single
> "MAILBOXES" command with multiple mailboxes to the backend rather than
> single mailbox commands? The little birdy in my head is whispering (it
> does that at 1am after many hours of debugging) that it has something to
> do with supporting renames.

Rename and copying messages between mailboxes. With single mailbox commands RENAME becomes DELETE + CREATE/UPLOAD (which would work, but would be a pain if a GByte mailbox was involved). COPY would upload new messages rather than reusing the single instance store on the replica.

-- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
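The "sort by user, promote on failure" behaviour described in the message above can be sketched like this. It is an illustration under assumptions: `user_of`, `replicate` and the two callables are hypothetical stand-ins, the mailbox names use a simplified internal namespace, and the lock-and-retry step for a failed USER event is left out.

```python
from collections import defaultdict


def user_of(mailbox):
    """Owner of a mailbox name such as user.fred.Sent, or None if shared."""
    parts = mailbox.split(".")
    return parts[1] if len(parts) > 1 and parts[0] == "user" else None


def replicate(mailboxes, sync_mailbox, sync_user):
    """Group mailbox actions by user; promote to a USER action on failure.

    `sync_mailbox` and `sync_user` are caller-supplied callables that
    return True on success. Returns the list of promoted users.
    """
    by_user = defaultdict(list)
    for mailbox in mailboxes:
        by_user[user_of(mailbox)].append(mailbox)

    promoted = []
    for user, boxes in by_user.items():
        for mailbox in boxes:
            if not sync_mailbox(mailbox):
                # Discard the rest of this user's mailbox actions.
                if user is not None:   # shared mailboxes cannot be promoted
                    sync_user(user)
                    promoted.append(user)
                break
    return promoted
```

A failure for one user leaves every other user's mailbox actions untouched, which is the property that keeps a single bad mailbox from stalling the whole run.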
Re: sync_client stalls the rest of cyrus while 'no route to host'
On Sun, 27 Aug 2006, Bron Gondwana wrote: To tell you the truth, I'm seriously considering writing a replacement to sync_client that does a bunch of different things including multiple replicas, maintaining log files, etc. All of this drops out pretty easily from a pattern which produces a single log file per day and calls the sync_client fork children with a byte-range on the log file to run rather than moving the file and then running the copy. I think that you would be better off with multiple log files and multiple sync_client processes, one for each replica. That way each replication stream is independent and can progress at its own best speed. Particularly important if a replica dies (or is shut down for routine maintenance) and needs to catch up from a big backlog of transactions. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH. Cyrus Home Page: http://asg.web.cmu.edu/cyrus Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
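The suggestion in the message above, one log and one sync_client per replica, boils down to giving each replica its own backlog so that a slow or dead replica never holds up the others. The sketch below models that with in-memory queues; `ReplicaStream` and `log_event` are hypothetical names, and real log files and processes are replaced by a deque and a callback.

```python
from collections import deque


class ReplicaStream:
    """One independent backlog per replica (illustrative only)."""

    def __init__(self, name):
        self.name = name
        self.backlog = deque()
        self.online = True

    def drain(self, send):
        """Send queued events while the replica is reachable; return the count."""
        sent = 0
        while self.online and self.backlog:
            send(self.name, self.backlog.popleft())
            sent += 1
        return sent


def log_event(streams, event):
    """Append an event to every replica's own backlog."""
    for stream in streams:
        stream.backlog.append(event)
```

If one stream is offline its backlog simply grows, and it catches up at its own pace once it comes back, while the other streams keep draining normally.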
Re: sync client bailing out
On Tue, 12 Sep 2006, F. Rossetti wrote:

> I am running two cyrus 2.3.7 servers with replication. The replica server crashed, and I fsck'd the /var/imap/* and /var/spool/imap/* partition finding some errors. Now everything seems to have come back up but the replication still has problems. The client process on the master bails out frequently with these errors in the log:

It looks like you have some corruption in cyrus.index files on the replica following the crash. Try running reconstruct on the mailboxes in question.

-- David Carter
Re: sync_server "memory leak" with giant new mailbox first sync
On Sun, 10 Sep 2006, Wesley Craig wrote:

> My solution (such as it is) was to reduce the wasteful amount of space sync_server was allocating per message:
>
> [...]
>
> The times-5 is completely gratuitous. In fact the pre-allocation of any memory for paths is wasteful, but I was not up for reengineering the memory scheme in sync_server at the time.

This started out life as:

  sprintf(tmp, "%lu.", l->count);
  result->msg_path = xmalloc(l->stage_dir_len+strlen(tmp)+2);

At around 35 bytes a shot, you get rather more of these to the MByte.

If I recall correctly, the "5 * (MAX_MAILBOX_PATH+1)" was put in to support partitions. A lookup table for partitions and two integers (one for the partition number, one for the message number on that partition) should be all that is needed to reconstruct the paths at a later date.

-- David Carter
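A rough Python sketch of the "lookup table plus two integers" suggestion follows. It is not the sync_server data structure: `MessageRef`, `PARTITION_STAGE_DIRS` and the directory names are hypothetical, and only the `"<number>."` file name format is taken from the `sprintf` call quoted above.

```python
from typing import NamedTuple


class MessageRef(NamedTuple):
    partition: int  # index into the partition table
    number: int     # message number within that partition's staging area


# Hypothetical staging directories, one entry per partition.
PARTITION_STAGE_DIRS = ["/spool/part0/stage", "/spool/part1/stage"]


def message_path(ref):
    # Rebuild the path on demand instead of storing a string per message.
    return "%s/%d." % (PARTITION_STAGE_DIRS[ref.partition], ref.number)


example = message_path(MessageRef(partition=1, number=42))
```

Each message then costs two small integers, and the directory strings are stored once per partition rather than once per message.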
Re: sync_client bails out after 3 MAILBOXES need upgrading to USER in one run
On Tue, 12 Sep 2006, Wesley Craig wrote:

> On a related note, what was the problem with accepting the Cambridge patches for delayed folder deletion? I'm interested in working on getting that or similar code accepted. Now that we have delayed expunge for messages, we continue to run tape backups only for the case where users inadvertently delete folders.

Replication and delayed expunge were added by Ken (working as a contractor) back before he joined CMU. Replication was sponsored by Columbia. Delayed expunge was sponsored by Fastmail, but mostly as a performance enhancement. Unexpunge was just a nice side effect.

I believe that Ken implemented the delayed expunge from scratch. My original "two phase expunge" code is rather more involved:

  1) Users can access expunged mail and deleted mailboxes using magic mailbox hierarchies (.EXPUNGED/ and .DELETED/).

  2) It hooks into the quota system to record the amount of expunged space in each quota root. Messages are automatically expired when global or per quota root limits are reached.

With hindsight (1) was a daft idea on my part. Our users struggle with the idea of multiple mailboxes in their account, let alone magic mailbox hierarchies.

(2) is arguably useful if you don't have infinite storage on your IMAP backends. There are however lots of other spool partitions which can fill up under a determined denial of service attack.

Without (1) or (2), delayed mailbox deletion is really nothing more exciting than a RENAME operation to some part of the mailbox hierarchy without a quota root that only the system administrator can access.

-- David Carter
Re: replication, how to see if it 'up to date'
On Thu, 28 Sep 2006, Rudy Gevaert wrote:

> Does anybody know how to see if the sync replication is up to date? All suggestions are welcome.

I replicate all users from all masters to replicas about once a month and check the output for any inconsistencies. Something like:

  cyrus-23[cyrus:cyrus]$ replicate -s cyrus-24 -v -v -u dpc22
  USER dpc22
  USER_ALL dpc22
  ENDUSER

where "replicate" is just a little wrapper around sync_client.

We also maintain databases of MD5 checksums for messages and cache entries, generated by make_md5.

-- David Carter
Re: performance on large inboxes
On Wed, 8 Nov 2006, Phil Pennock wrote:

> The relevant stuff is HERMES_CACHE_MOST in mailbox.c; I've really no idea whether or not these changes are roughly independent and if they can be pulled out.

That was merged a long time back. doc/text/changes:

  Changes to the Cyrus IMAP Server since 2.2.1

  * Significantly improved message header caching (based in large part on code supplied by David Carter <[EMAIL PROTECTED]> from the University of Cambridge)

-- David Carter
Re: performance on large inboxes
On Thu, 9 Nov 2006, Marten Lehmann wrote:

>> That was merged a long time back. doc/text/changes:
>
> is it enabled by default? Or do I have to specify which headers in particular shall be cached?

It is enabled by default, but only applies to messages delivered after you started to run 2.2.1 or later (unless you reconstruct). Only a subset of X-Whatever headers are cached, but just about everything else is cached apart from the Received headers.

I should point out that the code in Cyrus is a significant improvement on my original version, courtesy of Rob Siemborski. The version in Cyrus makes it easy to add headers to the list which is cached.

-- David Carter
Re: Does the quota include deleted but not yet expunged mails in v2.3 with delayed expunge?
On Thu, 9 Nov 2006, Farzad FARID wrote:

> I'm running Cyrus Imapd 2.3.7 with the delayed expunge mode. Do the messages deleted by the user, but not yet expunged by the system, count in the user's quota? I'd say yes but I'd like a confirmation.

Yes. \Deleted is just another flag on messages.

-- David Carter
Re: Does the quota include deleted but not yet expunged mails in v2.3 with delayed expunge?
On Thu, 9 Nov 2006, David Carter wrote:

> On Thu, 9 Nov 2006, Farzad FARID wrote:
>
>> I'm running Cyrus Imapd 2.3.7 with the delayed expunge mode. Do the messages deleted by the user, but not yet expunged by the system, count in the user's quota? I'd say yes but I'd like a confirmation.
>
> Yes. \Deleted is just another flag on messages.

Ah. If you meant expunged by the user but not yet expired by the system, then these don't count against the quota, just as Paul Engle suggests.

Sorry about any confusion.

-- David Carter
Re: namespace question
On Thu, 9 Nov 2006, Ivo Petrov wrote:

> I use alternate namespace in order to get representation of the folders to be as they are not subfolders of INBOX but instead to be on the same level as INBOX.

Altnamespace doesn't allow mailboxes under INBOX. The problem is the mapping between altnamespace and the internal Cyrus namespace. Consider:

  INBOX      <--> user.dpc22
  INBOX.foo  <--> user.dpc22.foo
  foo        <--> user.dpc22.foo

-- David Carter
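The mapping can be made concrete with a small Python sketch. This is a deliberately simplified model, not the Cyrus namespace code: `to_external` is a hypothetical helper that handles only a user's own mailboxes with "." as the separator, and ignores shared and other-user namespaces.

```python
def to_external(userid, internal, altnamespace):
    """Map an internal user.<id>... name to the name an IMAP client sees."""
    prefix = "user." + userid
    if internal == prefix:
        return "INBOX"
    if not internal.startswith(prefix + "."):
        raise ValueError("not a personal mailbox of %s: %s" % (userid, internal))
    sub = internal[len(prefix) + 1:]
    # Standard namespace hangs folders under INBOX; altnamespace puts them
    # at the same level as INBOX.
    return sub if altnamespace else "INBOX." + sub


standard_name = to_external("dpc22", "user.dpc22.foo", altnamespace=False)
alt_name = to_external("dpc22", "user.dpc22.foo", altnamespace=True)
```

The single internal name user.dpc22.foo appears as INBOX.foo in the standard namespace and as foo under altnamespace, so under altnamespace there is no distinct internal mailbox left over that could be presented as a child of INBOX.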
Re: How to tell POP3 daemon to mark messages as read?
On Fri, 3 Nov 2006, Georgy Goshin wrote:

> Is it possible to mark all messages downloaded via POP3 daemon as read so that next time user connected via IMAP could see which messages are new?

No, the POP daemon doesn't open the seen message database. There would be a fairly substantial performance hit if it did.

I believe (it's been some time now) that the UW server works the way that you want. The POP protocol doesn't have any concept of \Seen messages, so there isn't really a right or wrong way to do this.

-- David Carter
Re: load balancing at fastmail.fm
On Mon, 12 Feb 2007, Marten Lehmann wrote:

> what do you think about moving the mailspool to a central SAN storage shared via NFS and having several blades to manage the mmapped files like seen state, quota etc.?

Why do you need NFS? The whole point of a SAN is distributed access to storage after all :).

> So still only one server is responsible for a certain set of mailboxes, but these SAN boxes have nice backup and redundancy features which are hard to get with common servers

It depends how much you trust your SAN. Some of my colleagues who run a SAN have had no end of grief. At which point you are dependent on the abilities of the vendor to diagnose and fix problems.

It was this experience that encouraged me to try application level replication with lots of small servers in the first place. At least that way I can keep a close eye on what the various copies are up to.

A SAN doesn't protect you if your filesystem decides to explode: I believe that Fastmail have direct experience of this. Two independent copies of the data allow you to keep running a service for the hours that an fsck typically takes to complete with file per msg stores on large modern disks. It also means rather less stress if the fsck fails to complete.

I've heard horror stories about all the common Linux filesystems and I've personally watched fsck.ext3 (supposedly the safest option) unravel a filesystem, with thousands of entries left in lost+found.

ZFS looks nice.

-- David Carter
Re: load balancing at fastmail.fm
On Mon, 12 Feb 2007, Marten Lehmann wrote:

> because NFS is the only standard network file protocol. I don't want to load a proprietary driver into the kernel to access a SAN device.

Fair enough, although NFS is likely to be really rather slow compared to a block device which just happens to be accessed via a fibre channel link. I would be surprised if NFS worked given that it is only an approximation to a "real" Unix filesystem. Cyrus really hammers the filesystem.

>> I've heard horror stories about all the common Linux filesystems and I've personally watched fsck.ext3 (supposedly the safest option) unravel a filesystem, with thousands of entries left in lost+found.
>
> ext3 with journal? I have never experienced this.

It was in a RAID set which had had a dodgy disk, but there was a definite urk moment when I saw what fsck had done. Fortunately not critical data.

>> ZFS looks nice.
>
> Well, but you are on your own because this project for linux is pretty young.

I don't have any problem with OpenSolaris, though it would be a little amusing given that we moved from Solaris to Linux about 4 years back.

-- David Carter
Re: load balancing at fastmail.fm
On Mon, 12 Feb 2007, urgrue wrote:

> SAN really has nothing to do with replication. You have your data somewhere (local or external disks, local/ext raid, NAS, SAN, etc), and you've got your various replication options (file-level, block-level, via client, via server, etc).

I agree that storage and replication are orthogonal issues. However, if a lump of storage is no longer a single point of failure then you don't have to invest (or gamble) quite as much to make that storage perfect.

Software is rarely perfect, as the early history of replication in Cyrus 2.3 demonstrates. If the software isn't itself a single point of failure then it can at least be monitored and fixed. On which note I should pass my thanks to Bron Gondwana, Wes Craig and anyone else who has been working on replication there.

> None of these are a replacement for backups.

Absolutely, I agree. Enterprise storage and replication are both just strategies to reduce the frequency with which you need to resort to backup.

-- David Carter
Re: Potential replica message file corruption/replacement
On Fri, 16 Feb 2007, Bron Gondwana wrote:

> Looks innocent, doesn't it...

Mea culpa (and a definite "Argh, how did I miss _that_" when it was pointed out to me yesterday).

> I would advise anyone who has been using replication for any length of time to undertake an audit of the files on their replicas to ensure that none of them have been replaced by this, because if you need to "fail over" you could present users with emails that are not their own. A simple size check will find almost all cases, compare what the imapd returns for rfc822.size with the size of the file on disk. If you want to get fancy - compute the sha1 or similar of the file at each end and compare that.

This incident underlines the need for automated sanity checks. People shouldn't just blindly trust the replication system.

I generate (and constantly regenerate) checksums for message bodies and cache entries. On four occasions this has picked up oddities which in hindsight were obviously this bug.

-- David Carter
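A minimal Python sketch of the size-then-checksum audit described above. It assumes both copies of a message file are readable from one place (for example a replica spool mounted or copied alongside the master); the quoted advice compares against the RFC822.SIZE reported by imapd instead, which this sketch does not do. `sha1_of`, `audit_copy` and the file names are hypothetical.

```python
import hashlib
import os
import tempfile


def sha1_of(path, chunk_size=1 << 20):
    # Hash the file in chunks so large messages are not read into memory at once.
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_copy(master_path, replica_path):
    """Return None if the two files match, else a short reason string."""
    if os.path.getsize(master_path) != os.path.getsize(replica_path):
        return "size mismatch"          # cheap check, catches most cases
    if sha1_of(master_path) != sha1_of(replica_path):
        return "checksum mismatch"      # same size, different content
    return None


# Small demonstration with throwaway files.
with tempfile.TemporaryDirectory() as workdir:
    bodies = {
        "master": b"Subject: a\r\n\r\nhello\r\n",
        "good": b"Subject: a\r\n\r\nhello\r\n",
        "same_size": b"Subject: b\r\n\r\nhello\r\n",
        "short": b"Subject: a\r\n\r\n",
    }
    paths = {}
    for name, body in bodies.items():
        paths[name] = os.path.join(workdir, name)
        with open(paths[name], "wb") as handle:
            handle.write(body)
    good_result = audit_copy(paths["master"], paths["good"])
    same_size_result = audit_copy(paths["master"], paths["same_size"])
    short_result = audit_copy(paths["master"], paths["short"])
```

The size comparison runs first because it needs only a stat call; the checksum is computed only when the sizes agree.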
Re: single instance store and replication
On Thu, 1 Mar 2007, Jerome Nenert wrote:

> I'm using Cyrus replication. After several tests, it seems that the single instance store facility is not replicated. I mean, the same message sent to several recipients is stored once on the master, but stored several times on the slave. Is there a special thing to do to activate single instance store replication or it just doesn't exist yet?

Message UUIDs are used to replicate the single instance store (see docs/text/install-replication).

This won't have much effect when you first replicate a mailstore, as sync_server in 2.3 only tracks the last few thousand messages that have been uploaded. It becomes much more effective when a replica has been seeded and you switch to "rolling" replication.

-- David Carter
Re: FastMail.FM patchset - new patches
On Thu, 15 Mar 2007, Bron Gondwana wrote:

> * Make UUIDs work at all. The initialisation order of the UUID subsystem was wrong, so we had very few messages with a non-zero UUID.

message_uuid_master_init() should only be called after become_cyrus(). Every time that message_uuid_master_next_child() overflows, master needs to update MASTER_UUID_FILE.

Is MASTER_UUID_FILE owned by a user other than cyrus on your systems? I can't see any other obvious reason that moving message_uuid_master_init() up a few lines would help. service_create() is binding ports, which obviously has to happen as root, while masterconf_getsection() is just parsing master.conf.

I am probably missing something obvious here, but the current ordering works for me.

> * MD5 UUIDs - we've created a new scheme for UUID generation, of the format: 02[first 11 bytes of message file md5]. This allows some basic integrity checking of the file on disk, and is still plenty random. Also adds the non standard IMAP FETCHable items UUID, RFC822.MD5 (calculated on the fly), RFC822.FILESIZE (does a stat or looks at the MMAP result if something else needs it)

I don't think that this is safe. It is important that the UUIDs really are unique, which is the reason for the paranoia in message_uuid_master_init. The assertion:

> Is it safe? - we calculated that with one billion messages you have a one in 1 billion chance of a birthday collision (two random messages with the same UUID). They then have to get in the same MAILBOXES collection to sync_client to affect each other anyway.

isn't the case: UUIDs span all MAILBOXES and APPEND events until a restart. If a UUID appears in one event and then is referenced by a second event some minutes later, then the first message seen will be reused.

At the moment sync_client in 2.3 tracks the last few thousand messages by UUIDs. My original code tracked the last few hundred thousand messages (diminishing returns, but useful when seeding accounts). There is a much greater chance of collisions there than just comparing two messages.

-- David Carter
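The arithmetic behind the quoted "one in 1 billion" estimate can be checked with a short Python sketch. It assumes the "02" prefix is a single byte with value 0x02, so that the UUID is 12 bytes with 88 bits taken from the MD5 digest; that reading, and the helper names `md5_uuid` and `collision_probability`, are assumptions rather than anything taken from the FastMail patch.

```python
import hashlib


def md5_uuid(message_bytes):
    # One 0x02 prefix byte followed by the first 11 bytes of the MD5 digest:
    # 12 bytes (96 bits) in total, 88 of them taken from the hash.
    return b"\x02" + hashlib.md5(message_bytes).digest()[:11]


def collision_probability(messages, hash_bits=88):
    # Birthday approximation: p ~= n * (n - 1) / 2 / 2**bits.
    # Only meaningful while the result is much smaller than 1.
    return messages * (messages - 1) / 2 / 2 ** hash_bits


uuid = md5_uuid(b"Subject: test\r\n\r\nbody\r\n")
p = collision_probability(10 ** 9)
```

Under these assumptions `p` comes out at the order of 1e-9 for a billion messages, which is the same order of magnitude as the figure quoted in the thread. The estimate treats MD5 output as uniformly random and says nothing about deliberately constructed collisions.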
Re: FastMail.FM patchset - new patches
On Thu, 15 Mar 2007, Rob Mueller wrote:

> May not be true, but:
>
>> Is it safe? - we calculated that with one billion messages you have a one in 1 billion chance of a birthday collision (two random messages with the same UUID).
>
> Is true.

Fair enough. With hindsight I should probably have defined message UUIDs to be the full MD5 hash: 128 bits isn't that much worse than 96 bits per message. What is the CPU overhead like for calculating MD5 sums for everything on the fly?

UUIDs started out life as Mailbox UniqueID (64 bits) plus Message UID (32 bits), hence the size and rather unfortunate name. The hash algorithm used to generate mailbox uniqueIDs is a bit basic, which is why I switched to generating them on the fly from master.

-- David Carter
Re: FastMail.FM patchset - new patches
On Thu, 15 Mar 2007, John Capo wrote:

> masterconf_getsection() calls add_service() which inits the service structures with have_uuid but have_uuid is not set till after the services are initialized.

Ah, have_uuid is new in 2.3. That line definitely needs to move, but I think that the message_uuid_master_init() call should stay where it is.

-- David Carter

--- master/master.c-DIST 2007-03-16 08:45:56
+++ master/master.c 2007-03-16 08:46:13
@@ -1938,6 +1938,8 @@
     /* set signal handlers */
     sighandler_setup();

+    have_uuid = (config_getint(IMAPOPT_SYNC_MACHINEID) >= 0);
+
     /* initialize services */
     for (i = 0; i < nservices; i++) {
         service_create(&Services[i]);
@@ -1954,7 +1956,6 @@
         }
     }

-    have_uuid = (config_getint(IMAPOPT_SYNC_MACHINEID) >= 0);
     if (have_uuid && !message_uuid_master_init()) {
         syslog(LOG_ERR, "Couldn't initialise UUID subsystem");
         exit(EX_OSERR);
Re: FastMail.FM patchset - new patches
On Fri, 16 Mar 2007, David Carter wrote:

> Ah, have_uuid is new in 2.3. That line definitely needs to move, but I think that the message_uuid_master_init() call should stay where it is.

Or even (as per the fastmail patch).

-- David Carter

--- master/master.c-DIST 2007-03-16 08:45:56
+++ master/master.c 2007-03-16 09:49:01
@@ -1931,6 +1931,8 @@
     init_snmp("cyrusMaster");
 #endif

+    have_uuid = (config_getint(IMAPOPT_SYNC_MACHINEID) >= 0);
+
     masterconf_getsection("START", &add_start, NULL);
     masterconf_getsection("SERVICES", &add_service, NULL);
     masterconf_getsection("EVENTS", &add_event, NULL);
@@ -1954,7 +1956,6 @@
         }
     }

-    have_uuid = (config_getint(IMAPOPT_SYNC_MACHINEID) >= 0);
     if (have_uuid && !message_uuid_master_init()) {
         syslog(LOG_ERR, "Couldn't initialise UUID subsystem");
         exit(EX_OSERR);
Re: sync_client doesn't sync sieve scripts
On Mon, 26 Mar 2007, Dmitriy Kirhlarov wrote:

> I have properly configured sync between two cyrus-imapd 2.3.8 servers. Mailboxes rolling synchronization work good.

This also updates sieve scripts.

> Now I want to synchronized sieve scripts too.
>
> sync_client -v -s
> sync_client -v -s $username

-s is new in 2.3, but it looks like it was only there for testing. The manual page says:

  Principally used for debugging purposes: not exposed to

sync_client -u should replicate an entire user including the Sieve files.

-- David Carter
Re: Cyrus with a NFS storage. random DBERROR
On Sat, 9 Jun 2007, Rob Mueller wrote:

>> I run it directly, outside of master. That way when it crashes, it can be easily restarted. I have a script that checks that it's running, that the log file isn't too big, and that there are no log-PID files that are too old. If anything like that happens, it pages someone.
>
> Ditto, we do almost exactly the same thing.

And for that matter, so do I.

> I think there's certain race conditions that still need ironing out, because rerunning sync_client on the same log file that caused a bail out usually succeeds the second time.

I suspect that the problem is with mailbox renames, which are not atomic and can take some time to complete with very large mailboxes.

sync_client retries a number of times and then bails out.

  if (folder_list->count) {
      int n = 0;
      do {
          sleep(n*2); /* XXX should this be longer? */
          ...
      } while (r && (++n < SYNC_MAILBOX_RETRIES));

      if (r) goto bail;
  }

This was one of the most significant compromises that Ken had to make when integrating my code into 2.3.

My original code cheats, courtesy of two other patches:

HERMES_FAST_RENAME:
  Translates mailbox rename into filesystem rename() where possible. Useful because sync_client chdir()s into the working directory. Would be less useful in 2.3 with split metadata.

HERMES_SYNC_SNAPSHOT:
  If mailbox action fails, promote to user action (no shared mailboxes).
  If user action fails then lock user out of the mboxlist and try again.

Together with my version of delayed expunge this pretty much guarantees that things aren't moving around under sync_client's feet. It's been an awful long time (about a year?) since I last had a sync_client bail out.

We are moving to 2.3 over the summer (initially using my own original replication code), so this is something that I would like to sort out. Any suggestions?

-- David Carter
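The retry loop quoted above can be mirrored in Python to make its timing explicit. This is a sketch of the control flow only, not sync_client itself: `run_with_retries` is a hypothetical helper, the retry count of 3 is an assumed value for SYNC_MAILBOX_RETRIES, and the sleep function is injectable so the example runs instantly.

```python
import time

SYNC_MAILBOX_RETRIES = 3  # assumed value, for illustration only


def run_with_retries(action, retries=SYNC_MAILBOX_RETRIES, sleep=time.sleep):
    """Call action() until it returns 0 or the retry budget is spent.

    Mirrors: do { sleep(n*2); r = ...; } while (r && (++n < RETRIES));
    """
    n = 0
    while True:
        sleep(n * 2)  # waits 0s, 2s, 4s, ... before successive attempts
        r = action()
        n += 1
        if r == 0 or n >= retries:
            return r  # non-zero here corresponds to "goto bail"


# Demonstration: an action that fails twice (say, during a slow rename)
# and then succeeds, with a fake sleep that just records the delays.
delays = []
outcomes = iter([1, 1, 0])
result = run_with_retries(lambda: next(outcomes), sleep=delays.append)
```

With three attempts the total wait is only 0 + 2 + 4 seconds, so a rename of a very large mailbox that takes longer than that would still end in a bail out, which fits the "should this be longer?" comment in the original loop.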
Re: Cyrus with a NFS storage. random DBERROR
On Sun, 10 Jun 2007, Rob Mueller wrote:

> I can try and keep an eye on bailouts some more, and see if I can get some more details. It would be nice if there was some more logging about why the bail out code path was actually called!

It typically means that something deep in libimap has thrown an error. sync_client logs the only information that it has (the return code r).

It probably wouldn't hurt to try and log the current mailbox/user in some consistent fashion.

-- David Carter
Re: Replication question - cross replication?
On Thu, 14 Jun 2007, Nels Lindquist wrote:

> I'm setting up a high-availability mail server setup with two boxes that will essentially be mirrors of each other.
>
> If both are configured for local delivery, can I have them replicate each other if I utilize UUIDs?

No. IMAP is not well suited to active-active replication. Replication in Cyrus is strictly active-passive.

-- David Carter
Re: cyrus 2.3.8, internal dates not stored in message file time ?
On Thu, 21 Jun 2007, Nicolas KOWALSKI wrote:

> I have noticed that copying messages from one folder to another one does keep messages internal dates but does not set message files write time in the destination folder, as 2.2.12 does.

mailbox_copyfile() hasn't changed between 2.2.12 and 2.3.8.

If source and target mailbox are on the same partition then the message should be copied using link(): both hard links share a single timestamp. Otherwise Cyrus has to create a new file and copy the data by hand.

-- David Carter
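The link-or-copy behaviour described above can be illustrated with a short Python sketch. It is not mailbox_copyfile(): `copy_message_file` is a hypothetical helper that simply tries a hard link first and falls back to a byte copy, which is enough to show why the two paths leave different timestamps.

```python
import os
import shutil
import tempfile


def copy_message_file(src, dst):
    """Hard-link src to dst if possible, otherwise copy the bytes."""
    try:
        os.link(src, dst)
        return "linked"   # one inode, so one shared set of timestamps
    except OSError:
        # For example EXDEV when src and dst are on different filesystems.
        shutil.copyfile(src, dst)
        return "copied"   # new inode, so the write time is "now"


with tempfile.TemporaryDirectory() as spool:
    src = os.path.join(spool, "1.")
    dst = os.path.join(spool, "2.")
    with open(src, "wb") as handle:
        handle.write(b"Subject: test\r\n\r\nbody\r\n")
    how = copy_message_file(src, dst)
    same_inode = os.stat(src).st_ino == os.stat(dst).st_ino
    with open(dst, "rb") as handle:
        same_bytes = handle.read() == b"Subject: test\r\n\r\nbody\r\n"
```

When the link succeeds, source and destination are the same inode and necessarily report the same modification time; when the fallback copy runs, the destination is a fresh file whose write time reflects the moment of the copy.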
Re: cyrus 2.3.8, internal dates not stored in message file time ?
On Fri, 22 Jun 2007, Nicolas KOWALSKI wrote:

> The source and target mailboxes are on the same user account, same partition, but strace shows that link is not used in 2.3.8. The following traces are obtained with strace, when copying messages from testbox to testbox3 with cyrus 2.2.12 and to testbox4 with cyrus 2.3.8. In both case, I used the same configuration/mailstore/client:
>
> [...]
>
> There must be something really wrong in my configuration...

Is singleinstancestore disabled in your imapd.conf?

While mailbox_copyfile() hasn't changed, the parent routine index_copy() appears to have gained an extra argument, and the top level cmd_copy handler in imapd.c has:

  r = index_copy(imapd_mailbox, sequence, usinguid, mailboxname,
                 &copyuid, !config_getswitch(IMAPOPT_SINGLEINSTANCESTORE));

-- David Carter
Re: 8G RAM in 32bit platform
On Fri, 13 Jul 2007, Patrick T. Tsang wrote:

> We will start up the mail server with 4G RAM.
> As I know the 32bits cannot handle RAM more than 3.2G.
>
> The client plans to upgrade the RAM to 8G in coming years.
> Can the 64bits platform is the only solution to it?

You don't say which CPU or operating system you are using.

The Linux bigsmp kernel supports PAE extensions on IA32 platforms: all 8 GBytes will be available as buffer cache, which is what matters to Cyrus. 64 bit pointers don't really do anything: no single process in Cyrus needs 2 GBytes of address space.

64 bit integer arithmetic would be a slight benefit for quota arithmetic (unsigned long long). However my systems spend about 2% of their time in user CPU state according to vmstat. You really aren't going to notice on any modern Intel/AMD CPU.

-- David Carter
Re: LTMPD rejecting large messages, maxmessagesize is _not_ set
On Tue, 10 Jul 2007, Chris St. Pierre wrote:

> LMTPD is rejecting large messages; I've been unable to figure out the exact threshold, but I am seeing messages like this in my Postfix logs:
>
> Jun 28 20:23:12 vostok postfix/qmgr[9323]: 22F5373D6F6: from=<[EMAIL PROTECTED]>, size=16243464, nrcpt=1 (queue active)
> Jun 28 20:23:22 vostok postfix/lmtp[13405]: warning: non-LMTP response from imap.nebrwesleyan.edu[10.1.1.31]: sendmail: fatal: [EMAIL PROTECTED](76): Message file too big

I don't think that Cyrus generated that error message. Try "strings" on the lmtpd binary. Errors from Cyrus should all be variants on:

  ec IMAP_MESSAGE_TOO_LARGE,
     "Message size exceeds fixed limit"

Is sendmail/postfix using a staging partition which has run out of space?

-- David Carter
Re: Replication to more than one replica?
On Fri, 10 Aug 2007, Per olof Ljungmark wrote: > It would be a way to keep a second offline replica for backing up to a > tape archive, which is what I plan to do. This is certainly what we do, and it seems to work nicely. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: storing mail across several cyrus partitions
On Fri, 31 Aug 2007, Artem Bokhan wrote:

> Partition or sub-partition -- no matter. The purpose -- to store the
> mail of a single user across different storages.

Yes, you can do this.

/etc/imapd.conf:

  partition-default: /spool/cyrus/mailstore
  partition-test: /spool/cyrus/test/data
  metapartition-test: /spool/cyrus/test/meta
  partition-test2: /spool/cyrus/test2/data
  metapartition-test2: /spool/cyrus/test2/meta
  partition-test3: /spool/cyrus/test3/data
  metapartition-test3: /spool/cyrus/test3/meta

Raw IMAP as a cyrus admin user (simply because it gives more feedback than the cyradm rename command), using the optional third argument to rename.

Start out with three mailboxes on default partition:

  . LIST "user/dpc22" *
  * LIST (\HasChildren) "/" "user/dpc22"
  * LIST (\HasNoChildren) "/" "user/dpc22/bar"
  * LIST (\HasNoChildren) "/" "user/dpc22/foo"
  . OK Completed (0.000 secs 4 calls)

Move user/dpc22/foo and user/dpc22/bar to test2 and test3:

  . RENAME user/dpc22/foo user/dpc22/foo test2
  * OK rename user/dpc22/foo user/dpc22/foo
  . OK Completed
  . RENAME user/dpc22/bar user/dpc22/bar test3
  * OK rename user/dpc22/bar user/dpc22/bar
  . OK Completed

The only gotcha is that each rename moves all subsidiary mailboxes:

  . RENAME user/dpc22 user/dpc22 default
  * OK rename user/dpc22 user/dpc22
  * OK rename user/dpc22/bar user/dpc22/bar
  * OK rename user/dpc22/foo user/dpc22/foo
  . OK Completed

-- 
David Carter                             Email: [EMAIL PROTECTED]
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.
Re: storing mail across several cyrus partitions
On Sun, 2 Sep 2007, Bokhan Artem wrote: > Sorry, I didn't understand you clearly... Did you mean, that subfolders > of single user may be moved across partitions? Yes. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Replication and single instance store
On Tue, 4 Sep 2007, Bron Gondwana wrote: > for all the users across which the single instance store needs to > apply, then run 'sync_client -r -f $file'. I typically use "-u -f" to do this. However: > Creating files like this and passing them with the -f flag forces > sync_client to consider them in the same run, so it "finds" the matching > message on the replica. sync_server maintains a fairly modest UUID cache on the server side: 1000 messages in 2.3. A restart is negotiated after each UPLOAD command. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: Replication and single instance store
On Tue, 4 Sep 2007, Bron Gondwana wrote: > Ah - yeah, that's right. Except that the restart only got negotiated > after each folder was processed, and if you're pushing a new folder with > 200,000 messages (say, after a user move in our case) then that got a > bit memory hungry and all sorts insane. Yes, this was the 5*MAX_MAILBOX_PATH allocation for each message when support for partitions was added. My original code cached 300k messages on the server between restarts (without any substantial memory leak). Better, but not perfect. > Does this mean there is no way to get single-instance-store on a replica > if you're rebuilding it from scratch? No. You would need a database which maintained a persistent mapping between UUID and a list of files on each partition which are that UUID. I'm open to suggestions. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: storing mail across several cyrus partitions
On Tue, 4 Sep 2007, Artem Bokhan wrote: > Checking several 1Tb-20Tb filesystem with large amount of small files > will take a while... based on my experience xfs sometimes need full > check after power failure, for example. I also gather that the xfs repair tools need an extraordinary amount of memory to run on large file systems: http://oss.sgi.com/archives/xfs/2005-08/msg00045.html -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Unexpected response MessageID 000000000000000000000000
On Thu, 27 Sep 2007, Rudy Gevaert wrote:

> I'm seeing this error.
>
> /sync_client[28862]: RESERVE: Unexpected response MessageID
> in
>
> Does anybody have an idea what it is triggered by.

Bug fixed in 2.3 CVS earlier this week (the missing (i < count) in cmd_reserve). I attach the message that I sent to cyrus-devel. sync_client will be ignoring the spurious responses.

-- 
David Carter                             Email: [EMAIL PROTECTED]
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

/* === */

>From [EMAIL PROTECTED] Tue Sep 25 11:00:38 2007
Date: Tue, 25 Sep 2007 10:53:56 +0100 (BST)
From: David Carter <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED]
Subject: sync_server RESERVE command

Two separate bug fixes, against current 2.3 CVS.

The missing i++ means that we would have failed to RESERVE some messages where a given GUID appears multiple times in a mailbox. Not a huge deal.

The missing (i < count) was potentially serious: we fall off the end of the ids array where RESERVE doesn't involve the last message in a mailbox. I think that we got away with it. My reasoning follows:

If you are really unlucky 12/20 bytes of allocated but uninitialised memory (ids[count]) will match one of the GUIDs later in the mailbox. This will cause cmd_reserve() to reserve that message and issue a spurious response. Fortunately sync_client will moan and then ignore the response if it wasn't expecting the GUID.

If it _was_ expecting the GUID then sync_server has reserved a legitimate copy. In fact this should be impossible. If sync_client wanted a message with that GUID, it would have asked for that copy in that mailbox, so the ids array would have been longer.

ids[count] is always allocated. There is a one in a hundred chance that ids[count+1] is not, which would lead to a segmentation fault if ids[count] matched a GUID.

-- 
David Carter                             Email: [EMAIL PROTECTED]
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

Index: imap/sync_server.c
===================================================================
RCS file: /cvs/src/cyrus/imap/sync_server.c,v
retrieving revision 1.11
diff -u -d -r1.11 sync_server.c
--- imap/sync_server.c	24 Sep 2007 12:48:32 -0000	1.11
+++ imap/sync_server.c	25 Sep 2007 09:28:09 -0000
@@ -1465,14 +1465,16 @@
         goto cleanup;
     }
 
-    for (i = 0, msgno = 1 ; msgno <= m.exists; msgno++) {
+    for (i = 0, msgno = 1 ; (msgno <= m.exists) && (i < count); msgno++) {
         mailbox_read_index_record(&m, msgno, &record);
 
         if (!message_guid_compare(&record.guid, &ids[i]))
             continue;
 
-        if (sync_message_find(message_list, &record.guid))
+        if (sync_message_find(message_list, &record.guid)) {
+            i++;
             continue; /* Duplicate GUID on RESERVE list */
+        }
 
         /* Attempt to reserve this message */
         snprintf(mailbox_msg_path, sizeof(mailbox_msg_path),
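For anyone who wants to see the two fixes outside the context of sync_server.c, the following is a stripped-down model of the loop. It is only a sketch: struct guid, make_guid(), guid_equal() and reserve_in_order() are invented names, the mailbox is a plain array instead of an index file, and "already on the RESERVE list" is modelled as "an earlier record with the same GUID was reserved in this pass".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUID_LEN 20

struct guid {
    unsigned char value[GUID_LEN];
};

/* Illustration-only helper: a GUID whose bytes are all 'fill'. */
struct guid make_guid(unsigned char fill)
{
    struct guid g;

    memset(g.value, fill, GUID_LEN);
    return g;
}

static int guid_equal(const struct guid *a, const struct guid *b)
{
    return memcmp(a->value, b->value, GUID_LEN) == 0;
}

/* Walk the mailbox records in order, consuming the requested ids in order.
 * Stores the msgno of each newly reserved message in reserved[] and
 * returns how many were stored. */
size_t reserve_in_order(const struct guid *records, size_t exists,
                        const struct guid *ids, size_t count,
                        size_t *reserved, size_t max_reserved)
{
    size_t i = 0, nreserved = 0, msgno, k;

    assert(records != NULL && ids != NULL && reserved != NULL);

    /* Fix 1: stop once every requested id has been consumed, so that
     * ids[count] is never read. */
    for (msgno = 1; msgno <= exists && i < count; msgno++) {
        const struct guid *rec = &records[msgno - 1];
        int duplicate = 0;

        if (!guid_equal(rec, &ids[i]))
            continue;

        for (k = 0; k < nreserved; k++) {
            if (guid_equal(&records[reserved[k] - 1], rec)) {
                duplicate = 1;
                break;
            }
        }

        /* Fix 2: a duplicate still consumes its slot in ids[], otherwise
         * every later id would be compared against the wrong entry. */
        i++;
        if (duplicate)
            continue;

        if (nreserved < max_reserved)
            reserved[nreserved++] = msgno;
    }
    return nreserved;
}
```

With records {A, B, B, C} and ids {B, B, C}, the second B is skipped as a duplicate but still advances i, so C is matched and reserved. With ids {B} alone, the loop ends as soon as B has been handled and never touches ids[1].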
Re: IOERROR: writing cache file
On Fri, 28 Sep 2007, Bron Gondwana wrote: > I can tell you I sleep a lot easier at night knowing that there are > replication systems in place so we can lose an entire server and > at worst a few emails will be lost. Indeed. It sounds like we might be getting a second machine room Real Soon Now, which was kind of the whole point when I started back in 2002. Hurrah. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: LARGE single-system Cyrus installs?
On Fri, 5 Oct 2007, Rob Mueller wrote: > b) make sure you have the right filesystem (on linux, reiserfs is much > better than ext3 even with ext3's dir hashing) and journaling modes A data point regarding reiserfs/ext3: We are in the process of moving from reiserfs to ext3 (with dir_index). ext3 seems to do substantially better than reiserfs for us, especially for read heavy loads (squatter runs at least twice as fast as it used to do). I think that this is partly because ext3 does more aggressive read ahead (which would be a mixed blessing under heavy load), partly because reiserfs suffers from fragmentation. I imagine that there is probably a tipping point under the sort of very heavy load that Fastmail see. data=ordered in both cases. data=journal didn't seem to make any difference with ext3. data=journal with reiserfs caused amusing kernel memory leaks, which it looks like Fastmail also hit recently. A dedicated journal device would probably make a big difference with data=journal. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: LARGE single-system Cyrus installs?
On Fri, 5 Oct 2007, Bron Gondwana wrote: > We ran over 100,000 users on a single backend for over a year without > problems, but then we had a RAID array failure (3 disks within a day) > with 2Tb of data on a single RAID unit We have pretty much given up on RAID 5 because of the reconstruct times with large disks. Our new systems are 12 disk RAID 10 (plus hot spares). I think that gives about the same usable capacity as your 2 x RAID5 + 2 x RAID1 setup, but better redundancy. There would be twice as much work to restore the single RAID 10 set if it failed. I plan some experiments with split meta next year. My gut feeling is that 12 slow disks will be better than 4 faster disks given the short command queues in SATA NCQ, but I'm entirely willing to be proved wrong. Multiple partitions would certainly help with any bottlenecks at the VFS layer. I suppose that 8 SATA disks for the data and four 15k SAS disks for the metadata would be a good mix. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH. Cyrus Home Page: http://cyrusimap.web.cmu.edu/ Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
Re: LARGE single-system Cyrus installs?
On Sat, 6 Oct 2007, Rob Mueller wrote: > Are you comparing an "old" reiserfs partition with a "new" ext3 one where > you've just copied the email over to? If so, that's not a fair comparison. No, newly created partitions in both cases. Fragmented partitions are slower still of course. > Give it a month or two of active use though (delivering new emails, > deleting old ones, etc), and everything starts getting fragmented again. > Then ext3 really started going to crap on us. Machines that had been > absolutely fine under reiserfs, the load just blew out to unusable > under ext3. We've only been using ext3 for about 3 months now, so I may still have this to look forward to :). > Talking with Chris Mason about this, data=journal is faster in certain > scenarios with lots of small files + fsyncs from different processes, > exactly the type of workload cyrus generates! I can't see much difference on our Cyrus systems, but battery backed write cache on our RAID controllers probably masks a lot of the change. I agree that in theory it should make a very substantial difference. > As it turns out, the memory leaks weren't critical, because the > pages do seem to be reclaimed when needed, though it was annoying not > knowing exactly how much memory was really free/used. Okay, I think that we had a different kernel memory bug. We were running out of memory after 24 hours, and a 20 line test program could exhaust memory in seconds. This bug was in SLES four years back, and it was still there the last time that I looked (some months back now). -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: LARGE single-system Cyrus installs?
On Sat, 6 Oct 2007, Rob Mueller wrote:

> That's strange. What mount options are/were you using? We use/used:
>
> reiserfs - rw,noatime,nodiratime,notail,data=journal
> ext3 - noatime,nodiratime,data=journal

Same, but data=ordered in both cases.

> If you weren't using "notail" on reiserfs, that would definitely have a
> performance impact.

Definitely using notail.

> Wow weird, must be something different. What kernel was it? Do you know
> where the memory leak was occurring?

Standard SLES kernels for SLES9. The memory leak could be shown by mmap() on a single file (see attachment). Kernel memory explodes, and nothing is released when the program exits.

-- 
David Carter                             Email: [EMAIL PROTECTED]
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>

#define BLOCK_SIZE (4096)

static void fatal(char *err)
{
    fprintf(stderr, "%s\n", err);
    exit(1);
}

int main(int argc, char *argv[])
{
    int i;
    int fd, fd2;
    char *s;

    /* argv[1] is the file to map (at least BLOCK_SIZE bytes long),
     * argv[2] is the file to write the mapped block to. */
    if (argc != 3)
        fatal("Args: <input-file> <output-file>");

    for (i = 0; i < 10; i++) {
        if ((fd = open(argv[1], O_RDONLY)) < 0)
            fatal("open");

        /* mmap() reports failure with MAP_FAILED, not NULL */
        if ((s = mmap(NULL, BLOCK_SIZE, PROT_READ, MAP_SHARED, fd, 0)) == MAP_FAILED)
            fatal("mmap");

        if ((fd2 = open(argv[2], O_WRONLY|O_CREAT|O_TRUNC, 0666)) < 0)
            fatal("open");

        /* Write straight out of the shared mapping */
        if (write(fd2, s, BLOCK_SIZE) < BLOCK_SIZE)
            fatal("write");

        if (close(fd2) < 0)
            fatal("close");

        if (munmap(s, BLOCK_SIZE) < 0)
            fatal("munmap");

        if (close(fd) < 0)
            fatal("close");
    }
    return(0);
}
Re: squatter running longer than 24 hours
On Sun, 21 Oct 2007, Vincent Fox wrote: > I have seen squatter run more than 24 hours. > > This is on a large mail filesystem. I've seen it start up a > second one while the first is still running. Should I: > > 1) Forget about squatter > 2) Remove from cyrus.conf, run from cron every other day > 3) Find some option to cyrus.conf for same effect as #2? I squat a fraction of mailboxes each night using: http://www-uxsup.csx.cam.ac.uk/~dpc22/cyrus/patches/2.3.8/squatter.patch For example: squatter -s -m 0 -M 7 would update the squat indexes for 1 in 7 mailboxes, based on modulo arithmetic on the mailbox UniqueID. squatter would really benefit from incremental updates. At the moment a single new message in a mailbox containing 20k messages causes it to read in all the existing messages in order to regenerate the index. Unfortunately, the code is rather impenetrable. I infer that it is collecting information about adjacent characters in the message body. Presumably a 5 character search term provides 4 required pairings as a prefilter from the squat engine before message by message search kicks in. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
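The 1-in-N selection is easy to picture with a small sketch. This is not the code from the patch: squat_selected() is a made-up name, and it assumes that the mailbox UniqueID is a hexadecimal string short enough to fit in an unsigned long long, which may not be how the patch derives its number.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: return 1 if the mailbox falls into tonight's slice.
 * 'remainder' plays the part of -m and 'modulus' the part of -M in the
 * example above. A modulus of 0 means "no slicing, squat everything". */
int squat_selected(const char *uniqueid,
                   unsigned long long remainder,
                   unsigned long long modulus)
{
    char *end;
    unsigned long long value;

    assert(uniqueid != NULL);

    if (modulus == 0)
        return 1;

    errno = 0;
    value = strtoull(uniqueid, &end, 16);   /* assumes a hex UniqueID */
    if (errno != 0 || end == uniqueid || *end != '\0')
        return 0;                           /* unparseable: skip it */

    return (value % modulus) == remainder;
}
```

Running with remainder 0 on the first night, 1 on the second and so on up to 6 covers every mailbox once a week, provided the UniqueID of a mailbox does not change between runs.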
Re: IOERROR: reading message: unexpected end of file (message_copy_strict)
On Tue, 23 Oct 2007, Ken Murchison wrote: > Your problem is most likely related to using NFS. NFS has never been > recommended for Cyrus because it doesn't play nice with mmap() and > flock(), both of which are critical to the operation of Cyrus. While I agree entirely with "don't use Cyrus over NFS", I see these errors using a local filesystem. A quick grep pins the likely cause down to message_copy_strict(), which is called by append_fromstream(). I don't think that this is anything more sinister than TCP connections dropping out partway through a large IMAP APPEND operation. Entirely safe. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
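To illustrate why a dropped connection shows up as "unexpected end of file": an IMAP APPEND announces the size of the message literal up front, and the server then has to read exactly that many bytes. The sketch below is not message_copy_strict() itself, just a minimal stand-in with an invented name (copy_exact) that behaves the same way when the input runs dry early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Copy exactly 'size' bytes from 'in' to 'out'.
 * Returns 0 on success, -1 if the input ends (or fails) before 'size'
 * bytes have arrived, or if the output cannot be written. */
int copy_exact(FILE *in, FILE *out, size_t size)
{
    char buf[4096];

    assert(in != NULL && out != NULL);

    while (size > 0) {
        size_t want = size < sizeof(buf) ? size : sizeof(buf);
        size_t got = fread(buf, 1, want, in);

        if (got == 0)
            return -1;      /* unexpected end of file, or read error */
        if (fwrite(buf, 1, got, out) != got)
            return -1;      /* write error */
        size -= got;
    }
    return 0;
}
```

If the client promised 10 bytes and the stream only ever delivers 5, the second fread() returns 0 and the copy fails. The caller can then discard the partial file, which is why the error is harmless: nothing incomplete ends up in the mailbox.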
POP3 "LAST" command
Hi everyone, I'm in the process of migrating from the UW IMAP and POP servers to Cyrus. One of my fetchmail users has picked up an inconsistency in the way that the two POP servers handle the POP3 LAST command. The LAST command appears to be obsolete, but is still fetchmail's default mode of operation for leaving mail on the server, if the server supports it... It looks like the UW server hooks into the \Seen message state used by its companion IMAP server, while the Cyrus POP server does not. Consequently, the information presented by "LAST" has meaning across login sessions for the UW server but not for Cyrus, which always responds: C: LAST S: +OK 0 right after authentication takes place. Is there a particular reason for the difference in behaviour? I couldn't find anything in the info-cyrus list archives. Any other suggestions? The fetchmail manual page says: "Under POP3, blame RFC1725. That version of the POP3 protocol specification removed the LAST command, and some POP servers follow it (you can verify this by invoking fetchmail -v to the mailserver and watching the response to LAST early in the query). The fetchmail code tries to compensate by using POP3's UID feature, ..." which implies that fetchmail automatically falls back to using UIDL if LAST isn't available. Are there likely to be nasty consequences of just disabling the LAST command in pop3d.c? At the moment it looks like any POP clients which are configured to use LAST rather than UIDL will download all of the messages at each poll interval, which is rather undesirable. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: POP3 "LAST" command
On Tue, 24 Jun 2003, Ken Murchison wrote: > RFC 1460 says nothing about maintaining this info across sessions. It > also doesn't prohibit it. The Cyrus behavior seems perfectly reasonable > and correct according to the ancient spec which defines it. Yep. I wasn't suggesting the behaviour is unreasonable, just different than my installed user agents are used to. It caused one of my beta testers a bit of an unpleasant surprise when they arrived this morning :). > I'd say that your problem is with fetchmail. Since LAST is obsolete > (quite possibly because of the ambiguity you have seen), fetchmail > should default to using UIDL, and then fallback to LAST iff UIDL isn't > available. UIDL was designed specifically to keep track of messages and > SHOULD be the preferred mechanism used by clients. The fetchmail manpage actually advocates LAST over UIDL on the grounds: But this doesn't track messages seen with other clients, or read directly with a mailer on the host but not deleted afterward. so it looks like fetchmail does consider LAST state to be persistent. It does however encourage people to use IMAP rather than POP when available. I guess I'll have to add some monitoring on our existing servers to find out just who is using LAST. I'm still tempted to disable LAST altogether in our Cyrus installation given the potential for unpleasant surprises. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: Problem with IMAP server suddenly hanging
On Tue, 24 Jun 2003, Rob Tanner wrote: > I just installed cyrus-imapd-2.1.13 on a Solaris machine. The > conversion from the old MessagingDirect server was even easier than > expected. The server started up just fine and then hung. I kill the > master and restart and everything is okay for a while and then it > hangs (imap requests connect but nothing happens). Also, I've noticed > that "ctl_cyrusdb -c" is also hung at the time (truss shows the process > in a constant sleep state from which it doesn't awake until master is > killed). I think ctl_cyrusdb's sleep state is a symptom, not the cause > since it also runs at other times just fine. > > Any ideas on what I need to look at/for? Is there anything interesting in the logs? There seems to be a bad interaction between Cyrus and DB 4.1.25 which might explain what you are seeing. It's been fixed in CVS. The other common explanation for hangs is /dev/random running out of entropy, but I don't think that would affect ctl_cyrusdb. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Naive Sieve question
I'm still relatively new to Sieve (we've been using Exim filter files up to now), so I suspect I'm missing something obvious here.

What is the recommended way to filter mail delivery reports? I'm the postmaster for a 25k user mail system, and really don't want all of the bounces in my primary INBOX. Neither of the two obvious approaches:

  if envelope :is "from" "" {
      fileinto "bounces";
      stop;
  }

or:

  if header :is "return-path" "<>" {
      fileinto "bounces";
      stop;
  }

seem to work. So far the best that I have come up with is:

  if not envelope :contains "from" "@" {
      fileinto "bounces";
      stop;
  }

which works (at least in our environment), but isn't very pretty.

-- 
David Carter                             Email: [EMAIL PROTECTED]
University Computing Service,            Phone: (01223) 334502
New Museums Site, Pembroke Street,       Fax:   (01223) 334679
Cambridge UK. CB2 3QH.
Re: non-existent msgids & duplicate delivery suppression
On Fri, 27 Jun 2003, Stephen Grier wrote: > I have noticed that this has been happening on other occasions, where > lmtpd does a duplicate_check and then a duplicate_mark on zero length > message-ids. This raises the possibility that the server is suppressing > messages that are not duplicates, but merely have no message-id header. > > Has anyone else seen this happening?. If this continues to happen we may > have to disable duplicate delivery suppression. This seems quite likely. If a message doesn't contain a Message-ID, your mail system should really add one before it passes it on to Cyrus. Exim does this automatically. I don't know about other MTAs. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
Re: non-existent msgids & duplicate delivery suppression
On Fri, 27 Jun 2003, Stephen Grier wrote: > We use Exim here. The messages do have a Message-Id: header, but with an > empty value. Eeep. That's no fun at all. > I assume Exim will not add or alter the msg-id in this case. I suspect this is the case, though arguably Exim should do something about this case. I'll have a chat with Philip the next time that I see him. -- David Carter Email: [EMAIL PROTECTED] University Computing Service,Phone: (01223) 334502 New Museums Site, Pembroke Street, Fax: (01223) 334679 Cambridge UK. CB2 3QH.
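Whichever side ends up fixing this, the guard itself is small. The sketch below shows the kind of check that would keep empty Message-Id values out of duplicate suppression. It is not lmtpd code: duplicate_key() is an invented name, and treating a missing, empty or all-whitespace value as "no key" is an assumption about what the right policy would be.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a trimmed Message-Id into 'out' for use as a
 * duplicate-suppression key. Returns 1 if a usable key was produced, 0 if
 * the header is missing, empty, all whitespace, or too long for 'out'.
 * When it returns 0 the caller should skip both the duplicate check and
 * the duplicate mark, and simply deliver the message. */
int duplicate_key(const char *msgid, char *out, size_t outlen)
{
    size_t len;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    if (msgid == NULL)
        return 0;

    /* Strip leading and trailing whitespace */
    while (*msgid != '\0' && isspace((unsigned char)*msgid))
        msgid++;
    len = strlen(msgid);
    while (len > 0 && isspace((unsigned char)msgid[len - 1]))
        len--;

    if (len == 0 || len >= outlen)
        return 0;

    memcpy(out, msgid, len);
    out[len] = '\0';
    return 1;
}
```

With this in place two unrelated messages that both carry "Message-Id:" with an empty value never share a key, so neither can suppress the other.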