Re: UC Davis Cyrus Incident September 2007

2007-10-18 Thread Robert Mueller
> Actually, I don't see a deadlock situation at all... I am guessing that > theoretically, it is possible... but the "ln -sf" option makes the > overwriting of the symlink an atomic action (as much as it can), which Not a "deadlock" situation, but a possible "file doesn't exist" error. In the Un
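
The "file doesn't exist" window mentioned in this snippet can occur whenever a link is replaced by removing the old one and then creating the new one. A minimal sketch of one common way to avoid that window, assuming a POSIX filesystem: create the new symlink under a temporary name and rename it over the old link, since rename() replaces its destination atomically. The function name and the paths are hypothetical and do not come from the thread.

```python
import os
import tempfile


def replace_symlink_atomically(target, link_path):
    """Point link_path at target without a moment where link_path is missing."""
    directory = os.path.dirname(os.path.abspath(link_path))
    # A private temp directory next to the link keeps the rename on the same
    # filesystem and guarantees the temporary name cannot collide.
    tmp_dir = tempfile.mkdtemp(dir=directory)
    tmp_link = os.path.join(tmp_dir, "link")
    try:
        os.symlink(target, tmp_link)
        # os.replace() maps to rename(2): the old link is swapped for the new
        # one in a single step, so readers see either the old or the new target.
        os.replace(tmp_link, link_path)
    finally:
        if os.path.lexists(tmp_link):
            os.unlink(tmp_link)
        os.rmdir(tmp_dir)
```

Whether this matches what was done at the site in question is not stated in the snippet; it only illustrates the difference between "remove then create" and "create then rename".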

Re: UC Davis Cyrus Incident September 2007

2007-10-18 Thread Scott Adkins
--On Thursday, October 18, 2007 9:58 AM +0200 Pascal Gienger <[EMAIL PROTECTED]> wrote: Scott Adkins <[EMAIL PROTECTED]> wrote: Meanwhile, we hacked around this in a very cool way. We copied the imapd process 60 times (assuming average of 12,000 processes, shooting for 200 processes per exec

Re: UC Davis Cyrus Incident September 2007

2007-10-18 Thread Pascal Gienger
Scott Adkins <[EMAIL PROTECTED]> wrote: > Meanwhile, we hacked around this in a very cool way. We copied the imapd > process 60 times (assuming average of 12,000 processes, shooting for 200 > processes per executable, that is 60 individual executables). These were > named /usr/cyrus/bin/imapd_00

Re: UC Davis Cyrus Incident September 2007

2007-10-17 Thread Rob Mueller
could someone whip up a small test that could be used to check different operating systems (and filesystems) for this concurrency problem? Not a bad idea. I was able to throw something together in about half an hour with perl. See attached. It requires the Benchmark, Time::HiRes and Sys::Mmap
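
The Perl attachment is not part of this listing. As a rough illustration of the kind of test being asked for, here is a Python sketch, assuming a Unix system, in which several processes map the same file and update it while the total wall-clock time is measured. All names and parameters (`time_shared_mmap`, `writes`, `pages`) are hypothetical; this is not Rob's script and it makes no claim about how his test measures the problem.

```python
import mmap
import multiprocessing as mp
import os
import tempfile
import time

PAGE = mmap.PAGESIZE


def _worker(path, size, writes, slot):
    """Map the shared file and repeatedly update one byte in it."""
    with open(path, "r+b") as f:
        # On Unix, mmap.mmap() defaults to a shared, read/write mapping.
        with mmap.mmap(f.fileno(), size) as m:
            offset = (slot * PAGE) % size
            for i in range(writes):
                m[offset] = i % 256
            m.flush()


def time_shared_mmap(nprocs, writes=10000, pages=16):
    """Return seconds taken by nprocs processes updating one mmap'd file."""
    size = pages * PAGE
    ctx = mp.get_context("fork")  # assumption: Unix-only start method
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * size)
        path = f.name
    try:
        procs = [ctx.Process(target=_worker, args=(path, size, writes, n))
                 for n in range(nprocs)]
        start = time.perf_counter()
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        elapsed = time.perf_counter() - start
        if any(p.exitcode != 0 for p in procs):
            raise RuntimeError("a worker process failed")
        return elapsed
    finally:
        os.unlink(path)
```

Running it with increasing `nprocs` on different operating systems and filesystems would give a crude comparison of how the elapsed time grows as more processes share one mapping.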

Re: UC Davis Cyrus Incident September 2007

2007-10-17 Thread Blake Hudson
> Meanwhile, we hacked around this in a very cool way. We copied the imapd > process 60 times (assuming average of 12,000 processes, shooting for 200 > processes per executable, that is 60 individual executables). These were > named /usr/cyrus/bin/imapd_001 through /usr/cyrus/bin/imapd_060. W
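
The arithmetic and naming in the quoted text can be sketched as follows. The numbers and the `/usr/cyrus/bin/imapd_NNN` pattern come from the quote; the function name is hypothetical, and how processes were spread across the copies is not shown in this snippet.

```python
AVERAGE_PROCESSES = 12_000        # average imapd process count quoted above
PROCESSES_PER_EXECUTABLE = 200    # target number of processes per copy


def imapd_copy_paths(bin_dir="/usr/cyrus/bin"):
    """Return the paths of the individual imapd copies, numbered from 001."""
    count = AVERAGE_PROCESSES // PROCESSES_PER_EXECUTABLE  # 12,000 / 200 = 60
    return [f"{bin_dir}/imapd_{n:03d}" for n in range(1, count + 1)]
```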

Re: UC Davis Cyrus Incident September 2007

2007-10-17 Thread David Lang
Omen Wild (University of California Davis) The root problem seems to be an interaction between Solaris' concept of global memory consistency and the fact that Cyrus spawns many processes that all memory map (mmap) the same file. Whenever any process updates any part of a memory ma

Re: UC Davis Cyrus Incident September 2007

2007-10-17 Thread Scott Adkins
--On Tuesday, October 16, 2007 3:39 PM -0700 Vincent Fox <[EMAIL PROTECTED]> wrote: Omen Wild (University of California Davis) The root problem seems to be an interaction between Solaris' concept of global memory consistency and the fact that Cyrus spawns many processes that all me

RE: UC Davis Cyrus Incident September 2007

2007-10-17 Thread Xue, Jack C
Thanks for sharing your story. There are quite a number of large Cyrus-IMAP installations around the world, especially in the Higher-Education industry. We did a mass e-mail migration last year from OpenVMS to Cyrus/Postfix on Linux 2.6. Compared with UC-Davis, our systems have less activity as

Re: UC Davis Cyrus Incident September 2007

2007-10-17 Thread Nik Conwell
On Oct 17, 2007, at 1:36 PM, Andrew Morgan wrote: > On Tue, 16 Oct 2007, Vincent Fox wrote: > >> So here's the story of the UC Davis (no, not Berkeley) Cyrus >> conversion. > > [snip] > > This is a fascinating story, so please keep us all posted with your > findings! I second this. Thanks

Re: UC Davis Cyrus Incident September 2007

2007-10-17 Thread Andrew Morgan
On Tue, 16 Oct 2007, Vincent Fox wrote: > So here's the story of the UC Davis (no, not Berkeley) Cyrus conversion. [snip] > 5th STEP: Cyrus migration > > > The politics of educational environment is that you MUST do massive > changeouts like this during summer quarter.

Re: UC Davis Cyrus Incident September 2007

2007-10-17 Thread Pascal Gienger
Vincent Fox <[EMAIL PROTECTED]> wrote: > The root problem seems to be an interaction between Solaris' concept of > global memory consistency and the fact that Cyrus spawns many processes > that all memory map (mmap) the same file. Whenever any process updates > any part of a memory mapped file, S

Re: UC Davis Cyrus Incident September 2007

2007-10-16 Thread Bron Gondwana
On Wed, Oct 17, 2007 at 11:11:06AM +1000, Rob Mueller wrote: > One option you have is that rather than creating separate "Zones" in the OS, > you just create separate cyrus instances yourself. We do this at FastMail. > Basically we've partitioned all our storage into 300G units, and each > partit

Re: UC Davis Cyrus Incident September 2007

2007-10-16 Thread Rob Mueller
Thanks for your post, it's always interesting to hear other people's stories. > 1st STEP: Perdition mail-proxies > in a load-balanced pool and 2 can handle the load most days. Initially we If you have a chance, definitely think about changing from perdition to nginx. There's slightly more work

UC Davis Cyrus Incident September 2007

2007-10-16 Thread Vincent Fox
So here's the story of the UC Davis (no, not Berkeley) Cyrus conversion. We have about 60K active accounts, and another 10K that are forwards, etc. 10 UWash servers that were struggling to keep up with a load that in 2006 was running around 2 million incoming emails a day, before spam dumpage, et