followup: stuck lmtpd processes and sieve errors

2003-11-20 Thread Andrew Morgan
I think I have linked two of my problems together... A long while back, I reported strange sieve errors in my logs, such as: Nov 20 12:52:17 mail1 lmtpd[20900]: sieve parse error for howerja: line 1: parse error, unexpected STRING Where the sieve in question is just: redirect "[EMAIL PROTECTE

Re: followup: stuck lmtpd processes

2003-09-29 Thread Etienne Goyer
Sorry, I will have to wait until 2.1.16 to test it. I can't just plug in the fud.c from CVS and compile it, and I am really too busy these days to make a full checkout from CVS and test it. I'll report my experience with 2.1.16, if and when it comes out. On Wed, Sep 24, 2003 at 01:24:11PM -0400, Etien

[PATCH] lockbreak using sigalarm (was followup: stuck lmtpd processes)

2003-09-26 Thread Henrique de Moraes Holschuh
There is now a preliminary patch available. http://bugzilla.andrew.cmu.edu/show_bug.cgi?id=1177 Please test the second patch, and report back using the bug tracking system. -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the

Re: followup: stuck lmtpd processes

2003-09-26 Thread Henrique de Moraes Holschuh
On Fri, 26 Sep 2003, Tom wrote: > On Wed, 24 Sep 2003, Henrique de Moraes Holschuh wrote: > > With SYSV you will get the interrupted system call, unless you tell it > > somehow not to do it (the SA_RESTART stuff). If we are to accommodate the > > BSDs, we can: > > 1. Let them have the short end o

Re: followup: stuck lmtpd processes

2003-09-26 Thread Tom
On Wed, 24 Sep 2003, Henrique de Moraes Holschuh wrote: > On Wed, 24 Sep 2003, Etienne Goyer wrote: > > On Wed, Sep 24, 2003 at 11:27:46AM -0400, Rob Siemborski wrote: > > > However, I have looked into this and to my surprise, Linux is indeed > > > restarting the system calls instead of returning

Re: followup: stuck lmtpd processes

2003-09-25 Thread Rob Siemborski
On Thu, 25 Sep 2003, Etienne Goyer wrote: > However, the man page is wrong about EINTR at least as far as RedHat 7.x > is concerned. In a murder environment, when following a referral: No, it isn't wrong. The problem is signals that are configured via signal() instead of sigaction(). On Linu

Re: followup: stuck lmtpd processes

2003-09-25 Thread Etienne Goyer
On Wed, Sep 24, 2003 at 08:01:11PM +0100, Patrick Welche wrote: > I don't understand. The only alarm() business I can see in imap/fud.c > is around recvfrom which at least according to its man page says > > [EINTR]The receive was interrupted by delivery of a signal >

Re: followup: stuck lmtpd processes

2003-09-24 Thread Scott Adkins
Well, that could definitely be a problem... Next time we see a lock problem occur, I will look based on the information below to see if it is really a lock problem on the quota file. Thanks, Scott --On Wednesday, September 24, 2003 12:32 PM -0700 Andrew Morgan <[EMAIL PROTECTED]> wrote: On Wed,

Re: followup: stuck lmtpd processes

2003-09-24 Thread Andrew Morgan
On Wed, 24 Sep 2003, Scott Adkins wrote: > When looking at what file the processes are all waiting to get a lock on, > it usually turns out to be the cyrus.header file and not the quota file. > Is this still the same bug described by Rob on bugzilla? Does it have to > be the quota file? > > Als

Re: followup: stuck lmtpd processes

2003-09-24 Thread John C. Amodeo
...until your system runs out of available open files... Then the real fun begins... :-) -John Andrew Morgan wrote: > On Wed, 24 Sep 2003, John Wade wrote: > > > The patch I wrote still might help you since it would prevent an > > individual user's problem from taking down the mail system. Th

Re: followup: stuck lmtpd processes

2003-09-24 Thread Andrew Morgan
On Wed, 24 Sep 2003, John C. Amodeo wrote: > ...until your system runs out of available open files... > > Then the real fun begins... :-) > > -John [EMAIL PROTECTED] tools]# cat /proc/sys/fs/file-max 209708 I'm in a lot of trouble if I've got 209708 files open. :) Andy

Re: followup: stuck lmtpd processes

2003-09-24 Thread John C. Amodeo
Andy, It's happened to me before... Don't think it can't... That's all I'm saying... -John Andrew Morgan wrote: > On Wed, 24 Sep 2003, John C. Amodeo wrote: > > > ...until your system runs out of available open files... > > > > Then the real fun begins... :-) > > > > -John > > [EMAIL PROTECTED]

Re: followup: stuck lmtpd processes

2003-09-24 Thread Andrew Morgan
On Wed, 24 Sep 2003, John Wade wrote: > The patch I wrote still might help you since it would prevent an > individual user's problem from taking down the mail system. The user's > mailbox would remain inaccessible, but the lmtpd processes attempting > delivery would exit with errors and mail d

Re: followup: stuck lmtpd processes

2003-09-24 Thread Patrick Welche
On Wed, Sep 24, 2003 at 02:20:50PM -0300, Henrique de Moraes Holschuh wrote: > On Wed, 24 Sep 2003, Etienne Goyer wrote: > > On Wed, Sep 24, 2003 at 11:27:46AM -0400, Rob Siemborski wrote: > > > However, I have looked into this and to my surprise, Linux is indeed > > > restarting the system calls i

Re: followup: stuck lmtpd processes

2003-09-24 Thread Henrique de Moraes Holschuh
On Wed, 24 Sep 2003, Etienne Goyer wrote: > On Wed, Sep 24, 2003 at 11:27:46AM -0400, Rob Siemborski wrote: > > However, I have looked into this and to my surprise, Linux is indeed > > restarting the system calls instead of returning with EINTR. However, the > > answer here is to set up the alarm(

Re: followup: stuck lmtpd processes

2003-09-24 Thread Etienne Goyer
Thanks. I'll test it by the end of the week, and report. On Wed, Sep 24, 2003 at 01:18:12PM -0400, Rob Siemborski wrote: > On Wed, 24 Sep 2003, Etienne Goyer wrote: > > > > I'll work on fixing fud shortly (it's using signal() and it should be > > > using sigaction()). > > > > The included patch a

Re: followup: stuck lmtpd processes

2003-09-24 Thread Rob Siemborski
On Wed, 24 Sep 2003, Etienne Goyer wrote: > > Something that works in Linux, sure. Something that works in broken Linux? > > No. Fix the breakage in Linux, instead. That's our strength, and I *will* > > stick to it as a Debian maintainer. > > While I agree with you on a technical level and admi

Re: followup: stuck lmtpd processes

2003-09-24 Thread Rob Siemborski
On Wed, 24 Sep 2003, Etienne Goyer wrote: > > I'll work on fixing fud shortly (it's using signal() and it should be > > using sigaction()). > > The included patch against 2.1.13 works for me. This sort of thing won't work for file locking. I've just committed a patch to fud that uses sigaction() [

Re: followup: stuck lmtpd processes

2003-09-24 Thread Etienne Goyer
On Wed, Sep 24, 2003 at 12:57:37PM -0300, Henrique de Moraes Holschuh wrote: > I did check ALL the documentation already, and ALL of it says that sigalarm > MUST interrupt the syscall, and that it HAS to return EINTR. So, it is a > bug. So, it needs to be squashed, and people have to either patch

Re: followup: stuck lmtpd processes

2003-09-24 Thread Etienne Goyer
On Wed, Sep 24, 2003 at 11:27:46AM -0400, Rob Siemborski wrote: > However, I have looked into this and to my surprise, Linux is indeed > restarting the system calls instead of returning with EINTR. However, the > answer here is to set up the alarm() handler with sigaction without > setting SA_REST

Re: followup: stuck lmtpd processes

2003-09-24 Thread Henrique de Moraes Holschuh
On Wed, 24 Sep 2003, Rob Siemborski wrote: > However, I have looked into this and to my surprise, Linux is indeed > restarting the system calls instead of returning with EINTR. However, the > answer here is to set up the alarm() handler with sigaction without > setting SA_RESTART, not to jump thro

Re: followup: stuck lmtpd processes

2003-09-24 Thread Henrique de Moraes Holschuh
On Wed, 24 Sep 2003, Etienne Goyer wrote: > On Wed, Sep 24, 2003 at 11:13:06AM -0300, Henrique de Moraes Holschuh wrote: > > It is not a general solution when you hit glibc/kernel bugs, but I can > > certainly live with it IF I manage to track down a version of glibc and > > kernel that won't deadl

Re: followup: stuck lmtpd processes

2003-09-24 Thread Rob Siemborski
On Wed, 24 Sep 2003, Etienne Goyer wrote: > The obvious solution is to not use alarm() to interrupt a blocking > syscall, but to use a non-blocking call with select() instead. I > am not a very proficient C Unix programmer, so maybe this suggestion > makes no sense. However, in the case of the bug wi
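The select() alternative suggested in the quoted text can be sketched as follows. This is only an illustration under stated assumptions: `read_when_ready` is a hypothetical helper, not a Cyrus function, and using ETIMEDOUT to report the timeout is a choice made for this sketch. It applies to descriptors that select() can meaningfully wait on, such as sockets and pipes; select() does not help with a process blocked waiting for an fcntl() file lock.

```c
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: wait up to `secs` seconds for `fd` to become
 * readable, then read from it. Returns the read() result, or -1 with
 * errno == ETIMEDOUT if nothing arrived in time.
 */
static ssize_t read_when_ready(int fd, void *buf, size_t len, long secs)
{
    fd_set rfds;
    struct timeval tv;
    int rc;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = secs;
    tv.tv_usec = 0;

    rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;                  /* errno was set by select() */
    if (rc == 0) {
        errno = ETIMEDOUT;          /* nothing arrived within the timeout */
        return -1;
    }
    return read(fd, buf, len);      /* data is ready, so this should not block */
}
```

The timeout here does not depend on signal delivery at all, so the system-call restart behaviour discussed elsewhere in the thread does not come into play.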

Re: followup: stuck lmtpd processes

2003-09-24 Thread Etienne Goyer
On Wed, Sep 24, 2003 at 11:13:06AM -0300, Henrique de Moraes Holschuh wrote: > It is not a general solution when you hit glibc/kernel bugs, but I can > certainly live with it IF I manage to track down a version of glibc and > kernel that won't deadlock, that we can recommend. Either that, or allow

Re: followup: stuck lmtpd processes

2003-09-24 Thread Rob Siemborski
On Wed, 24 Sep 2003, Scott Adkins wrote: > Also, when we find the specific imaps process that happens to have the > cyrus.header lock file opened for writing and has it locked, if we kill > it off, we find that the write lock goes to another imaps process or to > one of the LMTP processes and gets

Re: followup: stuck lmtpd processes

2003-09-24 Thread Scott Adkins
I just wanted to add something to this discussion... First of all, we see the problem in Tru64 as well. When we upgraded to the 2.2 series, we put in the locking patch that John described below. This has helped us, but the locking problem has *not* gone away... in fact, it does a better job of *

Re: followup: stuck lmtpd processes

2003-09-24 Thread Henrique de Moraes Holschuh
On Wed, 24 Sep 2003, Rob Siemborski wrote: > On Wed, 24 Sep 2003, Henrique de Moraes Holschuh wrote: > > Agreed, but if we are going to keep the blocking-on-lock behaviour (and I > > know we are ;-)), we really, really should have a way to timeout and kill > > the process if that lock does not rele

Re: followup: stuck lmtpd processes

2003-09-24 Thread Rob Siemborski
On Wed, 24 Sep 2003, Henrique de Moraes Holschuh wrote: > Agreed, but if we are going to keep the blocking-on-lock behaviour (and I > know we are ;-)), we really, really should have a way to timeout and kill > the process if that lock does not release after a while. > > Resilience IS necessary...

Re: followup: stuck lmtpd processes

2003-09-24 Thread Henrique de Moraes Holschuh
On Wed, 24 Sep 2003, Rob Siemborski wrote: > think about it. The kernel is responsible for waking processes up when > they are blocking on a lock and it becomes available. If this isn't > happening (causing the need to do locks in a nonblocking fashion) then > something is wrong with the *kernel*

Re: followup: stuck lmtpd processes

2003-09-24 Thread Etienne Goyer
Hi, I don't have time this morning to have a look at your patch and understand the issue, but it reminded me of another bug I found a few months ago. It may or may not relate to the problem you are fixing. I just think you might be interested in knowing. It's the timeout part of your problem th

Re: followup: stuck lmtpd processes

2003-09-24 Thread Rob Siemborski
On Tue, 23 Sep 2003, Andrew Morgan wrote: > I think your patch would fix the problem where a lot of processes are > contending for a lock (by making them retry), but it wouldn't help if a > single process keeps the lock indefinitely. I agree. The whole act of retrying for a lock is pretty sill

Re: followup: stuck lmtpd processes

2003-09-24 Thread John Wade
Oops. Sorry, my mailbox went temporarily over quota and the delivery of the original thread was deferred until after I had read and responded to the followup. It looks like the locking mechanism is working correctly here and the bug is really in the network timeout (or in the implementati

Re: followup: stuck lmtpd processes

2003-09-23 Thread Andrew Morgan
On Tue, 23 Sep 2003, John Wade wrote: > Hi Andrew, > > I was the one who wrote the message you found. I finally came to the > conclusion that the flat file locking mechanism is somewhat broken in > Cyrus, but I was never a good enough C programmer to pin down what was > happening. (The mmap s

Re: followup: stuck lmtpd processes

2003-09-23 Thread John Wade
Hi Andrew, I was the one who wrote the message you found. I finally came to the conclusion that the flat file locking mechanism is somewhat broken in Cyrus, but I was never a good enough C programmer to pin down what was happening. (The mmap stuff makes it really tricky to debug.) I want

followup: stuck lmtpd processes

2003-09-23 Thread Andrew Morgan
Following up on my previous post about stuck lmtpd processes. I found this incredibly detailed faq at: http://www.faqchest.com/prgm/cyrus-l/cyrus-01/cyrus-0111/cyrus-011102/cyrus0023_33254.html This isn't exactly the same problem, but the steps on that page helped me figure out that they ar