Recently a mailserver here got into a bit of trouble. It's a Debian machine running a 2.4.25 kernel, Cyrus 2.2.5 (not the Debian package), with Exim 4.34 as the MTA. The symptoms seemed to echo the process accounting bug from 2.1: at various intervals there'd be no lmtpd processes available for deliveries, and the master process would not spawn any more. Looking at some debugging information, it seems that the master process became convinced that there were more ready_workers than there were nactive processes, so when the last of the lmtpd processes exited naturally, it saw no need to spawn any more. I have a kludge in place at the moment that spawns a new service process based on both ready_workers and nactive, instead of just ready_workers, but I'd like a more stable solution.
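To make the kludge concrete, here is a minimal sketch of the kind of check described above. It is written in Python for brevity and is not the Cyrus master source. Only the names ready_workers and nactive come from the description; ServiceCounters and both needs_spawn_* functions are hypothetical, and the exact condition master uses is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ServiceCounters:
    """Hypothetical stand-in for the per-service counters kept by master."""
    ready_workers: int  # workers master believes are idle and waiting
    nactive: int        # worker processes master believes are alive


def needs_spawn_ready_only(svc: ServiceCounters) -> bool:
    """Decision based on ready_workers alone (assumed pre-kludge behaviour)."""
    return svc.ready_workers <= 0


def needs_spawn_with_nactive(svc: ServiceCounters) -> bool:
    """Kludge: never trust more ready workers than there are live processes."""
    effective_ready = min(svc.ready_workers, svc.nactive)
    return effective_ready <= 0


# Counters have drifted: two "ready" workers recorded, but no live processes.
drifted = ServiceCounters(ready_workers=2, nactive=0)
print(needs_spawn_ready_only(drifted))    # False: no new lmtpd is started
print(needs_spawn_with_nactive(drifted))  # True: a new lmtpd is started
```

Clamping ready_workers to nactive only hides the drift between the two counters; it does not explain why they disagree in the first place.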
It may be related that, while deliveries were happening and lmtpd was running, both deliver and lmtpd were operating slowly, as both were blocking on the Unix domain socket they use to communicate. Even with more idle lmtpd processes than deliver processes, deliver was still blocking when trying to access the socket. This was only fixed by rebooting the machine. I'm unsure exactly what's going on; since this is a production mail system, I can't do much tinkering with it. Any advice on what happened and what can be done to stop it happening again would be appreciated.

Thanks,
Simon.

---
Cyrus Home Page: http://asg.web.cmu.edu/cyrus
Cyrus Wiki/FAQ: http://cyruswiki.andrew.cmu.edu
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html