I'm looking at an issue where the master process sometimes hangs on shutdown, 
apparently because a SIGCHLD is lost during the shutdown process.

See my report here:  https://github.com/cyrusimap/cyrus-imapd/issues/5309.  The 
issue appeared when I upgraded from 3.4.x to 3.8.4, but I suspect it's 
timing-related, and not directly related to changes made in the code.

I'm running Cyrus imapd 3.8.5, and I end up with a child process (or two, or 
more) in a zombie state:

cyrus    10934   0.1  0.0       0       0  -  Z    08:50      0:00.02 <defunct>

These are usually imapd or pop3d processes, and prevent master from shutting 
down.

With a lot of additional syslog messages, I can see that SIGCHLD is handled 
successfully for most of the child processes during shutdown, but the signal 
for the last one (or more) sometimes doesn't seem to be delivered to master 
(or perhaps doesn't trigger the handler).

pselect continues to be called regularly, and so does child_janitor, but master 
never exits because it's waiting for the child's signal to trigger reap_child.

I can get the shutdown working reliably by explicitly calling reap_child in the 
main loop when shutdown has been requested:

        /* workaround: during shutdown, reap without waiting for SIGCHLD */
        if (in_shutdown)
            reap_child();

        gettimeofday(&now, 0);
        child_janitor(now);
        do_prom_report(now);

It's possible that this is happening during normal operation too, but that it's 
not noticed because subsequent SIGCHLD signals from other exiting children 
trigger reap_child, which cleans up all children.
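
For context, standard signals such as SIGCHLD are not queued: if several 
children exit while SIGCHLD is blocked or already pending, the parent may get 
only one delivery. The usual defence is to loop on waitpid(-1, ..., WNOHANG) 
until nothing more is reapable, and to re-check after every wakeup. The sketch 
below is a standalone illustration of that pattern, not code from master; 
reap_demo, drain_children and on_sigchld are made-up names, and it uses 
sigsuspend where master uses pselect (both swap the signal mask atomically).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Exists only so that SIGCHLD interrupts sigsuspend(); the default
 * disposition would ignore it. All reaping happens in the main loop. */
static void on_sigchld(int sig)
{
    (void)sig;
}

/* Reap every child that has already exited; return how many were reaped. */
static int drain_children(void)
{
    int n = 0;
    int status;

    for (;;) {
        pid_t pid = waitpid(-1, &status, WNOHANG);
        if (pid > 0) { n++; continue; }
        if (pid < 0 && errno == EINTR) continue;
        break;  /* 0: none exited yet; -1 with ECHILD: no children left */
    }
    return n;
}

/* Hypothetical demo: fork nchildren short-lived children, reap them all. */
int reap_demo(int nchildren)
{
    struct sigaction sa, old_sa;
    sigset_t block, oldmask, waitmask;
    int reaped = 0;
    int i;

    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGCHLD, &sa, &old_sa);

    /* Keep SIGCHLD blocked except while inside sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &oldmask);
    waitmask = oldmask;
    sigdelset(&waitmask, SIGCHLD);

    for (i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        assert(pid >= 0);
        if (pid == 0)
            _exit(0);           /* child exits immediately */
    }

    for (;;) {
        /* Drain first: one SIGCHLD may stand for several exits. */
        reaped += drain_children();
        if (reaped >= nchildren)
            break;
        /* Atomically unblock SIGCHLD and sleep until a signal arrives. */
        sigsuspend(&waitmask);
    }

    sigaction(SIGCHLD, &old_sa, NULL);
    sigprocmask(SIG_SETMASK, &oldmask, NULL);
    return reaped;
}
```

Calling reap_demo(5) should return 5 regardless of how many times the handler 
actually ran, because the loop counts reaped children rather than signals.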

This fix works, but is perhaps not an elegant way to fix the problem.  Does 
anybody have any suggestions on a better fix?  I'm happy to test fixes.

Note that I'm running on FreeBSD 13.4.  Дилян Палаузов reported a similar 
issue, presumably on Linux 
(https://cyrus.topicbox.com/groups/devel/Tddff7ee5049ffbc4). 