Package: dpkg
Version: 1.14.31

I've been looking into an issue on our product where a network restart was 
taking a long time to perform.
I tracked it down to when the ssh server is stopped in /etc/init.d/ssh where is 
passed a restart command.
For a restart sshd uses start-stop-daemon with --stop --retry 30 ... --pidfile 
/var/run/sshd.pid
If I remove the --retry option the network restarts within a few seconds, with 
the --retry option is takes around a minute.

After looking through the source I see that if --retry 30 is set the schedule 
is set to
1) Send SIGTERM
2) Wait 30 Seconds polling to see if the process has stopped
3) If still running send SIGKILL
4) Again wait 30 seconds and poll.

Following the code I see that end up in do_stop() which calls do_findprocs() 
and then to do_pidfile() which then uses the check()
function to determine whether the process is running by populating the found 
list.
But looking through the check() function it handles --exec --user and --name but doesn't seem to handle pidfile.

I cobbled together the following code and it seems to have cured the problem, 
although I have to admit I haven't
tested it extensively yet.

I added this funcion:
static int
pid_is_pidfile(pid_t pid, const char *pidfile)
{
    FILE *f;
    int c;
    int pidfromfile;

    f = fopen(pidfile, "r");
    if (!f)
        return 0;
    else {
        /* pidfile exists, check to see if it contains a different pid */
        if (fscanf(f, "%d",&pidfromfile) == 1) {
            if(pid != pidfromfile) {
                fclose(f);
                return 0;
            }
        }
    }
    /* pidfile exists and contains the pid we are checking */
    fclose(f);
    return 1;
}

....

Then changed check to use it:
static void
check(pid_t pid)
{
#if defined(OSLinux) || defined(OShpux)
    if (execname&&  !pid_is_exec(pid,&exec_stat))
        return;
#elif defined(OSHURD) || defined(OSFreeBSD) || defined(OSNetBSD)
    /* Let's try this to see if it works */
    if (execname&&  !pid_is_cmd(pid, execname))
        return;
#endif
    if (pidfile&&  !pid_is_pidfile(pid, pidfile))
        return;
    if (userspec&&  !pid_is_user(pid, user_id))
        return;
    if (cmdname&&  !pid_is_cmd(pid, cmdname))
        return;
    if (start&&  !pid_is_running(pid))
        return;
    push(&found, pid);
}


I would appreciate it if someone could confirm this is a bug and that the fix 
is ok and hasn't broken anything else.

I'm running Debian lenny on an embedded arm board with a 2.6 linux OS.

Best Regards,
Martin.


--
Martin Townsend
Power*Oasis*




--
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org

Reply via email to