merge 157305 352554
thanks

I guess this is the same bug.  It seems really hard to believe that
this race happens as often as I've seen it:
       
     1          if (!found)
     2                  return;
       
     3          clear(&killed);
       
     4          for (p = found; p; p = p->next) {
     5                  if (testmode) {
     6                          printf("Would send signal %d to %d.\n",
     7                                 signal_nr, p->pid);
     8                          (*n_killed)++;
     9                  } else if (kill(p->pid, signal_nr) == 0) {
       

If there are no processes at 1), the function returns, and s-s-d
ends.  3) just clears a list (but I note that it also calls free).  4)
is a loop, which will only iterate over one object (udev) in my
(common) case.  Apparently 9) is what causes the warning, since s-s-d
doesn't call kill anywhere else.

So it seems safe to say that the process is dying while the other
processes are being searched, since the only other place a syscall
could be happening is in free() (I'm assuming that very short userspace
loops have negligible delay, and that SSD is interrupted by the
scheduler during those loops negligibly often).
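
To make the failure mode concrete, here is a minimal sketch (not s-s-d
code; the function name is made up) of what kill() reports when the
target has already gone away by the time step 9) runs.  It forces the
ordering with fork() and waitpid(), where the real race depends on
timing:

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the errno left by kill() on a process that
 * has already exited and been reaped, 0 if kill() unexpectedly
 * succeeded, or -1 if the setup itself failed. */
int errno_after_target_exits(void)
{
    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0)
        _exit(0);               /* stands in for the short-lived target */

    /* Reap the child so its PID no longer refers to any process. */
    if (waitpid(child, NULL, 0) < 0)
        return -1;

    /* This plays the role of step 9) in the excerpt above. */
    if (kill(child, SIGTERM) == 0)
        return 0;
    return errno;
}
```

Barring PID reuse in the tiny window after waitpid(), I'd expect this to
return ESRCH, which is the condition behind the warning.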

Loic Minier's two patches do something completely different, and I'm
not sure I like it.  They cause ssd --pidfile to emit no warning when
the pidfile includes a PID which doesn't exist; that is different from
my problem, and I think that a warning *is* appropriate in this
scenario (and if the caller doesn't want that to cause a nonzero
retval, then they should use --oknodo).

The existing patch by Samuel Thibault should mostly fix this, by
ignoring processes which die before a SIG_0 was sent.  This makes
sense, since SIG_0 is just to check if the process is alive.  This
patch still allows a false-positive warning, in the case that the
process dies before the *requested* signal is sent, but this would
happen much less frequently.  Actually, only when a PID is explicitly
fed to SSD is a warning interesting at all, and Loic's patch removes
this warning.  Another approach would be to add an explicit check for
the --pidfile case (and the wishlist --pid case) that the given PID
exists, but in every other case to not warn when either: kill()!=-1 ||
errno==ESRCH (since a process disappearing can actually be *expected*;
after all, we are calling "kill" with default arguments "SIGTERM" and
"SIGKILL").  This would allow for no false positive warnings.
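
A rough sketch of that last approach, to show the decision I have in
mind.  This is not a patch against s-s-d: should_warn(), send_signal()
and the pid_was_explicit flag are names I made up, and I'm assuming the
caller knows whether the PID came from --pidfile (or --pid) rather than
from a process search:

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: decide whether a kill() result deserves a warning.
 * kill_rc is kill()'s return value and kill_errno the errno it left. */
int should_warn(int kill_rc, int kill_errno, int pid_was_explicit)
{
    if (kill_rc != -1)
        return 0;                   /* signal was delivered */
    if (kill_errno == ESRCH)
        return pid_was_explicit;    /* gone: only odd if we were told the PID */
    return 1;                       /* EPERM, EINVAL, ...: always worth a warning */
}

/* Hypothetical wrapper: returns 1 if the signal was sent, 0 otherwise. */
int send_signal(pid_t pid, int signal_nr, int pid_was_explicit)
{
    int rc = kill(pid, signal_nr);
    int err = (rc == -1) ? errno : 0;

    if (should_warn(rc, err, pid_was_explicit))
        fprintf(stderr, "warning: failed to kill %d: %s\n",
                (int)pid, strerror(err));
    return rc != -1;
}
```

With this, a process found by searching that vanishes before the signal
arrives is silently skipped, while a stale PID from a pidfile still
produces the warning.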

Justin

