Kurt Roeckx wrote: > It seems that parallel can return -1 in most of the cases for no > good reason. All childs returned 0, and there were no other > errors.
Do you have a test case? > The code looks like: > int wait_for_child(int options) { > id_t id_ignored = 0; > siginfo_t infop; > > infop.si_pid = 0; > waitid(P_ALL, id_ignored, &infop, WEXITED | options); > if (infop.si_pid == 0) > return -1; /* Nothing to wait for */ > if (infop.si_code == CLD_EXITED) > return infop.si_status; > return 1; > } > > And this return value is or'd between all calls. > > The first thing that looks wrong to this is that the return > value of waitid() is not checked. The strace shows that > I have atleast 2 calls to it that return -1. It most > likely still contains a pid of 0 in that case. You really > don't want to return -1 for that. si_pid will only be nonzero if waitid succeeded. If waitid fails, the code returns -1, and this is ORed to the returncode. Since the returncode is an int, it then becomes 255, so that's what parallel will exit if it fails to wait for a child. > An other reason that you could get that -1 is that you called > it with WNOHANG. In that case you check for the value > being smaller than 0 and don't add it. But I think values > smaller than 0 are valid, and so you can't tell the difference > between a child that exited and returning because you don't > want to wait. AFAIK a child process cannot really exit nonzero. Process exit codes are limited to 0..255, and exit(-1) == 255. While the lack of explicit signed ints in the code is a bit confusing, wait_for_child() can return a -1 that is distinguishable from a 255 by the code that tests that. > I think the code will also behave wrong in case the child > did not call exit() itself but instead got killed or something. > In that case infop.si_code can be != CLD_EXITED, and you'd > want to consider that a reason to make parallel return an > non-zero exit code. Returning a non-zero exit code of 1 is the fall-through case at the end of the function, and seems to work ok: j...@gnu:~>./segfault zsh: segmentation fault ./segfault j...@gnu:~>parallel segfault -- 1 2 3 zsh: exit 1 parallel segfault -- 1 2 3 -- see shy jo
signature.asc
Description: Digital signature