On Tue, Jan 24, 2023 at 10:25:11AM +0600, NRK wrote:
> On Mon, Jan 23, 2023 at 08:05:19PM +0100, Hiltjo Posthuma wrote:
> > Do you perhaps also have some simple way to reproduce where it causes
> > issues in
> > some real world use-case?
> >
> > Ideally some command or script to reproduce the issue.
>
> To trigger the issue, you need to have 3 conditions met:
>
> 1. Recieve a SIGCHLD.
> 2. `waitpid` failing and spoiling the `errno`.
> 3. Being unlucky that the resuming code gets affected by the `errno`.
>
> Here's a simple dummy program demonstrating the issue. Compile it, run
> it and then send some SIGCHLDs to it:
>
> [/tmp]~> cat sigtest.c
> #include <stdlib.h>
> #include <signal.h>
> #include <errno.h>
> #include <stdio.h>
> #include <sys/wait.h>
>
> void
> sigchld(int unused)
> {
> if (signal(SIGCHLD, sigchld) == SIG_ERR)
> _Exit(1);
> while (0 < waitpid(-1, NULL, WNOHANG));
> }
>
> int main(void)
> {
> sigchld(0);
> while (1) {
> errno = 0;
> char *p = malloc(8);
> if (p != NULL && errno) {
> perror("sigtest: ");
> return 1;
> }
> free(p);
> }
> }
> [/tmp]~> cc -o sigtest sigtest.c
> [/tmp]~> ./sigtest &
> [1] 9363
> [/tmp]~> while pidof sigtest >/dev/null; do kill -s SIGCHLD $(pidof
> sigtest); done
> sigtest: : No child processes
>
> If you're asking for `dwm` in specific - then trigger condition 2 is
> difficult (impossible ??) since the window manager will typically have
> some children. But if you insert the following, to simulate failure:
>
> while (0 < waitpid(-1, NULL, WNOHANG));
> + errno = ECHILD;
>
> then the issue is easily reproducable by sending some SIGCHLDs to dwm
> (in quick succession in order to maximize chances of triggering
> condition 3):
>
> var=$(pidof dwm); while :; do kill -s SIGCHLD $var; done
>
> And dwm will crash with a XIO error with in a couple seconds on my
> system.
>
> - NRK
>
Thanks, I can see how it can be an issue clobbering errno potentially etc.
Although of course checking errno on a success condition is a bit wonky in this
test case.
What about a simplified version of a patch below, it should do a similar thing:
>From d628a1081a2403e3b703b2f1acbff1f0ca148956 Mon Sep 17 00:00:00 2001
From: Hiltjo Posthuma <[email protected]>
Date: Tue, 24 Jan 2023 20:53:56 +0100
Subject: [PATCH] simplify ignoring SIGCHLD
... and handle in a way with less possible race-conditions.
https://pubs.opengroup.org/onlinepubs/9699919799/functions/_Exit.html#tag_16_01_03_01
"[XSI] [Option Start] If the parent process of the calling process has set its
SA_NOCLDWAIT flag or has set the action for the SIGCHLD signal to SIG_IGN:
[...] The lifetime of the calling process shall end immediately. If
SA_NOCLDWAIT is set, it is implementation-defined whether a SIGCHLD signal is
sent to the parent process. [...]"
---
dwm.c | 12 ++----------
1 file changed, 2 insertions(+), 10 deletions(-)
diff --git a/dwm.c b/dwm.c
index 03baf42..b3c1fc6 100644
--- a/dwm.c
+++ b/dwm.c
@@ -205,7 +205,6 @@ static void setmfact(const Arg *arg);
static void setup(void);
static void seturgent(Client *c, int urg);
static void showhide(Client *c);
-static void sigchld(int unused);
static void spawn(const Arg *arg);
static void tag(const Arg *arg);
static void tagmon(const Arg *arg);
@@ -1545,7 +1544,8 @@ setup(void)
Atom utf8string;
/* clean up any zombies immediately */
- sigchld(0);
+ if (signal(SIGCHLD, SIG_IGN) == SIG_ERR)
+ die("can't ignore SIGCHLD:");
/* init screen */
screen = DefaultScreen(dpy);
@@ -1638,14 +1638,6 @@ showhide(Client *c)
}
}
-void
-sigchld(int unused)
-{
- if (signal(SIGCHLD, sigchld) == SIG_ERR)
- die("can't install SIGCHLD handler:");
- while (0 < waitpid(-1, NULL, WNOHANG));
-}
-
void
spawn(const Arg *arg)
{
--
2.39.1
--
Kind regards,
Hiltjo