Hi, I'm trying to understand what approach bash takes to async-signal safety in its design.
Generally, programs that use signals (such as SIGCHLD or SIGALRM) must make sure that

(a) they do not access state (such as variables) from within a signal handler if that state could also be accessed from the main control flow at a point where the signal could be delivered, and the two accesses could conflict in some way;

(b) they do not call POSIX functions from a signal handler if those functions could also be executing in the main control flow when the signal is delivered, unless POSIX marks the function as async-signal-safe. That set does not include stdio functions such as printf or fprintf.

Looking at the bash 5.0 code, I see some comments about strategies to protect the jobs array and other data structures from arriving SIGCHLD signals, but I have questions about, for instance, these:

- printable_job_status uses a 'static' variable "temp". However, printable_job_status is called during the execution of the builtin command "jobs", and there (I believe, at least) without blocking or queuing SIGCHLD. Therefore, if set -b is in effect, it could be reentered if a child process exits at that time. This could clobber 'temp'.

- If set -b is in effect, calls to notify_job_status from the SIGCHLD handler may invoke printf or fprintf(stderr, ...), which are also called on builtin paths, for instance, when executing just "set" and listing all variables/functions.

- The SIGALRM handler in eval.c calls printf and fflush(stdout), even though SIGALRM doesn't appear to be blocked elsewhere where printf() is called.

(These 3 things are what I was able to spot in a few minutes; this is not meant to be an exhaustive list.)

If my observations are correct, the existing code may not be async-signal safe.
The path that calls printf() from the SIGCHLD handler (if it exists) is particularly concerning because it is prone to deadlocks <http://www.cs.cmu.edu/afs/cs/academic/class/15213-s19/www/code/15-ecf-signals/signaldeadlock.c>: at least on GNU/Linux systems, stdio acquires a lock, and a handler that interrupts a printf() in the main control flow and then calls printf() itself will wait forever for that lock. This might actually show up under stress testing or in daily use. Printing to stderr seems less prone to this in my experience, perhaps because stderr is unbuffered, but it would still fall outside what POSIX guarantees for those functions.

Thanks.

 - Godmar