Both copy_process() and alloc_pid() do the same PIDNS_ADDING check.
The reasons for these checks, and the fact that both are necessary,
are not immediately obvious. Add the comments.

Signed-off-by: Oleg Nesterov <[email protected]>
---
 kernel/fork.c | 6 +++++-
 kernel/pid.c  | 5 +++++
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 544fe1b43d88..7cfa8addc080 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2392,7 +2392,11 @@ __latent_entropy struct task_struct *copy_process(
 
 	rseq_fork(p, clone_flags);
 
-	/* Don't start children in a dying pid namespace */
+	/*
+	 * If zap_pid_ns_processes() was called after alloc_pid(), the new
+	 * child missed SIGKILL. If current is not in the same namespace,
+	 * we can't rely on fatal_signal_pending() below.
+	 */
 	if (unlikely(!(ns_of_pid(pid)->pid_allocated & PIDNS_ADDING))) {
 		retval = -ENOMEM;
 		goto bad_fork_core_free;
diff --git a/kernel/pid.c b/kernel/pid.c
index 1a0d2ac1f4a9..082a3c4a053f 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -317,6 +317,11 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *arg_set_tid,
 	 *
 	 * This can't be done earlier because we need to preserve other
 	 * error conditions.
+	 *
+	 * We need this even if copy_process() does the same check. If two
+	 * or more tasks from parent namespace try to inject a child into a
+	 * dead namespace, one of free_pid() calls from the copy_process()
+	 * error path may try to wakeup the possibly freed ns->child_reaper.
 	 */
 	retval = -ENOMEM;
 	if (unlikely(!(ns->pid_allocated & PIDNS_ADDING)))
-- 
2.52.0

