Encountered the following panic:

panic: kernel diagnostic assertion "(p->p_flag & P_TIMEOUT) == 0" failed: file 
"/usr/src/sys/kern/kern_synch.c", line 373
Stopped at      db_enter+0x10:  popq    %rbp
    TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
 423109  57118     55         0x3          0    2  link
 330695  30276     55         0x3          0    3  link
* 46366  85501     55      0x1003  0x4080400    1  link
 188803  85501     55      0x1003  0x4082000    0K link
db_enter() at db_enter+0x10
panic(ffffffff81f25d2b) at panic+0xbf
__assert(ffffffff81f9a186,ffffffff81f372c8,175,ffffffff81f87c6c) at 
__assert+0x25
sleep_setup(ffff800022d64bf8,ffff800022d64c98,20,ffffffff81f66ac6,0) at 
sleep_setup+0x1d8
cond_wait(ffff800022d64c98,ffffffff81f66ac6) at cond_wait+0x46
timeout_barrier(ffff8000228a28b0) at timeout_barrier+0x109
timeout_del_barrier(ffff8000228a28b0) at timeout_del_barrier+0xa2
sleep_finish(ffff800022d64d90,1) at sleep_finish+0x16d
tsleep(ffffffff823a5130,120,ffffffff81f0b730,2) at tsleep+0xb2
sys_nanosleep(ffff8000228a27f0,ffff800022d64ea0,ffff800022d64ef0) at 
sys_nanosleep+0x12d
syscall(ffff800022d64f60) at syscall+0x374

The panic is a regression of sys/kern/kern_timeout.c r1.84. Previously,
soft-interrupt-driven timeouts could be deleted synchronously without
blocking. Now, timeout_del_barrier() can sleep regardless of the type
of the timeout.

It looks that with small adjustments timeout_del_barrier() can sleep
in sleep_finish(). The management of run queues is not affected because
the timeout clearing happens after it. As timeout_del_barrier() does not
rely on a timeout or signal catching, there should be no risk of
unbounded recursion or unwanted signal side effects within the sleep
machinery. In a way, a sleep with a timeout is higher-level than
one without.

Note that endtsleep() can run and set P_TIMEOUT during
timeout_del_barrier() when the thread is blocked in cond_wait().
To avoid unnecessary atomic read-modify-write operations, the clearing
of P_TIMEOUT could be conditional, but maybe that is an unnecessary
optimization at this point.

While it should be possible to make the code use timeout_del() instead
of timeout_del_barrier(), the outcome might not be outright better. For
example, sleep_setup() and endtsleep() would have to coordinate so that
a late-running timeout from previous sleep cycle would not disturb the
new cycle.

To test the barrier path reliably, I made the code call
timeout_del_barrier() twice in a row. The second call is guaranteed
to sleep. Of course, this is not part of the patch.

OK?

Index: kern/kern_synch.c
===================================================================
RCS file: src/sys/kern/kern_synch.c,v
retrieving revision 1.187
diff -u -p -r1.187 kern_synch.c
--- kern/kern_synch.c   13 May 2022 15:32:00 -0000      1.187
+++ kern/kern_synch.c   5 Jun 2022 05:04:45 -0000
@@ -370,8 +370,8 @@ sleep_setup(struct sleep_state *sls, con
        p->p_slppri = prio & PRIMASK;
        TAILQ_INSERT_TAIL(&slpque[LOOKUP(ident)], p, p_runq);
 
-       KASSERT((p->p_flag & P_TIMEOUT) == 0);
        if (timo) {
+               KASSERT((p->p_flag & P_TIMEOUT) == 0);
                sls->sls_timeout = 1;
                timeout_add(&p->p_sleep_to, timo);
        }
@@ -432,13 +432,12 @@ sleep_finish(struct sleep_state *sls, in
 
        if (sls->sls_timeout) {
                if (p->p_flag & P_TIMEOUT) {
-                       atomic_clearbits_int(&p->p_flag, P_TIMEOUT);
                        error1 = EWOULDBLOCK;
                } else {
-                       /* This must not sleep. */
+                       /* This can sleep. It must not use timeouts. */
                        timeout_del_barrier(&p->p_sleep_to);
-                       KASSERT((p->p_flag & P_TIMEOUT) == 0);
                }
+               atomic_clearbits_int(&p->p_flag, P_TIMEOUT);
        }
 
        /* Check if thread was woken up because of a unwind or signal */

Reply via email to