On Mon, Jan 24, 2011 at 09:33:40AM -0800, Philip Guenther wrote: > The other thing I need to finish double checking is whether the nesting of > _thread_kern_in_sched vs _queue_signals is correct here: > > > @@ -481,6 +487,9 @@ _thread_kern_sched(struct sigcontext * s > > */ > > curthread = _get_curthread(); > > _thread_kern_in_sched = 0; > > + > > + /* Allow signals again. */ > > + _queue_signals = 0; > > > > /* run any installed switch-hooks */ > > if ((_sched_switch_hook != NULL) && > > ...or whether the order should be flipped.
You're right, there's a problem. This is tricky. If I'm not mistaken, with the current ordering, we get the following: A signal is caught in-between setting these global variables, i.e. they are: _thread_kern_in_sched == 0 && _queue_signals == 1 _thread_sig_handler() { if (sig == _SCHED_SIGNAL) { if (There are pending signals for the current thread, i.e. a signal was received while we were in the critical section protected by the patch) { _SCHED_SIGNAL is ignored, current thread will yield in _thread_kern_sig_undefer() } else { _thread_kern_sched() is called with _queue_signals == 1. It immediately sets _thread_kern_in_sched to 1. Signals are blocked earlier than usual during _thread_kern_sched(), but it eventually wants _queue_signals == 1 anyway. No biggie. A new thread might be scheduled. Either the interrupted thread is scheduled again right way, or another _SCHED_SIGNAL eventually causes _thread_kern_sched() to resume the interrupted thread. When the interrupted thread resumes it is still in _thread_kern_sched(), about to set _queue_signals to zero. But the global _thread_kern_in_sched is 1 from the entry to _thread_kern_sched() and will NOT be set to zero cause the thread has already done that! We did effectively jump within _thread_kern_in_sched(), skipping the part that sets _thread_kern_in_sched to zero again. Now we're stuck with _thread_kern_in_sched = 1. } } else { signal is queued because of _queue_signals == 1 } return to interrupted thread, which proceeds to set _queue_signals to zero, or to same or a newly scheduled thread in case sig == _SCHED_SIGNAL and there were no pending signals. } The other case is correct: _thread_kern_in_sched == 1 && _queue_signals == 0 _thread_sig_handler() { if (sig == _SCHED_SIGNAL) { signal is ignored because of _thread_kern_in_sched == 1; } else { _queue_signals = 1; signal is handled, possibly dispatched to application _queue_signals = 0; } return to interrupted thread, which proceeds to set _thread_kern_in_sched to zero. } New diff, also removing the volatile change cause we're not sure if it's necessary: Index: uthread/uthread_kern.c =================================================================== RCS file: /cvs/src/lib/libpthread/uthread/uthread_kern.c,v retrieving revision 1.36 diff -u -p -r1.36 uthread_kern.c --- uthread/uthread_kern.c 21 May 2007 16:50:36 -0000 1.36 +++ uthread/uthread_kern.c 24 Jan 2011 20:11:09 -0000 @@ -440,6 +440,12 @@ _thread_kern_sched(struct sigcontext * s _queue_signals = 0; } + /* + * Prevent the signal handler from fiddling with this + * thread before its state is set. + */ + _queue_signals = 1; + /* Make the selected thread the current thread: */ _set_curthread(pthread_h); curthread = pthread_h; @@ -480,6 +486,11 @@ _thread_kern_sched(struct sigcontext * s * before use. */ curthread = _get_curthread(); + + /* Allow signals again. */ + _queue_signals = 0; + + /* Done with scheduling. */ _thread_kern_in_sched = 0; /* run any installed switch-hooks */