On Mon, Jan 24, 2011 at 09:33:40AM -0800, Philip Guenther wrote:
> The other thing I need to finish double checking is whether the nesting of 
> _thread_kern_in_sched vs _queue_signals is correct here:
> 
> > @@ -481,6 +487,9 @@ _thread_kern_sched(struct sigcontext * s
> >                      */
> >                     curthread = _get_curthread();
> >                     _thread_kern_in_sched = 0;
> > +
> > +                   /* Allow signals again. */
> > +                   _queue_signals = 0;
> >  
> >                     /* run any installed switch-hooks */
> >                     if ((_sched_switch_hook != NULL) &&
> 
> ...or whether the order should be flipped.

You're right, there's a problem. This is tricky.

If I'm not mistaken, with the current ordering, we get the following:

A signal is caught in-between setting these global variables,
i.e. they are: _thread_kern_in_sched == 0 && _queue_signals == 1

_thread_sig_handler()
{
        if (sig == _SCHED_SIGNAL) {
                if (There are pending signals for the current thread,
                    i.e. a signal was received while we were in the critical
                    section protected by the patch) {
                        _SCHED_SIGNAL is ignored, current thread will
                        yield in _thread_kern_sig_undefer()
                } else {
                        _thread_kern_sched() is called with
                        _queue_signals == 1. It immediately sets
                        _thread_kern_in_sched to 1.

                        Signals are blocked earlier than usual during
                        _thread_kern_sched(), but it eventually wants
                        _queue_signals == 1 anyway. No biggie.

                        A new thread might be scheduled.
                        Either the interrupted thread is scheduled again
                        right way, or another _SCHED_SIGNAL eventually
                        causes _thread_kern_sched() to resume the interrupted
                        thread.

                        When the interrupted thread resumes it is still
                        in _thread_kern_sched(), about to set
                        _queue_signals to zero.
                        But the global _thread_kern_in_sched is 1 from the
                        entry to _thread_kern_sched() and will NOT be set
                        to zero cause the thread has already done that!

                        We did effectively jump within _thread_kern_in_sched(),
                        skipping the part that sets _thread_kern_in_sched to
                        zero again. Now we're stuck with
                        _thread_kern_in_sched = 1.
                }
         } else {
                signal is queued because of _queue_signals == 1
         }

        return to interrupted thread, which proceeds to set
        _queue_signals to zero,
        or to same or a newly scheduled thread in case sig == _SCHED_SIGNAL
        and there were no pending signals.
}


The other case is correct:

_thread_kern_in_sched == 1 && _queue_signals == 0

_thread_sig_handler()
{
        if (sig == _SCHED_SIGNAL) {
                signal is ignored because of _thread_kern_in_sched == 1;
         } else {
                _queue_signals = 1;
                signal is handled, possibly dispatched to application
                _queue_signals = 0;
         }

        return to interrupted thread, which proceeds to set
        _thread_kern_in_sched to zero.
}


New diff, also removing the volatile change cause we're not sure if
it's necessary:

Index: uthread/uthread_kern.c
===================================================================
RCS file: /cvs/src/lib/libpthread/uthread/uthread_kern.c,v
retrieving revision 1.36
diff -u -p -r1.36 uthread_kern.c
--- uthread/uthread_kern.c      21 May 2007 16:50:36 -0000      1.36
+++ uthread/uthread_kern.c      24 Jan 2011 20:11:09 -0000
@@ -440,6 +440,12 @@ _thread_kern_sched(struct sigcontext * s
                                _queue_signals = 0;
                        }
 
+                       /*
+                        * Prevent the signal handler from fiddling with this
+                        * thread before its state is set.
+                        */
+                       _queue_signals = 1;
+
                        /* Make the selected thread the current thread: */
                        _set_curthread(pthread_h);
                        curthread = pthread_h;
@@ -480,6 +486,11 @@ _thread_kern_sched(struct sigcontext * s
                         * before use.
                         */
                        curthread = _get_curthread();
+
+                       /* Allow signals again. */
+                       _queue_signals = 0;
+
+                       /* Done with scheduling. */
                        _thread_kern_in_sched = 0;
 
                        /* run any installed switch-hooks */

Reply via email to