On Tue, Dec 01, 2020 at 10:27:15AM -0300, Martin Pieuchot wrote:
> On 01/12/20(Tue) 10:21, Claudio Jeker wrote:
> > On Mon, Nov 30, 2020 at 07:19:28PM -0300, Martin Pieuchot wrote:
> > > On 04/11/20(Wed) 11:19, Martin Pieuchot wrote:
> > > > Here's a 3rd approach to solve the TOCTOU race in single_thread_set().
> > > > The issue being that the lock serializing access to `ps_single' is not
> > > > held when calling single_thread_check().
> > > > 
> > > > The approach below is controversial because it extends the scope of the
> > > > SCHED_LOCK().  On the other hand, the two others approaches that both
> > > > add a new lock to avoid this race ignore the fact that accesses to
> > > > `ps_single' are currently not clearly serialized w/o KERNEL_LOCK().
> > > > 
> > > > So the diff below improves the situation in that regard and do not add
> > > > more complexity due to the use of multiple locks.  After having looked
> > > > for a way to split the SCHED_LOCK() I believe this is the simplest
> > > > approach.
> > > > 
> > > > I deliberately used a *_locked() function to avoid grabbing the lock
> > > > recursively as I'm trying to get rid of the recursion, see the other
> > > > thread on tech@.
> > > > 
> > > > That said the uses of `ps_single' in ptrace_ctrl() are not covered by
> > > > this diff and I'd be glad to hear some comments about them.  This is
> > > > fine as long as all the code using `ps_single' runs under KERNEL_LOCK()
> > > > but since we're trying to get the single_thread_* API out of it, this
> > > > need to be addressed.
> > > > 
> > > > Note that this diff introduces a helper for initializing ps_single*
> > > > values in order to keep all the accesses of those fields in the same
> > > > file.
> > > 
> > > Anyone?  With this only the `ps_threads' iteration must receive some
> > > love in order to take the single_thread_* API out of the KERNEL_LOCK().
> > > For that I just sent a SMR_TAILQ conversion diff.
> > > 
> > > Combined with the diff to remove the recursive attribute of the
> > > SCHED_LOCK() we're ready to split it into multiple mutexes.
> > > 
> > > Does that make any sense?  Comments?  Oks?
> > > 
> > > > Index: kern/kern_fork.c
> > > > ===================================================================
> > > > RCS file: /cvs/src/sys/kern/kern_fork.c,v
> > > > retrieving revision 1.226
> > > > diff -u -p -r1.226 kern_fork.c
> > > > --- kern/kern_fork.c    25 Oct 2020 01:55:18 -0000      1.226
> > > > +++ kern/kern_fork.c    4 Nov 2020 12:52:54 -0000
> > > > @@ -563,10 +563,7 @@ thread_fork(struct proc *curp, void *sta
> > > >          * if somebody else wants to take us to single threaded mode,
> > > >          * count ourselves in.
> > > >          */
> > > > -       if (pr->ps_single) {
> > > > -               atomic_inc_int(&pr->ps_singlecount);
> > > > -               atomic_setbits_int(&p->p_flag, P_SUSPSINGLE);
> > > > -       }
> > > > +       single_thread_init(p);
> > 
> > This is not the right name for this function. It does not initalize
> > anything. Why is this indirection needed? I would just put the
> > SCHED_LOCK() around this bit. It makes more sense especially with the
> > comment above.
> 
> Updated diff does that.  I introduced a function because it helps me
> having all the locking in the same place.
> 
> > > > Index: kern/kern_sig.c
> > > > ===================================================================
> > > > RCS file: /cvs/src/sys/kern/kern_sig.c,v
> > > > retrieving revision 1.263
> > > > diff -u -p -r1.263 kern_sig.c
> > > > --- kern/kern_sig.c     16 Sep 2020 13:50:42 -0000      1.263
> > > > +++ kern/kern_sig.c     4 Nov 2020 12:38:35 -0000
> > > > @@ -1932,11 +1932,27 @@ userret(struct proc *p)
> > > >         p->p_cpu->ci_schedstate.spc_curpriority = p->p_usrpri;
> > > >  }
> > > >  
> > > > +void
> > > > +single_thread_init(struct proc *p)
> > > > +{
> > > > +       struct process *pr = p->p_p;
> > > > +       int s;
> > > > +
> > > > +       SCHED_LOCK(s);
> > > > +       if (pr->ps_single) {
> > > > +               atomic_inc_int(&pr->ps_singlecount);
> > > > +               atomic_setbits_int(&p->p_flag, P_SUSPSINGLE);
> > > > +       }
> > > > +       SCHED_UNLOCK(s);
> > > > +}
> > > > +
> > > >  int
> > > > -single_thread_check(struct proc *p, int deep)
> > > > +_single_thread_check_locked(struct proc *p, int deep)
> > 
> > Please don't add the leading _ to this function. There is no need for it.
> 
> Done.
> 
> > > > @@ -2014,7 +2043,6 @@ single_thread_set(struct proc *p, enum s
> > > >                 panic("single_thread_mode = %d", mode);
> > > >  #endif
> > > >         }
> > > > -       SCHED_LOCK(s);
> > > >         pr->ps_singlecount = 0;
> > > >         membar_producer();
> > > >         pr->ps_single = p;
> > 
> > I think you can remove the membar_producer() here. It is of no use
> > anymore.
> 
> I can indeed because SCHED_LOCK() is currently taken by sleep_setup().
> I'd argue this is an implementation detail and I hope to change that
> soon.  So unless I'm completely mistaken, I'd keep it.  But again I
> don't have a strong opinion.

Me neither. I just came to the conclusion that ps_singlecount and
ps_single should be protected by a mutex (or for now SCHED_LOCK)
because the way the interactions work it is to complex to do this
lockless (at least the return on investment is not in favor here).
 
> Index: kern/kern_fork.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_fork.c,v
> retrieving revision 1.226
> diff -u -p -r1.226 kern_fork.c
> --- kern/kern_fork.c  25 Oct 2020 01:55:18 -0000      1.226
> +++ kern/kern_fork.c  1 Dec 2020 13:04:37 -0000
> @@ -516,7 +516,7 @@ thread_fork(struct proc *curp, void *sta
>       struct proc *p;
>       pid_t tid;
>       vaddr_t uaddr;
> -     int error;
> +     int s, error;
>  
>       if (stack == NULL)
>               return EINVAL;
> @@ -563,10 +563,12 @@ thread_fork(struct proc *curp, void *sta
>        * if somebody else wants to take us to single threaded mode,
>        * count ourselves in.
>        */
> +     SCHED_LOCK(s);
>       if (pr->ps_single) {
>               atomic_inc_int(&pr->ps_singlecount);
>               atomic_setbits_int(&p->p_flag, P_SUSPSINGLE);
>       }
> +     SCHED_UNLOCK(s);
>  
>       /*
>        * Return tid to parent thread and copy it out to userspace
> Index: kern/kern_sig.c
> ===================================================================
> RCS file: /cvs/src/sys/kern/kern_sig.c,v
> retrieving revision 1.264
> diff -u -p -r1.264 kern_sig.c
> --- kern/kern_sig.c   8 Nov 2020 20:37:24 -0000       1.264
> +++ kern/kern_sig.c   1 Dec 2020 13:04:48 -0000
> @@ -1941,10 +1941,12 @@ userret(struct proc *p)
>  }
>  
>  int
> -single_thread_check(struct proc *p, int deep)
> +single_thread_check_locked(struct proc *p, int deep)
>  {
>       struct process *pr = p->p_p;
>  
> +     SCHED_ASSERT_LOCKED();
> +
>       if (pr->ps_single != NULL && pr->ps_single != p) {
>               do {
>                       int s;
> @@ -1957,14 +1959,12 @@ single_thread_check(struct proc *p, int 
>                                       return (EINTR);
>                       }
>  
> -                     SCHED_LOCK(s);
> -                     if (pr->ps_single == NULL) {
> -                             SCHED_UNLOCK(s);
> +                     if (pr->ps_single == NULL)
>                               continue;
> -                     }
>  
>                       if (atomic_dec_int_nv(&pr->ps_singlecount) == 0)
>                               wakeup(&pr->ps_singlecount);
> +
>                       if (pr->ps_flags & PS_SINGLEEXIT) {
>                               SCHED_UNLOCK(s);
>                               KERNEL_LOCK();
> @@ -1975,13 +1975,24 @@ single_thread_check(struct proc *p, int 
>                       /* not exiting and don't need to unwind, so suspend */
>                       p->p_stat = SSTOP;
>                       mi_switch();
> -                     SCHED_UNLOCK(s);
>               } while (pr->ps_single != NULL);
>       }
>  
>       return (0);
>  }
>  
> +int
> +single_thread_check(struct proc *p, int deep)
> +{
> +     int s, error;
> +
> +     SCHED_LOCK(s);
> +     error = single_thread_check_locked(p, deep);
> +     SCHED_UNLOCK(s);
> +
> +     return error;
> +}
> +
>  /*
>   * Stop other threads in the process.  The mode controls how and
>   * where the other threads should stop:
> @@ -2003,8 +2014,12 @@ single_thread_set(struct proc *p, enum s
>       KERNEL_ASSERT_LOCKED();
>       KASSERT(curproc == p);
>  
> -     if ((error = single_thread_check(p, deep)))
> +     SCHED_LOCK(s);
> +     error = single_thread_check_locked(p, deep);
> +     if (error) {
> +             SCHED_UNLOCK(s);
>               return error;
> +     }
>  
>       switch (mode) {
>       case SINGLE_SUSPEND:
> @@ -2022,7 +2037,6 @@ single_thread_set(struct proc *p, enum s
>               panic("single_thread_mode = %d", mode);
>  #endif
>       }
> -     SCHED_LOCK(s);
>       pr->ps_singlecount = 0;
>       membar_producer();
>       pr->ps_single = p;
> 

OK claudio@ for this

-- 
:wq Claudio

Reply via email to