On 08/03/21(Mon) 12:37, Claudio Jeker wrote:
> On Mon, Mar 08, 2021 at 12:11:54PM +0100, Martin Pieuchot wrote:
> [...]  
> > This diff targets a specific problem which is to make sure `ps_single'
> > dereferences are coherent if this value is being modified w/o KERNEL_LOCK().
> > It doesn't revisit/clarify the relation between the uses of `ps_single'
> > in ptrace and parking code.  This can, IMHO, be done in a later step.
> 
> It only ensures that ps_single is coherent but not that data read from
> that pointer is coherent.

Yes, that's exactly the point of this diff.

> > > > > @@ -541,10 +543,11 @@ loop:
> > > > >                       proc_finish_wait(q, p);
> > > > >                       return (0);
> > > > >               }
> > > > > +
> > > > > +             st = READ_ONCE(pr->ps_single);
> > > > >               if (pr->ps_flags & PS_TRACED &&
> > > > > -                 (pr->ps_flags & PS_WAITED) == 0 && pr->ps_single &&
> > > > > -                 pr->ps_single->p_stat == SSTOP &&
> > > > > -                 (pr->ps_single->p_flag & P_SUSPSINGLE) == 0) {
> > > > > +                 (pr->ps_flags & PS_WAITED) == 0 && st != NULL &&
> > > > > +                 st->p_stat == SSTOP && (st->p_flag & P_SUSPSINGLE) 
> > > > > == 0) {
> > > > >                       if (single_thread_wait(pr, 0))
> > > > >                               goto loop;
> > > > >  
> > > 
> > > Here you access p_stat and p_flag, as far as I remember p_stat is also
> > > protected by SCHED_LOCK. p_flag is atomic and maybe the check should be
> > > turned. So this decision may not be stable.
> > 
> > This is an incoherency which is fine as long as this code is executed
> > with the KERNEL_LOCK().
> 
> It is not if the signal handling is no longer using the KERNEL_LOCK.

But it is currently ;)

> Then the thread could be in the process of being stopped and the race
> to enter single_thread_wait() could be lost. The KERNEL_LOCK() alone does
> not prevent single_thread_set() from running.

Sure, there's plenty of races in the existing code if the KERNEL)_LOCK()
is removed.  I'm trying to move forward step by step.  I sent a simple diff
that fixes a simple problem.  Are you suggesting I should send a huge diff?

Reply via email to