On 08/03/21(Mon) 12:37, Claudio Jeker wrote:
> On Mon, Mar 08, 2021 at 12:11:54PM +0100, Martin Pieuchot wrote:
> [...]
> > This diff targets a specific problem which is to make sure `ps_single'
> > dereferences are coherent if this value is being modified w/o KERNEL_LOCK().
> > It doesn't revisit/clarify the relation between the uses of `ps_single'
> > in ptrace and parking code. This can, IMHO, be done in a later step.
>
> It only ensures that ps_single is coherent but not that data read from
> that pointer is coherent.
Yes, that's exactly the point of this diff.
> > > > > @@ -541,10 +543,11 @@ loop:
> > > > > proc_finish_wait(q, p);
> > > > > return (0);
> > > > > }
> > > > > +
> > > > > + st = READ_ONCE(pr->ps_single);
> > > > > if (pr->ps_flags & PS_TRACED &&
> > > > > - (pr->ps_flags & PS_WAITED) == 0 && pr->ps_single &&
> > > > > - pr->ps_single->p_stat == SSTOP &&
> > > > > - (pr->ps_single->p_flag & P_SUSPSINGLE) == 0) {
> > > > > + (pr->ps_flags & PS_WAITED) == 0 && st != NULL &&
> > > > > + st->p_stat == SSTOP && (st->p_flag & P_SUSPSINGLE)
> > > > > == 0) {
> > > > > if (single_thread_wait(pr, 0))
> > > > > goto loop;
> > > > >
> > >
> > > Here you access p_stat and p_flag, as far as I remember p_stat is also
> > > protected by SCHED_LOCK. p_flag is atomic and maybe the check should be
> > > turned. So this decision may not be stable.
> >
> > This is an incoherency which is fine as long as this code is executed
> > with the KERNEL_LOCK().
>
> It is not if the signal handling is no longer using the KERNEL_LOCK.
But it is currently ;)
> Then the thread could be in the process of being stopped and the race
> to enter single_thread_wait() could be lost. The KERNEL_LOCK() alone does
> not prevent single_thread_set() from running.
Sure, there's plenty of races in the existing code if the KERNEL)_LOCK()
is removed. I'm trying to move forward step by step. I sent a simple diff
that fixes a simple problem. Are you suggesting I should send a huge diff?