On Mon, Mar 08, 2021 at 12:11:54PM +0100, Martin Pieuchot wrote: > On 08/03/21(Mon) 11:57, Claudio Jeker wrote: > > On Mon, Mar 08, 2021 at 11:06:44AM +0100, Martin Pieuchot wrote: > > > On 05/03/21(Fri) 11:30, Martin Pieuchot wrote: > > > > On 04/03/21(Thu) 11:45, Mark Kettenis wrote: > > > > > > Date: Thu, 4 Mar 2021 11:19:23 +0100 > > > > > > From: Martin Pieuchot <m...@openbsd.org> > > > > > > > > > > > > On 04/03/21(Thu) 11:01, Mark Kettenis wrote: > > > > > > > > Date: Thu, 4 Mar 2021 10:54:48 +0100 > > > > > > > > From: Patrick Wildt <patr...@blueri.se> > > > > > > > > > > > > > > > > Am Thu, Mar 04, 2021 at 10:42:24AM +0100 schrieb Mark Kettenis: > > > > > > > > > > Date: Thu, 4 Mar 2021 10:34:24 +0100 > > > > > > > > > > From: Martin Pieuchot <m...@openbsd.org> > > > > > > > > > > > > > > > > > > > > Running t/rw/msleep(9) w/o KERNEL_LOCK() implies that a > > > > > > > > > > thread can > > > > > > > > > > change the value of `ps_single' while one of its siblings > > > > > > > > > > might be > > > > > > > > > > dereferencing it. > > > > > > > > > > > > > > > > > > > > To prevent inconsistencies in the code executed by sibling > > > > > > > > > > thread, the > > > > > > > > > > diff below makes sure `ps_single' is dereferenced only once > > > > > > > > > > in various > > > > > > > > > > parts of the kernel. > > > > > > > > > > > > > > > > > > > > ok? > > > > > > > > > > > > > > > > > > I think that means that ps_single has to be declared > > > > > > > > > "volatile". > > > > > > > > > > > > > > > > Isn't there the READ_ONCE(x) macro, that does exactly that? > > > > > > > > > > > > > > Not a big fan of READ_ONCE() and WRITE_ONCE(), but apparently > > > > > > > those > > > > > > > are needed to comply with the alpha memory model. At least in > > > > > > > some > > > > > > > cases... > > > > > > > > > > > > Updated diff using READ_ONCE(), ok? > > > > > > > > > > If you use READ_ONCE() you shoul also use WRITE_ONCE() everywhere > > > > > where you modify ps_single isn't it? > > > > > > > > I don't know, I'm learning how to do it. I'd appreciate if somebody > > > > could > > > > come with a READ_ONCE(9) manual explaining how this API should be used. > > > > > > > > Updated diff including the WRITE_ONCE(). > > > > > > Any ok? > > > > The one thing that bothers me is that we decided that ps_single needs the > > SCHED_LOCK but now this becomes a bit of a mishmash. > > I hear what you're saying. > > I'm currently concentrating on moving cursig() out of the KERNEL_LOCK() > and I'd appreciate not be blocked on discussions on which locking/lock-free > solution is the best for making the parking code mp-safe.
The problem is that you may introduce hard to debug issues in the parking code. Been there done that. > This diff targets a specific problem which is to make sure `ps_single' > dereferences are coherent if this value is being modified w/o KERNEL_LOCK(). > It doesn't revisit/clarify the relation between the uses of `ps_single' > in ptrace and parking code. This can, IMHO, be done in a later step. It only ensures that ps_single is coherent but not that data read from that pointer is coherent. > > > > @@ -510,7 +512,7 @@ dowait4(struct proc *q, pid_t pid, int * > > > > { > > > > int nfound; > > > > struct process *pr; > > > > - struct proc *p; > > > > + struct proc *p, *st; > > > > int error; > > > > > > > > if (pid == 0) > > > > @@ -541,10 +543,11 @@ loop: > > > > proc_finish_wait(q, p); > > > > return (0); > > > > } > > > > + > > > > + st = READ_ONCE(pr->ps_single); > > > > if (pr->ps_flags & PS_TRACED && > > > > - (pr->ps_flags & PS_WAITED) == 0 && pr->ps_single && > > > > - pr->ps_single->p_stat == SSTOP && > > > > - (pr->ps_single->p_flag & P_SUSPSINGLE) == 0) { > > > > + (pr->ps_flags & PS_WAITED) == 0 && st != NULL && > > > > + st->p_stat == SSTOP && (st->p_flag & P_SUSPSINGLE) > > > > == 0) { > > > > if (single_thread_wait(pr, 0)) > > > > goto loop; > > > > > > > > Here you access p_stat and p_flag, as far as I remember p_stat is also > > protected by SCHED_LOCK. p_flag is atomic and maybe the check should be > > turned. So this decision may not be stable. > > This is an incoherency which is fine as long as this code is executed > with the KERNEL_LOCK(). It is not if the signal handling is no longer using the KERNEL_LOCK. Then the thread could be in the process of being stopped and the race to enter single_thread_wait() could be lost. The KERNEL_LOCK() alone does not prevent single_thread_set() from running. > > > > Index: kern/sys_process.c > > > > =================================================================== > > > > RCS file: /cvs/src/sys/kern/sys_process.c,v > > > > retrieving revision 1.86 > > > > diff -u -p -r1.86 sys_process.c > > > > --- kern/sys_process.c 8 Feb 2021 10:51:02 -0000 1.86 > > > > +++ kern/sys_process.c 5 Mar 2021 10:28:06 -0000 > > > > @@ -273,7 +273,7 @@ sys_ptrace(struct proc *p, void *v, regi > > > > int > > > > ptrace_ctrl(struct proc *p, int req, pid_t pid, caddr_t addr, int data) > > > > { > > > > - struct proc *t; /* target thread */ > > > > + struct proc *st, *t; /* target thread */ > > > > struct process *tr; /* target process */ > > > > int error = 0; > > > > int s; > > > > @@ -433,8 +433,9 @@ ptrace_ctrl(struct proc *p, int req, pid > > > > * from where it stopped." > > > > */ > > > > > > > > - if (pid < THREAD_PID_OFFSET && tr->ps_single) > > > > - t = tr->ps_single; > > > > + st = READ_ONCE(tr->ps_single); > > > > + if (pid < THREAD_PID_OFFSET && st != NULL) > > > > + t = st; > > > > > > > > /* If the address parameter is not (int *)1, set the > > > > pc. */ > > > > if ((int *)addr != (int *)1) > > > > @@ -464,8 +465,9 @@ ptrace_ctrl(struct proc *p, int req, pid > > > > * from where it stopped." > > > > */ > > > > > > > > - if (pid < THREAD_PID_OFFSET && tr->ps_single) > > > > - t = tr->ps_single; > > > > + st = READ_ONCE(tr->ps_single); > > > > + if (pid < THREAD_PID_OFFSET && st != NULL) > > > > + t = st; > > > > > > > > #ifdef PT_STEP > > > > /* > > > > @@ -495,8 +497,9 @@ ptrace_ctrl(struct proc *p, int req, pid > > > > break; > > > > > > > > case PT_KILL: > > > > - if (pid < THREAD_PID_OFFSET && tr->ps_single) > > > > - t = tr->ps_single; > > > > + st = READ_ONCE(tr->ps_single); > > > > + if (pid < THREAD_PID_OFFSET && st != NULL) > > > > + t = st; > > > > > > > > /* just send the process a KILL signal. */ > > > > data = SIGKILL; > > > > @@ -536,6 +539,7 @@ int > > > > ptrace_kstate(struct proc *p, int req, pid_t pid, void *addr) > > > > { > > > > struct process *tr; /* target process */ > > > > + struct proc *st; > > > > struct ptrace_event *pe = addr; > > > > int error; > > > > > > > > @@ -582,9 +586,9 @@ ptrace_kstate(struct proc *p, int req, p > > > > tr->ps_ptmask = pe->pe_set_event; > > > > break; > > > > case PT_GET_PROCESS_STATE: > > > > - if (tr->ps_single) > > > > - tr->ps_ptstat->pe_tid = > > > > - tr->ps_single->p_tid + THREAD_PID_OFFSET; > > > > + st = READ_ONCE(tr->ps_single); > > > > + if (st != NULL) > > > > + tr->ps_ptstat->pe_tid = st->p_tid + > > > > THREAD_PID_OFFSET; > > > > memcpy(addr, tr->ps_ptstat, sizeof *tr->ps_ptstat); > > > > break; > > > > default: > > > > > > > > > > > Especially these ptrace bits make me a bit uneasy but I did not understand > > the ptrace code well enough. > -- :wq Claudio