Jim Newsome <[email protected]> writes: > On 3/12/21 14:29, Eric W. Biederman wrote: >> When I looked at this a second time it became apparent that using >> pid_task twice should actually be faster as it removes a dependent load >> caused by thread_group_leader, and replaces it by accessing two adjacent >> pointers in the same cache line. >> >> I know the algorithmic improvement is the main advantage, but removing >> 60ns or so for a dependent load can't hurt. >> >> Plus I think using the two pid types really makes it clear that one >> is always a process and the other is always potentially a thread. >> >> /* >> * Optimization for waiting on PIDTYPE_PID. No need to iterate through child >> * and tracee lists to find the target task. >> */ >> static int do_wait_pid(struct wait_opts *wo) >> { >> bool ptrace; >> struct task_struct *target; >> int retval; >> >> ptrace = false; >> target = pid_task(wo->wo_pid, PIDTYPE_TGID); >> if (target && is_effectively_child(wo, ptrace, target)) { >> retval = wait_consider_task(wo, ptrace, target); >> if (retval) >> return retval; >> } >> >> ptrace = true; >> target = pid_task(wo->wo_pid, PIDTYPE_PID); >> if (target && target->ptrace && >> is_effectively_child(wo, ptrace, target)) { >> retval = wait_consider_task(wo, ptrace, target); >> if (retval) >> return retval; >> } >> >> return 0; >> } > > I'm fine with either way. > > Part of what made my earlier version with the double-lookup a bit > awkward was only doing the second lookup if the first lookup failed. I'm > happy to take your word though that making the second lookup conditional > is unnecessary or even detrimental :).
Oh absolutely. The two lookups are independent. > It did cross my mind that it > might not be a very consistent branch for a branch-predictor, but I also > figured pid_task's synchronization might outweigh that. pid_task has a lot of verbiage but it is only reading a pointer, verifying the pointer is not NULL and calling container_of on the result of the pointer read. Eric

