Jim Newsome <[email protected]> writes:

> On 3/12/21 14:29, Eric W. Biederman wrote:
>> When I looked at this a second time it became apparent that using
>> pid_task twice should actually be faster as it removes a dependent load
>> caused by thread_group_leader, and replaces it by accessing two adjacent
>> pointers in the same cache line.
>> 
>> I know the algorithmic improvement is the main advantage, but removing
>> 60ns or so for a dependent load can't hurt.
>> 
>> Plus I think using the two pid types really makes it clear that one
>> is always a process and the other is always potentially a thread.
>> 
>> /*
>>  * Optimization for waiting on PIDTYPE_PID. No need to iterate through child
>>  * and tracee lists to find the target task.
>>  */
>> static int do_wait_pid(struct wait_opts *wo)
>> {
>>      bool ptrace;
>>      struct task_struct *target;
>>      int retval;
>> 
>>      ptrace = false;
>>      target = pid_task(wo->wo_pid, PIDTYPE_TGID);
>>      if (target && is_effectively_child(wo, ptrace, target)) {
>>              retval = wait_consider_task(wo, ptrace, target);
>>              if (retval)
>>                      return retval;
>>      }
>> 
>>      ptrace = true;
>>      target = pid_task(wo->wo_pid, PIDTYPE_PID);
>>      if (target && target->ptrace &&
>>             is_effectively_child(wo, ptrace, target)) {
>>              retval = wait_consider_task(wo, ptrace, target);
>>              if (retval)
>>                      return retval;
>>      }
>> 
>>      return 0;
>> }
>
> I'm fine with either way.
>
> Part of what made my earlier version with the double-lookup a bit
> awkward was only doing the second lookup if the first lookup failed. I'm
> happy to take your word though that making the second lookup conditional
> is unnecessary or even detrimental :).

Oh absolutely.  The two lookups are independent.

> It did cross my mind that it
> might not be a very consistent branch for a branch-predictor, but I also
> figured pid_task's synchronization might outweigh that.

pid_task has a lot of verbiage but it is only reading a pointer,
verifying the pointer is not NULL and calling container_of on the result
of the pointer read.

Eric

Reply via email to