On 01/30/2019 09:01 PM, Alexei Starovoitov wrote:
> On Wed, Jan 30, 2019 at 04:32:12PM -0500, Waiman Long wrote:
>> On 01/30/2019 04:11 PM, Waiman Long wrote:
>>> On 01/30/2019 03:10 PM, Alexei Starovoitov wrote:
>>>> On Wed, Jan 30, 2019 at 02:42:23PM -0500, Waiman Long wrote:
>>>>> On 01/30/2019 02:30 PM, Alexei Starovoitov wrote:
>>>>>> On Wed, Jan 30, 2019 at 11:15:30AM +0100, Peter Zijlstra wrote:
>>>>>>> On Tue, Jan 29, 2019 at 08:04:56PM -0800, Alexei Starovoitov wrote:
>>>>>>>> Lockdep warns about false positive:
>>>>>>> This is not a false positive, and you probably also need to use
>>>>>>> down_read_non_owner() to match this up_read_non_owner().
>>>>>>>
>>>>>>> {up,down}_read() and {up,down}_read_non_owner() are not only different
>>>>>>> in the lockdep annotation; there is also optimistic spin stuff that
>>>>>>> relies on 'owner' tracking.
>>>>>> Can you point out in the code the spin bit?
>>>>>> As far as I can see sem->owner is debug only feature.
>>>>>> All owner checks are done under CONFIG_DEBUG_RWSEMS.
>>>>> No, sem->owner is mainly for performing optimistic spinning which is a
>>>>> performance feature to make rwsem writer-lock performs similar to mutex.
>>>>> The debugging part is just an add-on. It is not the reason for the
>>>>> presence of sem->owner.
>>>> I see. Got it.
>>>>
>>>>>> Also there is no down_read_trylock_non_owner() at the moment.
>>>>>> We can argue about it for -next, but I'd rather silence lockdep
>>>>>> with this patch today.
>>>>>>
>>>>> We can add down_read_trylock_non_owner() if there is a need for it. It
>>>>> should be easy to do.
>>>> Yes, but looking through the code it's not clear to me that it's safe
>>>> to mix non_owner() versions with regular.
>>>> bpf/stackmap.c does down_read_trylock + up_read.
>>>> If we add new down_read_trylock_non_owner that set the owner to
>>>> NULL | RWSEM_* bits is this safe with conccurent read/write
>>>> that do regular versions?
>>>> rwsem_can_spin_on_owner() does:
>>>>         if (owner) {
>>>>                 ret = is_rwsem_owner_spinnable(owner) &&
>>>>                       owner_on_cpu(owner);
>>>> that looks correct.
>>>> For a second I thought there could be fault here due to non_owner.
>>>> But there could be other places where it's assumed that owner
>>>> is never null?
>>> The content of owner is not the cause of the lockdep warning. The
>>> lockdep code assumes that the task that acquires the lock will release
>>> it some time later. That is not the case when you need to acquire the
>>> lock by one task and released by another. In this case, you have to use
>>> the non_owner version of down/up_read which disable the lockdep
>>> acquire/release tracking. That will be the only difference between the
>>> two set of APIs.
>>>
>>> Cheers,
>>> Longman
>> BTW, you may want to do something like that to make sure that the lock
>> ownership is probably tracked.
>>
>> -Longman
>>
>> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
>> index d43b145..79eef9d 100644
>> --- a/kernel/bpf/stackmap.c
>> +++ b/kernel/bpf/stackmap.c
>> @@ -338,6 +338,13 @@ static void stack_map_get_build_id_offset(struct
>> bpf_stack_
>>         } else {
>>                 work->sem = &current->mm->mmap_sem;
>>                 irq_work_queue(&work->irq_work);
>> +
>> +               /*
>> +                * The irq_work will release the mmap_sem with
>> +                * up_read_non_owner(). The rwsem_release() is called
>> +                * here to release the lock from lockdep's perspective.
>> +                */
>> +               rwsem_release(&current->mm->mmap_sem.dep_map, 1, _RET_IP_);
> are you saying the above is enough coupled with up_read_non_owner?
> Looking at how amdgpu is using this api... I think they just use non_owner
> version when doing things from different task.
> So I don't think pairing non_owner with non_owner is strictly necessary.
> It seems fine to use down_read_trylock() with above rwsem_release() hack
> plus up_read_non_owner().
> Am I missing something?
>
The statement above is to clear the state for the lockdep so that it
knows that the task no longer owns the lock. Without doing that, there
is a possibility of some other kind of incorrect lockdep warning may be
produced because the code will still think the task hold a read-lock on
the mmap_sem. It is also possible no warning will be reported.

The bpf code is kind of special that it acquires the mmap_sem. Later on,
it either releases it itself (non-NMI) or request irq_work to release it
(NMI). So either, you use the _non_owner versions for both acquire and
release or fake the release like the code segment above.

Peter, please chime in if you have other suggestion.

Cheers,
Longman


Reply via email to