On 4/1/26 01:18, Kuniyuki Iwashima wrote:
> On Tue, Mar 31, 2026 at 3:44 PM Michal Luczaj <[email protected]> wrote:
>>
>> On 3/31/26 01:27, Kuniyuki Iwashima wrote:
>>> On Mon, Mar 30, 2026 at 4:04 PM Michal Luczaj <[email protected]> wrote:
>>>> On 3/26/26 07:26, Martin KaFai Lau wrote:
>>>>> On 3/15/26 4:58 PM, Michal Luczaj wrote:
>>>>>>> Beside, from looking at the may_update_sockmap(), I don't know if it is
>>>>>>> even doable (or useful) to bpf_map_update_elem(unix_sk) in
>>>>>>> tc/flow_dissector/xdp. One possible path is the SOCK_FILTER when looking
>>>>>>> at unix_dgram_sendmsg() => sk_filter(). It was not the original use case
>>>>>>> when the bpf_map_update_elem(sockmap) support was added iirc.
>>>>>>
>>>>>> What about a situation when unix_sk is stored in a sockmap, then tc prog
>>>>>> looks it up and invokes bpf_map_update_elem(unix_sk)? I'm not sure it's
>>>>>> useful, but seems doable.
>>>>>
>>>>> [ Sorry for the late reply ]
>>>>>
>>>>> It is a bummer that the bpf_map_update_elem(unix_sk) path is possible
>>>>> from tc :(
>>>>>
>>>>> Then unix_state_lock() in its current form cannot be safely acquired in
>>>>> sock_map_update_elem(). It is currently a spin_lock() instead of
>>>>> spin_lock_bh().
>>>>
>>>> Is there a specific deadlock you have in your mind?
>>>
>>> lockdep would complain if we used the same lock from
>>> different contexts.
>>>
>>> e.g.)
>>> Process context holding unix_state_lock() with the normal spin_lock()
>>> -> BH interrupt
>>> -> tc prog trying to hold the same lock with spin_lock(). (_bh())
>>> -> deadlock
>>
>> OK, I get it, thanks.
>>
>> So here's one more idea: the null-ptr-deref issue is connect() racing
>> against sock_map_update_elem_*SYS*() coming from user-space, not the
>> can-be-called-from-BH sock_map_update_elem() variant. So can't we assume
>> that for any sock_map_update_elem(unix_sk) invoked by a tc prog, unix_sk
>> will always be "stable", i.e. in a state that cannot lead to that
>> null-ptr-deref?
>>
>> IOW, if for a tc prog the only way to get hold of unix_sk is look it up in
>> a sockmap, then (by the fact that unix_sk _is_ in the sockmap) unix_sk will
>> be already safe to use by sock_map_update_elem() without taking the af_unix
>> state lock.
>>
>> Long story short: take the unix state lock in sock_map_update_elem_sys(),
>> don't bother in sock_map_update_elem()?
> 
> but it will prevents bpf iter from taking unix_state_lock().

How come? bpf iter can take unix_state_lock() and the prog is free to
invoke sock_map_update_elem(). It's the _sys variant that would be taking
unix_state_lock() by itself. Or is there some other locking complexity I'm
missing?

> I don't see a good reason to introduce a new locking rule by unnecessarily
> wrapping the entire sockmap update with unix_state_lock() even though
> the root bug can be avoided by a simple null check in a leaf function.

Yeah, I get that point. I just assumed the root bug was indeed sockmap
update missing a lock.


Reply via email to