Hi, Paolo,

below are a couple of my thoughts about this.

On 06.06.2018 12:44, Paolo Abeni wrote:
> On Tue, 2018-06-05 at 18:06 +0200, Paolo Abeni wrote:
>> On Tue, 2018-06-05 at 08:35 -0700, Tom Herbert wrote:
>>> Paolo, thanks for looking into this! Can you try replacing
>>> __skb_dequeue in requeue_rx_msgs with skb_dequeue to see if that is
>>> the fix.
>>
>> Sure, I'll retrigger the test, and report the result here (or directly
>> a new patch, should the test be succesful)
> 
> Contrary to my expectations, the suggested change does not fix the
> issue. I'm still investigating the overall locking schema.

kcm_rcv_strparser()->unreserve_rx_kcm()->requeue_rx_msgs()->__skb_dequeue()

seems to need synchronization with:

kcm_recvmsg()->kcm_wait_data().

Otherwise, requeue_rx_msgs() can remove the skb that kcm_recvmsg() has peeked.
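
To make the interleaving concrete, here is a minimal userspace sketch of it.
It is not kernel code: the model_* types and helpers are hypothetical
stand-ins for sk_buff, sk_buff_head, skb_peek() and __skb_dequeue(), and the
two "paths" simply run one after another instead of on separate CPUs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for sk_buff / sk_buff_head. */
struct model_skb {
	struct model_skb *next;
	int id;
};

struct model_queue {
	struct model_skb *head;
	struct model_skb *tail;
};

static void model_enqueue(struct model_queue *q, struct model_skb *skb)
{
	skb->next = NULL;
	if (q->tail)
		q->tail->next = skb;
	else
		q->head = skb;
	q->tail = skb;
}

/* Models a peek: returns the head without unlinking it. */
static struct model_skb *model_peek(const struct model_queue *q)
{
	return q->head;
}

/* Models an unlocked dequeue: unlinks and returns the head. */
static struct model_skb *model_dequeue(struct model_queue *q)
{
	struct model_skb *skb = q->head;

	if (!skb)
		return NULL;
	q->head = skb->next;
	if (!q->head)
		q->tail = NULL;
	skb->next = NULL;
	return skb;
}

/* Models the requeue step: drains src and appends everything to dst. */
static void model_requeue(struct model_queue *src, struct model_queue *dst)
{
	struct model_skb *skb;

	while ((skb = model_dequeue(src)))
		model_enqueue(dst, skb);
}

/* Returns 1 if the peeked skb is still linked on q, 0 otherwise. */
static int model_still_queued(const struct model_queue *q,
			      const struct model_skb *peeked)
{
	for (const struct model_skb *p = q->head; p; p = p->next)
		if (p == peeked)
			return 1;
	return 0;
}

/* Reader peeks, then the requeue path runs before the reader unlinks. */
static int model_race_demo(void)
{
	struct model_queue kcm_a = { NULL, NULL }, kcm_b = { NULL, NULL };
	struct model_skb skb = { NULL, 1 };
	struct model_skb *peeked;

	model_enqueue(&kcm_a, &skb);
	peeked = model_peek(&kcm_a);	/* receive side peeks */
	assert(peeked == &skb);
	model_requeue(&kcm_a, &kcm_b);	/* requeue side moves the skb away */
	return model_still_queued(&kcm_a, peeked);
}
```

model_race_demo() returns 0 here: the reader still holds the pointer, but the
skb is no longer on the queue it was peeked from, so any later unlink by the
reader would operate on the wrong list.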

One solution could be to take lock_sock(&kcm->sk) in requeue_rx_msgs(), but
we can't do that, since another socket is already locked at that point (and
nesting the two locks could potentially lead to a deadlock).
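
A toy sketch of the lock-ordering concern, under stated assumptions: the
toy_lock type is not lock_sock(), and HYPOTHETICAL_PATH is an invented second
path that takes the two locks in the opposite order. I have not checked
whether such a path really exists in kcm; this only shows why nesting is
risky if it does.

```c
#include <assert.h>

/* Toy lock for illustration only; owner 0 means free. */
struct toy_lock {
	int owner;
};

/* Returns 1 if the lock was taken, 0 if somebody already holds it. */
static int toy_trylock(struct toy_lock *l, int who)
{
	if (l->owner != 0)
		return 0;
	l->owner = who;
	return 1;
}

/* Returns 1 when each path holds one lock and is refused the other. */
static int toy_abba_stuck(void)
{
	enum { RX_PATH = 1, HYPOTHETICAL_PATH = 2 };
	struct toy_lock other_sock = { 0 }, kcm_sock = { 0 };
	int rx_blocked, other_blocked, ok;

	/* The rx path already holds the other socket's lock. */
	ok = toy_trylock(&other_sock, RX_PATH);
	assert(ok);
	/* The assumed opposite-order path holds the kcm socket's lock. */
	ok = toy_trylock(&kcm_sock, HYPOTHETICAL_PATH);
	assert(ok);

	/* Each side now wants the lock the other side holds. */
	rx_blocked = !toy_trylock(&kcm_sock, RX_PATH);
	other_blocked = !toy_trylock(&other_sock, HYPOTHETICAL_PATH);
	return rx_blocked && other_blocked;
}
```

With blocking locks instead of trylock, the state where toy_abba_stuck()
returns 1 is the classic ABBA deadlock: neither side can make progress.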

The approach you took in the initial patch looks good to me for solving this
problem. The only thing I'm not sure about is whether lock_sock() is still
needed in kcm_recvmsg() after this.

Thanks,
Kirill
