Hi,

On Wed, 2018-06-06 at 13:25 +0300, Kirill Tkhai wrote:
> Hi, Paolo,
>
> below is couple my thoughts about this.
>
> On 06.06.2018 12:44, Paolo Abeni wrote:
> > On Tue, 2018-06-05 at 18:06 +0200, Paolo Abeni wrote:
> > > On Tue, 2018-06-05 at 08:35 -0700, Tom Herbert wrote:
> > > > Paolo, thanks for looking into this! Can you try replacing
> > > > __skb_dequeue in requeue_rx_msgs with skb_dequeue to see if that is
> > > > the fix.
> > >
> > > Sure, I'll retrigger the test, and report the result here (or directly
> > > a new patch, should the test be succesful)
> >
> > Contrary to my expectations, the suggested change does not fix the
> > issue. I'm still investigating the overall locking schema.
>
> kcm_rcv_strparser()->unreserve_rx_kcm()->requeue_rx_msgs()->__skb_dequeue()
> seems needed to be synchronized with:
>
> kcm_recvmsg()->kcm_wait_data().
>
> Otherwise, requeue_rx_msgs() removes kcm_recvmsg() peeked skb.
>
> The solution could be to take lock_sock(&kcm->sk) in requeue_rx_msgs(), but
> we can't do that since there is already locked another socket (and
> potentially, this may be a reason of deadlock).
>
> The approach you made in initial patch seems good for me to solve this
> problem.
> The only thing I'm not sure is either lock_sock() is needed in kcm_recvmsg()
> after this.

Thank you for the feedback!

I tried a different approach: adding an explicit 'peek' argument to
kcm_wait_data(), and dequeuing the packet there unless the caller
explicitly asks to peek. It solves the issue and looks reasonably clean.

I'll post the patch soon.

Cheers,

Paolo
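
To illustrate the idea described above, here is a minimal userspace sketch. It is not the actual patch and does not use kernel APIs. All names in it (struct msg, struct msg_queue, queue_tail, wait_data, requeue) are hypothetical stand-ins, and a pthread mutex is assumed in place of the receive queue lock. Unlike the real kcm_wait_data(), the sketch never blocks: it returns NULL when the queue is empty. The point it shows is that the reader unlinks the message under the queue lock unless peek is set, so a concurrent requeue can no longer remove a message the reader is still holding.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for an skb. */
struct msg {
	struct msg *next;
	int id;
};

/* Hypothetical stand-in for a receive queue with its own lock. */
struct msg_queue {
	pthread_mutex_t lock;
	struct msg *head;
	struct msg *tail;
};

static void queue_init(struct msg_queue *q)
{
	pthread_mutex_init(&q->lock, NULL);
	q->head = NULL;
	q->tail = NULL;
}

/* Append a message to the tail of the queue, under the queue lock. */
static void queue_tail(struct msg_queue *q, struct msg *m)
{
	assert(m != NULL);
	pthread_mutex_lock(&q->lock);
	m->next = NULL;
	if (q->tail)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
	pthread_mutex_unlock(&q->lock);
}

/*
 * Return the first queued message, or NULL if the queue is empty.
 * With peek == 0 the message is unlinked while the lock is held, so
 * the caller owns it exclusively. With peek != 0 it stays queued.
 */
static struct msg *wait_data(struct msg_queue *q, int peek)
{
	struct msg *m;

	pthread_mutex_lock(&q->lock);
	m = q->head;
	if (m && !peek) {
		q->head = m->next;
		if (!q->head)
			q->tail = NULL;
		m->next = NULL;
	}
	pthread_mutex_unlock(&q->lock);
	return m;
}

/* Move every message still queued on src to dst; return the count. */
static int requeue(struct msg_queue *src, struct msg_queue *dst)
{
	struct msg *m;
	int moved = 0;

	while ((m = wait_data(src, 0)) != NULL) {
		queue_tail(dst, m);
		moved++;
	}
	return moved;
}
```

In this model a reader that calls wait_data(q, 0) already holds the message when requeue() runs, so requeue() only moves what is still on the queue. A peeking reader gets no such guarantee, which is why the dequeue is the default and peeking has to be requested explicitly.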