: > On Fri, 15 Mar 2019 14:26:10 -0700 > Eric Dumazet <eric.duma...@gmail.com> wrote: > >> On 03/15/2019 02:08 PM, Stefano Brivio wrote: >>> On Fri, 15 Mar 2019 11:56:01 -0700 >>> Eric Dumazet <eric.duma...@gmail.com> wrote: >>> >>>> On 03/15/2019 11:02 AM, David Miller wrote: >>>>> From: Eric Dumazet <eric.duma...@gmail.com> >>>>> Date: Fri, 15 Mar 2019 09:06:25 -0700 >>>>> >>>>>> >>>>>> >>>>>> On 03/15/2019 08:28 AM, Stefano Brivio wrote: >>>>>>> On Fri, 15 Mar 2019 23:18:52 +0800 >>>>>>> Zhiqiang Liu <liuzhiqian...@huawei.com> wrote: >>>>>>> >>>>>>>> In vxlan_destroy_tunnels func, unregister_netdevice_queue is called >>>>>>>> after >>>>>>>> gro_cells_destroy func. However, in unregister_netdevice_queue func, >>>>>>>> the >>>>>>>> gro_cells_destroy func will also call the gro_cells_destroy func as the >>>>>>>> following routine: >>>>>>>> unregister_netdevice_many() -> rollback_registered_many() >>>>>>>> -> ndo_uninit() -> gro_cells_destroy() >>>>>>>> >>>>>>> >>>>>>> NACK, please read my and Eric's comments to v1 -- giving me more than 23 >>>>>>> minutes to answer would have been a nice touch as well :) >>>>>>> >>>>>> >>>>>> Sorry for the confusion, I forgot to add the question marks to my >>>>>> sentences. >>>>>> >>>>>> In fact, this is a bug fix, that we missed in the previous fix. >>>>>> >>>>>> Technically the bug is older. >>>>> >>>>> Please elaborate. >>>>> >>>> >>>> Commit ad6c9986bcb62 >>>> ("vxlan: Fix GRO cells race condition between receive and link delete") >>>> >>>> fixed a race condition for the typical case a vxlan device is dismantled >>>> from the >>>> current netns. >>>> >>>> But if a netns is dismantled, we call vxlan_destroy_tunnels() >>>> to schedule a unregister_netdevice_queue() of all the vxlan tunnels >>>> that are related to this netns. >>> >>> Won't that happen via ops_exit_list() only after synchronize_rcu() is >>> called by cleanup_net(), though? Is there another path I missed? >> >> Just look at vxlan_destroy_tunnels(). >> >> The call to gro_cells_destroy(&vxlan->gro_cells); >> is done _before_ >> unregister_netdevice_queue(vxlan->dev, head); >> >> So packets can still fly, the RCU grace period has not yet started. > > Wait, what... :/ thanks for pointing that out, I guess it was too > obvious for me to notice. > > Zhiqiang, could you maybe update the commit message with these two bits > of information (the real issue explained by Eric, and the different > Fixes: tag), and post v3? > > This would be an actual fix and not a clean-up, so it doesn't need to > wait for net-next to re-open.
Thanks for your reply. I have updated the commit message as suggested by Eric. Even though I have read Documentation/networking/netdev-FAQ.rst as you mentioned. I am now still a little confused about the subject-prefix of v3 (net or net-next). And David Miller saied the net-next tree is CLOSED. Could you help me check whether the following v3 patch is ok? Subject: [PATCH net v3] vxlan: remove the redundant gro_cells_destroy() calling. OR Subject: [PATCH net-next v3] vxlan: remove the redundant gro_cells_destroy() calling. Commit ad6c9986bcb62 ("vxlan: Fix GRO cells race condition between receive and link delete") fixed a race condition for the typical case a vxlan device is dismantled from the current netns. But if a netns is dismantled, vxlan_destroy_tunnels() is called to schedule a unregister_netdevice_queue() of all the vxlan tunnels that are related to this netns. In vxlan_destroy_tunnels(), gro_cells_destroy() is called and finished before unregister_netdevice_queue(). This means that the gro_cells_destroy() call is done too soon, for the same reasons explained in above commit. So we need to fully respect the RCU rules, and thus must remove the gro_cells_destroy() call or risk use after-free. Fixes: 58ce31cca1ff ("vxlan: GRO support at tunnel layer") Signed-off-by: Suanming.Mou <mousuanm...@huawei.com> Suggested-by: Eric Dumazet <eric.duma...@gmail.com> Reviewed-by: Stefano Brivio <sbri...@redhat.com> Reviewed-by: Zhiqiang Liu <liuzhiqian...@huawei.com> --- V1->V2: - update the commit message suggeted by Eric Dumazet - update Fixes: tag drivers/net/vxlan.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 077f1b9f2761..d76dfed8d9bb 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -4335,10 +4335,8 @@ static void vxlan_destroy_tunnels(struct net *net, struct list_head *head) /* If vxlan->dev is in the same netns, it has already been added * to the list by the previous loop. */ - if (!net_eq(dev_net(vxlan->dev), net)) { - gro_cells_destroy(&vxlan->gro_cells); + if (!net_eq(dev_net(vxlan->dev), net)) unregister_netdevice_queue(vxlan->dev, head); - } } for (h = 0; h < PORT_HASH_SIZE; ++h) -- 1.7.12.4