:
> On Fri, 15 Mar 2019 14:26:10 -0700
> Eric Dumazet <eric.duma...@gmail.com> wrote:
> 
>> On 03/15/2019 02:08 PM, Stefano Brivio wrote:
>>> On Fri, 15 Mar 2019 11:56:01 -0700
>>> Eric Dumazet <eric.duma...@gmail.com> wrote:
>>>   
>>>> On 03/15/2019 11:02 AM, David Miller wrote:  
>>>>> From: Eric Dumazet <eric.duma...@gmail.com>
>>>>> Date: Fri, 15 Mar 2019 09:06:25 -0700
>>>>>     
>>>>>>
>>>>>>
>>>>>> On 03/15/2019 08:28 AM, Stefano Brivio wrote:    
>>>>>>> On Fri, 15 Mar 2019 23:18:52 +0800
>>>>>>> Zhiqiang Liu <liuzhiqian...@huawei.com> wrote:
>>>>>>>    
>>>>>>>> In vxlan_destroy_tunnels func, unregister_netdevice_queue is called 
>>>>>>>> after
>>>>>>>> gro_cells_destroy func. However, in unregister_netdevice_queue func, 
>>>>>>>> the
>>>>>>>> gro_cells_destroy func will also call the gro_cells_destroy func as the
>>>>>>>> following routine:
>>>>>>>> unregister_netdevice_many() -> rollback_registered_many()    
>>>>>>>> -> ndo_uninit() -> gro_cells_destroy()      
>>>>>>>> 
>>>>>>>
>>>>>>> NACK, please read my and Eric's comments to v1 -- giving me more than 23
>>>>>>> minutes to answer would have been a nice touch as well :)
>>>>>>>    
>>>>>>
>>>>>> Sorry for the confusion, I forgot to add the question marks to my 
>>>>>> sentences.
>>>>>>
>>>>>> In fact, this is a bug fix, that we missed in the previous fix.
>>>>>>
>>>>>> Technically the bug is older.    
>>>>>
>>>>> Please elaborate.
>>>>>     
>>>>
>>>> Commit ad6c9986bcb62
>>>> ("vxlan: Fix GRO cells race condition between receive and link delete")
>>>>
>>>> fixed a race condition for the typical case a vxlan device is dismantled 
>>>> from the
>>>> current netns.
>>>>
>>>> But if a netns is dismantled, we call vxlan_destroy_tunnels()
>>>> to schedule a unregister_netdevice_queue() of all the vxlan tunnels
>>>> that are related to this netns.  
>>>
>>> Won't that happen via ops_exit_list() only after synchronize_rcu() is
>>> called by cleanup_net(), though? Is there another path I missed?  
>>
>> Just look at vxlan_destroy_tunnels().
>>
>> The call to gro_cells_destroy(&vxlan->gro_cells);
>> is done _before_ 
>> unregister_netdevice_queue(vxlan->dev, head);
>>
>> So packets can still fly, the RCU grace period has not yet started.
> 
> Wait, what... :/ thanks for pointing that out, I guess it was too
> obvious for me to notice.
> 
> Zhiqiang, could you maybe update the commit message with these two bits
> of information (the real issue explained by Eric, and the different
> Fixes: tag), and post v3?
> 
> This would be an actual fix and not a clean-up, so it doesn't need to
> wait for net-next to re-open.

Thanks for your reply.
I have updated the commit message as suggested by Eric. Even though I have read
Documentation/networking/netdev-FAQ.rst as you mentioned. I am now still a 
little
confused about the subject-prefix of v3 (net or net-next). And David Miller 
saied
the net-next tree is CLOSED.
Could you help me check whether the following v3 patch is ok?


Subject: [PATCH net v3] vxlan: remove the redundant gro_cells_destroy() calling.
OR
Subject: [PATCH net-next v3] vxlan: remove the redundant gro_cells_destroy() 
calling.

Commit ad6c9986bcb62 ("vxlan: Fix GRO cells race condition between
receive and link delete") fixed a race condition for the typical case a vxlan
device is dismantled from the current netns. But if a netns is dismantled,
vxlan_destroy_tunnels() is called to schedule a unregister_netdevice_queue()
of all the vxlan tunnels that are related to this netns.

In vxlan_destroy_tunnels(), gro_cells_destroy() is called and finished before
unregister_netdevice_queue(). This means that the gro_cells_destroy() call is
done too soon, for the same reasons explained in above commit.

So we need to fully respect the RCU rules, and thus must remove the
gro_cells_destroy() call or risk use after-free.

Fixes: 58ce31cca1ff ("vxlan: GRO support at tunnel layer")
Signed-off-by: Suanming.Mou <mousuanm...@huawei.com>
Suggested-by: Eric Dumazet <eric.duma...@gmail.com>
Reviewed-by: Stefano Brivio <sbri...@redhat.com>
Reviewed-by: Zhiqiang Liu <liuzhiqian...@huawei.com>
---
V1->V2:
        - update the commit message suggeted by Eric Dumazet
        - update Fixes: tag

 drivers/net/vxlan.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 077f1b9f2761..d76dfed8d9bb 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -4335,10 +4335,8 @@ static void vxlan_destroy_tunnels(struct net *net, 
struct list_head *head)
                /* If vxlan->dev is in the same netns, it has already been added
                 * to the list by the previous loop.
                 */
-               if (!net_eq(dev_net(vxlan->dev), net)) {
-                       gro_cells_destroy(&vxlan->gro_cells);
+               if (!net_eq(dev_net(vxlan->dev), net))
                        unregister_netdevice_queue(vxlan->dev, head);
-               }
        }

        for (h = 0; h < PORT_HASH_SIZE; ++h)
-- 
1.7.12.4



Reply via email to