:
> On Fri, 15 Mar 2019 14:26:10 -0700
> Eric Dumazet <[email protected]> wrote:
>
>> On 03/15/2019 02:08 PM, Stefano Brivio wrote:
>>> On Fri, 15 Mar 2019 11:56:01 -0700
>>> Eric Dumazet <[email protected]> wrote:
>>>
>>>> On 03/15/2019 11:02 AM, David Miller wrote:
>>>>> From: Eric Dumazet <[email protected]>
>>>>> Date: Fri, 15 Mar 2019 09:06:25 -0700
>>>>>
>>>>>>
>>>>>>
>>>>>> On 03/15/2019 08:28 AM, Stefano Brivio wrote:
>>>>>>> On Fri, 15 Mar 2019 23:18:52 +0800
>>>>>>> Zhiqiang Liu <[email protected]> wrote:
>>>>>>>
>>>>>>>> In vxlan_destroy_tunnels func, unregister_netdevice_queue is called
>>>>>>>> after
>>>>>>>> gro_cells_destroy func. However, in unregister_netdevice_queue func,
>>>>>>>> the
>>>>>>>> gro_cells_destroy func will also call the gro_cells_destroy func as the
>>>>>>>> following routine:
>>>>>>>> unregister_netdevice_many() -> rollback_registered_many()
>>>>>>>> -> ndo_uninit() -> gro_cells_destroy()
>>>>>>>>
>>>>>>>
>>>>>>> NACK, please read my and Eric's comments to v1 -- giving me more than 23
>>>>>>> minutes to answer would have been a nice touch as well :)
>>>>>>>
>>>>>>
>>>>>> Sorry for the confusion, I forgot to add the question marks to my
>>>>>> sentences.
>>>>>>
>>>>>> In fact, this is a bug fix, that we missed in the previous fix.
>>>>>>
>>>>>> Technically the bug is older.
>>>>>
>>>>> Please elaborate.
>>>>>
>>>>
>>>> Commit ad6c9986bcb62
>>>> ("vxlan: Fix GRO cells race condition between receive and link delete")
>>>>
>>>> fixed a race condition for the typical case a vxlan device is dismantled
>>>> from the
>>>> current netns.
>>>>
>>>> But if a netns is dismantled, we call vxlan_destroy_tunnels()
>>>> to schedule a unregister_netdevice_queue() of all the vxlan tunnels
>>>> that are related to this netns.
>>>
>>> Won't that happen via ops_exit_list() only after synchronize_rcu() is
>>> called by cleanup_net(), though? Is there another path I missed?
>>
>> Just look at vxlan_destroy_tunnels().
>>
>> The call to gro_cells_destroy(&vxlan->gro_cells);
>> is done _before_
>> unregister_netdevice_queue(vxlan->dev, head);
>>
>> So packets can still fly, the RCU grace period has not yet started.
>
> Wait, what... :/ thanks for pointing that out, I guess it was too
> obvious for me to notice.
>
> Zhiqiang, could you maybe update the commit message with these two bits
> of information (the real issue explained by Eric, and the different
> Fixes: tag), and post v3?
>
> This would be an actual fix and not a clean-up, so it doesn't need to
> wait for net-next to re-open.
Thanks for your reply.
I have updated the commit message as suggested by Eric. Even though I have read
Documentation/networking/netdev-FAQ.rst as you mentioned. I am now still a
little
confused about the subject-prefix of v3 (net or net-next). And David Miller
saied
the net-next tree is CLOSED.
Could you help me check whether the following v3 patch is ok?
Subject: [PATCH net v3] vxlan: remove the redundant gro_cells_destroy() calling.
OR
Subject: [PATCH net-next v3] vxlan: remove the redundant gro_cells_destroy()
calling.
Commit ad6c9986bcb62 ("vxlan: Fix GRO cells race condition between
receive and link delete") fixed a race condition for the typical case a vxlan
device is dismantled from the current netns. But if a netns is dismantled,
vxlan_destroy_tunnels() is called to schedule a unregister_netdevice_queue()
of all the vxlan tunnels that are related to this netns.
In vxlan_destroy_tunnels(), gro_cells_destroy() is called and finished before
unregister_netdevice_queue(). This means that the gro_cells_destroy() call is
done too soon, for the same reasons explained in above commit.
So we need to fully respect the RCU rules, and thus must remove the
gro_cells_destroy() call or risk use after-free.
Fixes: 58ce31cca1ff ("vxlan: GRO support at tunnel layer")
Signed-off-by: Suanming.Mou <[email protected]>
Suggested-by: Eric Dumazet <[email protected]>
Reviewed-by: Stefano Brivio <[email protected]>
Reviewed-by: Zhiqiang Liu <[email protected]>
---
V1->V2:
- update the commit message suggeted by Eric Dumazet
- update Fixes: tag
drivers/net/vxlan.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 077f1b9f2761..d76dfed8d9bb 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -4335,10 +4335,8 @@ static void vxlan_destroy_tunnels(struct net *net,
struct list_head *head)
/* If vxlan->dev is in the same netns, it has already been added
* to the list by the previous loop.
*/
- if (!net_eq(dev_net(vxlan->dev), net)) {
- gro_cells_destroy(&vxlan->gro_cells);
+ if (!net_eq(dev_net(vxlan->dev), net))
unregister_netdevice_queue(vxlan->dev, head);
- }
}
for (h = 0; h < PORT_HASH_SIZE; ++h)
--
1.7.12.4