[PATCH net v3] net: fix race between napi kthread mode and busy poll

2021-03-02 Thread Wei Wang
y: Martin Zaharinov Suggested-by: Jakub Kicinski Signed-off-by: Wei Wang Cc: Alexander Duyck Cc: Eric Dumazet Cc: Paolo Abeni Cc: Hannes Frederic Sowa --- include/linux/netdevice.h | 2 ++ net/core/dev.c| 14 +- 2 files changed, 15 insertions(+), 1 deletion(-) d

Re: [PATCH v3 1/2] net: add support for threaded NAPI polling

2020-08-25 Thread Wei Wang
On Fri, Aug 21, 2020 at 12:03 PM Felix Fietkau wrote: > > For some drivers (especially 802.11 drivers), doing a lot of work in the NAPI > poll function does not perform well. Since NAPI poll is bound to the CPU it > was scheduled from, we can easily end up with a few very busy CPUs spending > most

Re: [PATCH v2] net: add support for threaded NAPI polling

2020-08-06 Thread Wei Wang
Nice patch! One question inline. On Thu, Aug 6, 2020 at 2:58 AM Felix Fietkau wrote: > > For some drivers (especially 802.11 drivers), doing a lot of work in the NAPI > poll function does not perform well. Since NAPI poll is bound to the CPU it > was scheduled from, we can easily end up with a fe

[PATCH net] ipv4: fix race condition between route lookup and invalidation

2019-10-16 Thread Wei Wang
ed list, so user could still be able to use it to receive packets until it's done. Fixes: 95c47f9cf5e0 ("ipv4: call dst_dev_put() properly") Signed-off-by: Wei Wang Reported-by: Ido Schimmel Reported-by: Jesse Hathaway Tested-by: Jesse Hathaway Acked-by: Martin KaFai Lau C

Re: Race condition in route lookup

2019-10-16 Thread Wei Wang
On Tue, Oct 15, 2019 at 11:39 PM Martin Lau wrote: > > On Tue, Oct 15, 2019 at 09:44:11AM -0700, Wei Wang wrote: > > On Tue, Oct 15, 2019 at 7:29 AM Jesse Hathaway > > wrote: > > > > > > On Fri, Oct 11, 2019 at 12:54 PM Wei Wang wrote: > > > > H

Re: Race condition in route lookup

2019-10-15 Thread Wei Wang
On Tue, Oct 15, 2019 at 7:29 AM Jesse Hathaway wrote: > > On Fri, Oct 11, 2019 at 12:54 PM Wei Wang wrote: > > Hmm... Yes... I would think a per-CPU input cache should work for the > > case above. > > Another idea is: instead of calling dst_dev_put() in rt_cache_route() &

Re: Race condition in route lookup

2019-10-15 Thread Wei Wang
On Tue, Oct 15, 2019 at 7:45 AM David Ahern wrote: > > On 10/14/19 1:26 PM, Martin Lau wrote: > > > > AFAICT, even for the route that are affected by > > fib6_update_sernum_upto_root(), > > I don't see the RTF_PCPU route is re-created. v6 sk does > > dst_check() => re-lookup the fib6 => > > foun

Re: Race condition in route lookup

2019-10-13 Thread Wei Wang
On Fri, Oct 11, 2019 at 11:56 PM Martin Lau wrote: > > On Fri, Oct 11, 2019 at 10:54:13AM -0700, Wei Wang wrote: > > On Fri, Oct 11, 2019 at 8:42 AM Ido Schimmel wrote: > > > > > > On Fri, Oct 11, 2019 at 09:36:51AM -0500, Jesse Hathaway wrote: > > >

Re: Race condition in route lookup

2019-10-11 Thread Wei Wang
On Fri, Oct 11, 2019 at 11:25 AM Ido Schimmel wrote: > > On Fri, Oct 11, 2019 at 09:17:42PM +0300, Ido Schimmel wrote: > > On Fri, Oct 11, 2019 at 10:54:13AM -0700, Wei Wang wrote: > > > On Fri, Oct 11, 2019 at 8:42 AM Ido Schimmel wrote: > > > > > > >

Re: Race condition in route lookup

2019-10-11 Thread Wei Wang
On Fri, Oct 11, 2019 at 8:42 AM Ido Schimmel wrote: > > On Fri, Oct 11, 2019 at 09:36:51AM -0500, Jesse Hathaway wrote: > > On Thu, Oct 10, 2019 at 3:31 AM Ido Schimmel wrote: > > > I think it's working as expected. Here is my theory: > > > > > > If CPU0 is executing both the route get request an

Re: [PATCHv2 next] blackhole_netdev: fix syzkaller reported issue

2019-10-10 Thread Wei Wang
On Thu, Oct 10, 2019 at 9:48 AM Mahesh Bandewar wrote: > > While invalidating the dst, we assign blackhole_netdev instead of > loopback device. However, this device does not have idev pointer > and hence no ip6_ptr even if IPv6 is enabled. Possibly this has > triggered the syzbot reported crash. >

Re: [PATCH v2 net-next] ipv6: Convert gateway validation to use fib6_info

2019-06-25 Thread Wei Wang
fib-onlink-tests.sh and fib_tests.sh are used to > verify the changes. > > Signed-off-by: David Ahern Reviewed-by: Wei Wang > --- > v2 > - use in6_dev_get versus __in6_dev_get + in6_dev_hold (comment from Wei) > - updated commit message > > net/ipv6/route.c | 118 >

[PATCH v3 net-next 1/5] ipv6: introduce RT6_LOOKUP_F_DST_NOREF flag in ip6_pol_route()

2019-06-20 Thread Wei Wang
From: Wei Wang This new flag is to instruct the route lookup function to not take refcnt on the dst entry. The user which does route lookup with this flag must properly use rcu protection. ip6_pol_route() is the major route lookup function for both tx and rx path. In this function: Do not take

[PATCH v3 net-next 4/5] ipv6: convert rx data path to not take refcnt on dst

2019-06-20 Thread Wei Wang
From: Wei Wang ip6_route_input() is the key function to do the route lookup in the rx data path. All the callers to this function are already holding rcu lock. So it is fairly easy to convert it to not take refcnt on the dst: We pass in flag RT6_LOOKUP_F_DST_NOREF and do skb_dst_set_noref

[PATCH v3 net-next 2/5] ipv6: initialize rt6->rt6i_uncached in all pre-allocated dst entries

2019-06-20 Thread Wei Wang
From: Wei Wang Initialize rt6->rt6i_uncached on the following pre-allocated dsts: net->ipv6.ip6_null_entry net->ipv6.ip6_prohibit_entry net->ipv6.ip6_blk_hole_entry This is a preparation patch for later commits to be able to distinguish dst entries in uncached list by doing: !li

[PATCH v3 net-next 3/5] ipv6: honor RT6_LOOKUP_F_DST_NOREF in rule lookup logic

2019-06-20 Thread Wei Wang
From: Wei Wang This patch specifically converts the rule lookup logic to honor this flag and not release refcnt when traversing each rule and calling lookup() on each routing table. Similar to previous patch, we also need some special handling of dst entries in uncached list because there is

[PATCH v3 net-next 0/5] ipv6: avoid taking refcnt on dst during route lookup

2019-06-20 Thread Wei Wang
From: Wei Wang IPv6 route lookup code always grabs refcnt on the dst for the caller. But for certain cases, grabbing refcnt is not always necessary if the call path is rcu protected and the caller does not cache the dst. Another issue in the route lookup logic is: When there are multiple custom

[PATCH v3 net-next 5/5] ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF

2019-06-20 Thread Wei Wang
From: Wei Wang For tx path, in most cases, we still have to take refcnt on the dst because the caller is caching the dst somewhere. But it is still beneficial to make use of RT6_LOOKUP_F_DST_NOREF flag while doing the route lookup. This is because this flag prevents manipulating refcnt on net->i

Re: [PATCH net-next] ipv6: Convert gateway validation to use fib6_info

2019-06-20 Thread Wei Wang
On Thu, Jun 20, 2019 at 12:05 PM David Ahern wrote: > > From: David Ahern > > Gateway validation does not need a dst_entry, it only needs the fib > entry to validate the gateway resolution and egress device. So, > convert ip6_nh_lookup_table from ip6_pol_route to fib6_table_lookup > and ip6_route

Re: [PATCH v2 net-next 5/5] ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF

2019-06-19 Thread Wei Wang
On Wed, Jun 19, 2019 at 4:21 PM David Ahern wrote: > > On 6/19/19 4:31 PM, Wei Wang wrote: > > diff --git a/include/net/l3mdev.h b/include/net/l3mdev.h > > index e942372b077b..d8c37317bb86 100644 > > --- a/include/net/l3mdev.h > > +++ b/include/net/l3mdev.h > >

Re: [PATCH v2 net-next 3/5] ipv6: honor RT6_LOOKUP_F_DST_NOREF in rule lookup logic

2019-06-19 Thread Wei Wang
On Wed, Jun 19, 2019 at 4:12 PM David Ahern wrote: > > On 6/19/19 4:31 PM, Wei Wang wrote: > > diff --git a/net/ipv6/fib6_rules.c b/net/ipv6/fib6_rules.c > > index bcfae13409b5..d22b6c140f23 100644 > > --- a/net/ipv6/fib6_rules.c > > +++ b/net/ipv6/fib6_rules.c >

[PATCH v2 net-next 0/5] ipv6: avoid taking refcnt on dst during route lookup

2019-06-19 Thread Wei Wang
From: Wei Wang IPv6 route lookup code always grabs refcnt on the dst for the caller. But for certain cases, grabbing refcnt is not always necessary if the call path is rcu protected and the caller does not cache the dst. Another issue in the route lookup logic is: When there are multiple custom

[PATCH v2 net-next 5/5] ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF

2019-06-19 Thread Wei Wang
From: Wei Wang For tx path, in most cases, we still have to take refcnt on the dst because the caller is caching the dst somewhere. But it is still beneficial to make use of RT6_LOOKUP_F_DST_NOREF flag while doing the route lookup. This is because this flag prevents manipulating refcnt on net->i

[PATCH v2 net-next 1/5] ipv6: introduce RT6_LOOKUP_F_DST_NOREF flag in ip6_pol_route()

2019-06-19 Thread Wei Wang
From: Wei Wang This new flag is to instruct the route lookup function to not take refcnt on the dst entry. The user which does route lookup with this flag must properly use rcu protection. ip6_pol_route() is the major route lookup function for both tx and rx path. In this function: Do not take

[PATCH v2 net-next 3/5] ipv6: honor RT6_LOOKUP_F_DST_NOREF in rule lookup logic

2019-06-19 Thread Wei Wang
From: Wei Wang This patch specifically converts the rule lookup logic to honor this flag and not release refcnt when traversing each rule and calling lookup() on each routing table. Similar to previous patch, we also need some special handling of dst entries in uncached list because there is

[PATCH v2 net-next 2/5] ipv6: initialize rt6->rt6i_uncached in all pre-allocated dst entries

2019-06-19 Thread Wei Wang
From: Wei Wang Initialize rt6->rt6i_uncached on the following pre-allocated dsts: net->ipv6.ip6_null_entry net->ipv6.ip6_prohibit_entry net->ipv6.ip6_blk_hole_entry This is a preparation patch for later commits to be able to distinguish dst entries in uncached list by doing: !li

[PATCH v2 net-next 4/5] ipv6: convert rx data path to not take refcnt on dst

2019-06-19 Thread Wei Wang
From: Wei Wang ip6_route_input() is the key function to do the route lookup in the rx data path. All the callers to this function are already holding rcu lock. So it is fairly easy to convert it to not take refcnt on the dst: We pass in flag RT6_LOOKUP_F_DST_NOREF and do skb_dst_set_noref

Re: [PATCH net-next 3/5] ipv6: honor RT6_LOOKUP_F_DST_NOREF in rule lookup logic

2019-06-19 Thread Wei Wang
On Wed, Jun 19, 2019 at 9:07 AM David Miller wrote: > > From: Wei Wang > Date: Tue, 18 Jun 2019 11:25:41 -0700 > > > @@ -237,13 +240,16 @@ static int __fib6_rule_action(struct fib_rule *rule, > > struct flowi *flp, > > goto out; > >

[PATCH net-next 2/5] ipv6: initialize rt6->rt6i_uncached in all pre-allocated dst entries

2019-06-18 Thread Wei Wang
From: Wei Wang Initialize rt6->rt6i_uncached on the following pre-allocated dsts: net->ipv6.ip6_null_entry net->ipv6.ip6_prohibit_entry net->ipv6.ip6_blk_hole_entry This is a preparation patch for later commits to be able to distinguish dst entries in uncached list by doing: !li

[PATCH net-next 1/5] ipv6: introduce RT6_LOOKUP_F_DST_NOREF flag in ip6_pol_route()

2019-06-18 Thread Wei Wang
From: Wei Wang This new flag is to instruct the route lookup function to not take refcnt on the dst entry. The user which does route lookup with this flag must properly use rcu protection. ip6_pol_route() is the major route lookup function for both tx and rx path. In this function: Do not take

[PATCH net-next 4/5] ipv6: convert rx data path to not take refcnt on dst

2019-06-18 Thread Wei Wang
From: Wei Wang ip6_route_input() is the key function to do the route lookup in the rx data path. All the callers to this function are already holding rcu lock. So it is fairly easy to convert it to not take refcnt on the dst: We pass in flag RT6_LOOKUP_F_DST_NOREF and do skb_dst_set_noref

[PATCH net-next 5/5] ipv6: convert major tx path to use RT6_LOOKUP_F_DST_NOREF

2019-06-18 Thread Wei Wang
From: Wei Wang For tx path, in most cases, we still have to take refcnt on the dst because the caller is caching the dst somewhere. But it is still beneficial to make use of RT6_LOOKUP_F_DST_NOREF flag while doing the route lookup. This is because this flag prevents manipulating refcnt on net->i

[PATCH net-next 3/5] ipv6: honor RT6_LOOKUP_F_DST_NOREF in rule lookup logic

2019-06-18 Thread Wei Wang
From: Wei Wang This patch specifically converts the rule lookup logic to honor this flag and not release refcnt when traversing each rule and calling lookup() on each routing table. Similar to previous patch, we also need some special handling of dst entries in uncached list because there is

[PATCH net-next 0/5] ipv6: avoid taking refcnt on dst during route lookup

2019-06-18 Thread Wei Wang
From: Wei Wang IPv6 route lookup code always grabs refcnt on the dst for the caller. But for certain cases, grabbing refcnt is not always necessary if the call path is rcu protected and the caller does not cache the dst. Another issue in the route lookup logic is: When there are multiple custom

Re: [PATCH v4 net-next 00/20] net: Enable nexthop objects with IPv4 and IPv6 routes

2019-06-09 Thread Wei Wang
| 31 +- > net/ipv6/route.c | 458 > +++-- > .../selftests/net/fib_nexthop_multiprefix.sh | 290 + > .../selftests/net/forwarding/router_mpath_nh.sh| 359 ++++++++ > tools/testing/selftests/net/icmp_redirect.sh | 49 +++ > tools/testing/selftests/net/pmtu.sh| 237 --- > 12 files changed, 1672 insertions(+), 113 deletions(-) > create mode 100755 tools/testing/selftests/net/fib_nexthop_multiprefix.sh > create mode 100755 tools/testing/selftests/net/forwarding/router_mpath_nh.sh > > -- > 2.11.0 > For all ipv6 patches: Reviewed-By: Wei Wang

Re: [PATCH v3 net-next 09/20] ipv6: Handle all fib6_nh in a nexthop in rt6_do_redirect

2019-06-07 Thread Wei Wang
On Fri, Jun 7, 2019 at 4:06 PM David Ahern wrote: > > From: David Ahern > > Use nexthop_for_each_fib6_nh and fib6_nh_find_match to find the > fib6_nh in a nexthop that correlates to the device and gateway > in the rt6_info. > > Signed-off-by: David Ahern > --- > net/ipv6/route.c | 20 ++

Re: [PATCH v2 net-next 07/20] ipv6: Handle all fib6_nh in a nexthop in exception handling

2019-06-07 Thread Wei Wang
On Fri, Jun 7, 2019 at 8:09 AM David Ahern wrote: > > From: David Ahern > > Add a hook in rt6_flush_exceptions, rt6_remove_exception_rt, > rt6_update_exception_stamp_rt, and rt6_age_exceptions to handle > nexthop struct in a fib6_info. > > Signed-off-by: David Ahern > --- > net/ipv6/route.c | 1

Re: [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in a fib6_info

2019-06-04 Thread Wei Wang
On Tue, Jun 4, 2019 at 2:13 PM David Ahern wrote: > > On 6/4/19 3:06 PM, Martin Lau wrote: > > On Tue, Jun 04, 2019 at 02:17:28PM -0600, David Ahern wrote: > >> On 6/3/19 11:29 PM, Martin Lau wrote: > >>> On Mon, Jun 03, 2019 at 07:36:06PM -0600, David Ahern wrote: > On 6/3/19 6:58 PM, Martin

Re: [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in a fib6_info

2019-06-03 Thread Wei Wang
On Mon, Jun 3, 2019 at 4:18 PM David Ahern wrote: > > On 6/3/19 5:05 PM, Wei Wang wrote: > > On Mon, Jun 3, 2019 at 3:35 PM David Ahern wrote: > >> > >> On 6/3/19 3:58 PM, Wei Wang wrote: > >>> Hmm... I am still a bit concerned with the ip6_create_rt

Re: [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in a fib6_info

2019-06-03 Thread Wei Wang
On Mon, Jun 3, 2019 at 3:35 PM David Ahern wrote: > > On 6/3/19 3:58 PM, Wei Wang wrote: > > Hmm... I am still a bit concerned with the ip6_create_rt_rcu() call. > > If we have a blackholed nexthop, the lookup code here always tries to > > create an rt cache entry for ev

Re: [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in a fib6_info

2019-06-03 Thread Wei Wang
On Mon, Jun 3, 2019 at 1:42 PM David Ahern wrote: > > On 6/3/19 12:09 PM, Wei Wang wrote: > >> diff --git a/net/ipv6/route.c b/net/ipv6/route.c > >> index fada5a13bcb2..51cb5cb027ae 100644 > >> --- a/net/ipv6/route.c > >> +++ b/net/ipv6/route.c > >

Re: [PATCH v2 net-next 4/7] ipv6: Plumb support for nexthop object in a fib6_info

2019-06-03 Thread Wei Wang
On Sun, Jun 2, 2019 at 9:08 PM David Ahern wrote: > > From: David Ahern > > Add struct nexthop and nh_list list_head to fib6_info. nh_list is the > fib6_info side of the nexthop <-> fib_info relationship. Since a fib6_info > referencing a nexthop object can not have 'sibling' entries (the old way

[PATCH v3 net] ipv6: fix src addr routing with the exception table

2019-05-16 Thread Wei Wang
From: Wei Wang When inserting route cache into the exception table, the key is generated with both src_addr and dest_addr with src addr routing. However, current logic always assumes the src_addr used to generate the key is a /128 host address. This is not true in the following scenarios: 1

Re: [PATCH v2 net] ipv6: fix src addr routing with the exception table

2019-05-16 Thread Wei Wang
On Thu, May 16, 2019 at 12:15 PM Martin Lau wrote: > > On Thu, May 16, 2019 at 11:16:20AM -0700, Wei Wang wrote: > > From: Wei Wang > > > > When inserting route cache into the exception table, the key is > > generated with both src_addr and dest_addr with src addr

[PATCH v2 net] ipv6: fix src addr routing with the exception table

2019-05-16 Thread Wei Wang
From: Wei Wang When inserting route cache into the exception table, the key is generated with both src_addr and dest_addr with src addr routing. However, current logic always assumes the src_addr used to generate the key is a /128 host address. This is not true in the following scenarios: 1

Re: [PATCH net] ipv6: fix src addr routing with the exception table

2019-05-15 Thread Wei Wang
On Wed, May 15, 2019 at 5:07 PM David Ahern wrote: > > On 5/15/19 6:03 PM, Wei Wang wrote: > > Thanks Martin. > > Changing __rt6_find_exception_xxx() might not be easy cause other > > callers of this function does not really need to back off and use > > another sa

Re: [PATCH net] ipv6: prevent possible fib6 leaks

2019-05-15 Thread Wei Wang
> > I decided to add another boolean (fib6_destroying) instead > of reusing/renaming exception_bucket_flushed to ease stable backports, > and properly document the memory barriers used to implement this fix. > > This patch has been co-developed with Wei Wang. > > Fixes: 93531c674315 ("

Re: [PATCH net] ipv6: fix src addr routing with the exception table

2019-05-15 Thread Wei Wang
On Wed, May 15, 2019 at 2:51 PM Martin Lau wrote: > > On Tue, May 14, 2019 at 05:46:10PM -0700, Wei Wang wrote: > > From: Wei Wang > > > > When inserting route cache into the exception table, the key is > > generated with both src_addr and dest_addr with src addr

Re: IPv6 PMTU discovery fails with source-specific routing

2019-05-15 Thread Wei Wang
On Wed, May 15, 2019 at 11:06 AM Martin Lau wrote: > > On Tue, May 14, 2019 at 12:33:25PM -0700, Wei Wang wrote: > > I think the bug is because when creating exceptions, src_addr is not > > always set even though fib6_info is in the subtree. (because of > > rt6_is

Re: [PATCH net] ipv6: fix src addr routing with the exception table

2019-05-15 Thread Wei Wang
From: David Ahern Date: Wed, May 15, 2019 at 10:33 AM To: Wei Wang Cc: Wei Wang, David Miller, Linux Kernel Network Developers, Martin KaFai Lau, Mikael Magnusson, Eric Dumazet > On 5/15/19 11:28 AM, Wei Wang wrote: > > From: Wei Wang > > Date: Wed, May 15, 2019 at 10:25 AM >

Re: [PATCH net] ipv6: fix src addr routing with the exception table

2019-05-15 Thread Wei Wang
From: Wei Wang Date: Wed, May 15, 2019 at 10:25 AM To: David Ahern Cc: Wei Wang, David Miller, Linux Kernel Network Developers, Martin KaFai Lau, Mikael Magnusson, Eric Dumazet > > > > What about rt6_remove_exception_rt? > > > > You can add a 'cache' hook to

Re: [PATCH net] ipv6: fix src addr routing with the exception table

2019-05-15 Thread Wei Wang
) calls rt6_find_cached_rt() to find the cached route first. And rt6_find_cached_rt() is taken care of to find the cached route according to both passed in src addr and f6i->fib6_src. So I think we are good here. From: David Ahern Date: Wed, May 15, 2019 at 9:38 AM To: Wei Wang, David Miller,

[PATCH net] ipv6: fix src addr routing with the exception table

2019-05-14 Thread Wei Wang
From: Wei Wang When inserting route cache into the exception table, the key is generated with both src_addr and dest_addr with src addr routing. However, current logic always assumes the src_addr used to generate the key is a /128 host address. This is not true in the following scenarios: 1

Re: IPv6 PMTU discovery fails with source-specific routing

2019-05-14 Thread Wei Wang
#endif } Why do we need to check that the route is not gateway and has next hop for updating rt6i_src? I checked the git history and it seems this part was there from very early on (with some refactor in between)... From: Stefano Brivio Date: Tue, May 14, 2019 at 7:33 AM To: Mikael Mag

Re: IPv6 PMTU discovery fails with source-specific routing

2019-05-13 Thread Wei Wang
Thanks Mikael for reporting this issue. And thanks David for the bisection. Let me spend some time to reproduce it and see what is going on. From: David Ahern Date: Mon, May 13, 2019 at 8:35 PM To: Mikael Magnusson, , Martin KaFai Lau, Wei Wang > On 5/13/19 1:22 PM, Mikael Magnusson wr

Re: [PATCH net] ipv6: A few fixes on dereferencing rt->from

2019-04-30 Thread Wei Wang
king is also needed on rt->from for a similar reason. >Note that inet6_rtm_getroute() is using RTNL_FLAG_DOIT_UNLOCKED. > > Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") > Signed-off-by: Martin KaFai Lau > --- Acked-by: Wei Wang Nice fix. Tha

Re: [PATCH net] ipv6: fix races in ip6_dst_destroy()

2019-04-29 Thread Wei Wang
c R09: 0004c0d1 > R10: 02341940 R11: 000000000246 R12: > R13: 7ffeafc2a7f0 R14: 0004c065 R15: 7ffeafc2a800 > > Fixes: a68886a69180 ("net/ipv6: Make from in rt6_info rcu protected") > Signed-off-by: Eric

Re: [PATCH net-next 3/3] ipv6: convert fib6_ref to refcount_t

2019-04-23 Thread Wei Wang
On Mon, Apr 22, 2019 at 6:35 PM Eric Dumazet wrote: > > We suspect some issues involving fib6_ref 0 -> 1 transitions might > cause strange syzbot reports. > > Lets convert fib6_ref to refcount_t to catch them earlier. > > Signed-off-by: Eric Dumazet > Cc: Wei Wan

Re: [PATCH net-next 1/3] ipv6: fib6_info_destroy_rcu() cleanup

2019-04-23 Thread Wei Wang
sh_exceptions() > under the protection of rt6_exception_lock. > > Signed-off-by: Eric Dumazet > Cc: Wei Wang > --- Acked-by: Wei Wang > net/ipv6/ip6_fib.c | 5 + > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib

Re: [PATCH net-next 2/3] ipv6: broadly use fib6_info_hold() helper

2019-04-23 Thread Wei Wang
Eric Dumazet > Cc: Wei Wang > --- Acked-by: Wei Wang > net/ipv6/ip6_fib.c | 16 > 1 file changed, 8 insertions(+), 8 deletions(-) > > diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c > index > 55193859152969794dab3df02637217a7f21016f..a5e83593e0e45c

Re: [PATCH net-next] net: dst: remove gc leftovers

2019-03-20 Thread Wei Wang
On Wed, Mar 20, 2019 at 12:03 PM Julian Wiedmann wrote: > > Get rid of some obsolete gc-related documentation and macros that were > missed in commit 5b7c9a8ff828 ("net: remove dst gc related code"). > > CC: Wei Wang > Signed-off-by: Julian Wiedmann > --- Acked-b

Re: [PATCH net-next] ipv6: Remove fallback argument from ip6_hold_safe

2019-03-20 Thread Wei Wang
On Wed, Mar 20, 2019 at 9:24 AM David Ahern wrote: > > From: David Ahern > > net and null_fallback are redundant. Remove null_fallback in favor of > !net check. > > Signed-off-by: David Ahern > --- Acked-by: Wei Wang > net/ipv6/route.c | 13 ++--- >

Re: [PATCHv2 net] ipv6: make ip6_create_rt_rcu return ip6_null_entry instead of NULL

2019-03-20 Thread Wei Wang
> > So we fix it by simply making ip6_create_rt_rcu() return ip6_null_entry > instead of NULL. > > v1->v2: > - move down 'fallback:' to make it more readable. > > Fixes: e873e4b9cc7e ("ipv6: use fib6_info_hold_safe() when necessary") > Reported-by:

Re: [PATCH net] ipv6: make ip6_create_rt_rcu return ip6_null_entry instead of NULL

2019-03-18 Thread Wei Wang
On Mon, Mar 18, 2019 at 12:48 PM David Ahern wrote: > > On 3/18/19 12:36 PM, Xin Long wrote: > > diff --git a/net/ipv6/route.c b/net/ipv6/route.c > > index 4ef4bbd..754777d 100644 > > --- a/net/ipv6/route.c > > +++ b/net/ipv6/route.c > > @@ -1040,13 +1040,17 @@ static struct rt6_info *ip6_create_r

[PATCH net-next 0/2] tcp: change pingpong to 3 in delayed ack logic

2019-01-25 Thread Wei Wang
. Wei Wang (2): tcp: Refactor pingpong code tcp: change pingpong threshold to 3 include/net/inet_connection_sock.h | 25 + net/dccp/input.c | 2 +- net/dccp/timer.c | 4 ++-- net/ipv4/tcp.c | 10 +- net

[PATCH net-next 1/2] tcp: Refactor pingpong code

2019-01-25 Thread Wei Wang
s a pure refactor and sets foundation for the next patch. This patch itself does not change any pingpong logic. Signed-off-by: Wei Wang Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet --- include/net/inet_connection_sock.h | 17 + net/dccp/input.c |

[PATCH net-next 2/2] tcp: change pingpong threshold to 3

2019-01-25 Thread Wei Wang
pattern afterwards. Signed-off-by: Wei Wang Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet --- include/net/inet_connection_sock.h | 10 +- net/ipv4/tcp_output.c | 15 +-- 2 files changed, 18 insertions(+), 7 deletions(-) diff --git a/include/net

Re: [PATCH iproute2] ss: add support for bytes_sent, bytes_retrans, dsack_dups and reord_seen

2018-11-29 Thread Wei Wang
On Thu, Nov 29, 2018 at 2:28 AM Eric Dumazet wrote: > > Wei Wang added these fields in linux-4.19 > > Tested: > > ss -ti ... > > ts sack cubic wscale:8,8 rto:7 rtt:2.678/0.267 mss:1428 pmtu:1500 > rcvmss:536 advmss:1428 cwnd:91 ssthresh:65 > (*) bytes_

[PATCH net] ipv6: take rcu lock in rawv6_send_hdrinc()

2018-10-04 Thread Wei Wang
From: Wei Wang In rawv6_send_hdrinc(), in order to avoid an extra dst_hold(), we directly assign the dst to skb and set passed in dst to NULL to avoid double free. However, in error case, we free skb and then do stats update with the dst pointer passed in. This causes use-after-free on the dst

Re: [PATCH net 2/2] ipv6: fix memory leak on dst->_metrics

2018-09-18 Thread Wei Wang
On Tue, Sep 18, 2018 at 4:25 PM David Ahern wrote: > > On 9/18/18 1:45 PM, Wei Wang wrote: > > From: Wei Wang > > > > When dst->_metrics and f6i->fib6_metrics share the same memory, both > > take reference count on the dst_metrics structure. However, when d

[PATCH net 2/2] ipv6: fix memory leak on dst->_metrics

2018-09-18 Thread Wei Wang
From: Wei Wang When dst->_metrics and f6i->fib6_metrics share the same memory, both take reference count on the dst_metrics structure. However, when dst is destroyed, ip6_dst_destroy() only invokes dst_destroy_metrics_generic() which does not take care of READONLY metrics and does not r

[PATCH net 1/2] Revert "ipv6: fix double refcount of fib6_metrics"

2018-09-18 Thread Wei Wang
From: Wei Wang This reverts commit e70a3aad44cc8b24986687ffc98c4a4f6ecf25ea. This change causes use-after-free on dst->_metrics. The crash trace looks like this: [ 97.763269] BUG: KASAN: use-after-free in ip6_mtu+0x116/0x140 [ 97.769038] Read of size 4 at addr 881781d2cf84 by t

[PATCH net 0/2] ipv6: fix issues on accessing fib6_metrics

2018-09-18 Thread Wei Wang
From: Wei Wang The latest fix on the memory leak of fib6_metrics still causes use-after-free. This patch series first reverts the previous fix and proposes a new fix that is more in line with ipv4 logic and is tested to fix the use-after-free issue reported. Wei Wang (2): Revert "ipv6

Re: BUG: unable to handle kernel paging request in fib6_node_lookup_1

2018-09-05 Thread Wei Wang
On Tue, Sep 4, 2018 at 11:11 PM Song Liu wrote: > > We are debugging an issue with fib6_node_lookup_1(). > > We use a 4.16 based kernel, and we have back ported most upstream > patches in ip6_fib.{c.h}. The only major differences I can spot are > > 8b7f2731bd68d83940714ce92381d1a72596407c > c35063

[PATCH net v2] l2tp: use sk_dst_check() to avoid race on sk->sk_dst_cache

2018-08-10 Thread Wei Wang
From: Wei Wang In l2tp code, if it is an L2TP_UDP_ENCAP tunnel, tunnel->sk points to a UDP socket. A user could call sendmsg() on both this tunnel and the UDP socket itself concurrently. As l2tp_xmit_skb() holds the socket lock and calls __sk_dst_check() to refresh sk->sk_dst_cache, while udpv6_s

[PATCH net] l2tp: use sk_dst_check() to avoid race on sk->sk_dst_cache

2018-08-09 Thread Wei Wang
From: Wei Wang In l2tp code, if it is an L2TP_UDP_ENCAP tunnel, tunnel->sk points to a UDP socket. A user could call sendmsg() on both this tunnel and the UDP socket itself concurrently. As l2tp_xmit_skb() holds the socket lock and calls __sk_dst_check() to refresh sk->sk_dst_cache, while udpv6_s

[PATCH v2 net-next 0/5] tcp: add 4 new stats

2018-07-31 Thread Wei Wang
From: Wei Wang This patch series adds 3 RFC4898 stats: 1. tcpEStatsPerfHCDataOctetsOut 2. tcpEStatsPerfOctetsRetrans 3. tcpEStatsStackDSACKDups and an additional stat to record the number of data packet reordering events seen: 4. tcp_reord_seen Together with the existing stats, application can

[PATCH v2 net-next 2/5] tcp: add data bytes sent stats

2018-07-31 Thread Wei Wang
From: Wei Wang Introduce a new TCP stat to record the number of bytes sent (RFC4898 tcpEStatsPerfHCDataOctetsOut) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Acked-by: Soheil

[PATCH v2 net-next 5/5] tcp: add stat of data packet reordering events

2018-07-31 Thread Wei Wang
From: Wei Wang Introduce a new TCP stat to record the number of reordering events seen and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Application can use this stat to track the frequency of the reordering events in addition to the existing reordering

[PATCH v2 net-next 4/5] tcp: add dsack blocks received stats

2018-07-31 Thread Wei Wang
From: Wei Wang Introduce a new TCP stat to record the number of DSACK blocks received (RFC4898 tcpEStatsStackDSACKDups) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Acked-by

[PATCH v2 net-next 3/5] tcp: add data bytes retransmitted stats

2018-07-31 Thread Wei Wang
From: Wei Wang Introduce a new TCP stat to record the number of bytes retransmitted (RFC4898 tcpEStatsPerfOctetsRetrans) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Acked-by

[PATCH v2 net-next 1/5] tcp: add a helper to calculate size of opt_stats

2018-07-31 Thread Wei Wang
From: Wei Wang This is to refactor the calculation of the size of opt_stats to a helper function to make the code cleaner and easier for later changes. Suggested-by: Stephen Hemminger Signed-off-by: Wei Wang Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Acked-by: Soheil Hassas Yeganeh

Re: [PATCH net-next 2/4] tcp: add data bytes retransmitted stats

2018-07-30 Thread Wei Wang
On Mon, Jul 30, 2018 at 3:14 PM Stephen Hemminger wrote: > > On Mon, 30 Jul 2018 14:59:09 -0700 > Wei Wang wrote: > > > + stats = alloc_skb(9 * nla_total_size_64bit(sizeof(u64)) + > > 7 * nla_total_size(sizeof(u32)) + > >

[PATCH net-next 0/4] tcp: add 4 new stats

2018-07-30 Thread Wei Wang
From: Wei Wang This patch series adds 3 RFC4898 stats: 1. tcpEStatsPerfHCDataOctetsOut 2. tcpEStatsPerfOctetsRetrans 3. tcpEStatsStackDSACKDups and an additional stat to record the number of data packet reordering events seen: 4. tcp_reord_seen Together with the existing stats, application can

[PATCH net-next 2/4] tcp: add data bytes retransmitted stats

2018-07-30 Thread Wei Wang
From: Wei Wang Introduce a new TCP stat to record the number of bytes retransmitted (RFC4898 tcpEStatsPerfOctetsRetrans) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Acked-by

[PATCH net-next 1/4] tcp: add data bytes sent stats

2018-07-30 Thread Wei Wang
From: Wei Wang Introduce a new TCP stat to record the number of bytes sent (RFC4898 tcpEStatsPerfHCDataOctetsOut) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Acked-by: Soheil

[PATCH net-next 3/4] tcp: add dsack blocks received stats

2018-07-30 Thread Wei Wang
From: Wei Wang Introduce a new TCP stat to record the number of DSACK blocks received (RFC4898 tcpEStatsStackDSACKDups) and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Signed-off-by: Wei Wang Signed-off-by: Eric Dumazet Acked-by: Neal Cardwell Acked-by

[PATCH net-next 4/4] tcp: add stat of data packet reordering events

2018-07-30 Thread Wei Wang
From: Wei Wang Introduce a new TCP stat to record the number of reordering events seen and expose it in both tcp_info (TCP_INFO) and opt_stats (SOF_TIMESTAMPING_OPT_STATS). Applications can use this stat to track the frequency of the reordering events in addition to the existing reordering

[PATCH net] ipv6: use fib6_info_hold_safe() when necessary

2018-07-21 Thread Wei Wang
From: Wei Wang In the code path where only rcu read lock is held, e.g. in the route lookup code path, it is not safe to directly call fib6_info_hold() because the fib6_info may already have been deleted but still exists in the rcu grace period. Holding reference to it could cause double free and
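The idea behind a "safe" hold can be sketched in userspace with C11 atomics: take a reference only if the count has not already dropped to zero. This is an illustration of the increment-if-not-zero pattern under that assumption, not the kernel code; `struct entry_sketch`, `entry_hold_safe` and `entry_release` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object standing in for a route entry. */
struct entry_sketch {
    atomic_int ref;
};

/* Take a reference only while the count is still non-zero. A count of
 * zero means the last owner has already released the object, so bumping
 * it again would resurrect an entry that is about to be freed. */
static bool entry_hold_safe(struct entry_sketch *e)
{
    int old = atomic_load(&e->ref);

    while (old != 0) {
        /* On failure `old` is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&e->ref, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; returns true when the caller released the last one
 * and is therefore responsible for freeing the object. */
static bool entry_release(struct entry_sketch *e)
{
    return atomic_fetch_sub(&e->ref, 1) == 1;
}
```

A lookup path that holds only a read-side lock would check the return value of entry_hold_safe() and treat a false result as "entry not found" instead of using the object.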

[PATCH net-next] tcp: ignore rcv_rtt sample with old ts ecr value

2018-06-19 Thread Wei Wang
From: Wei Wang When receiving multiple packets with the same ts ecr value, only try to compute rcv_rtt sample with the earliest received packet. This is because the rcv_rtt calculated by later received packets could possibly include long idle time or other types of delay. For example: (1) server
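The filtering rule described above can be sketched as a small state machine: remember the last echoed timestamp and take a sample only when a new value shows up. This is a simplified userspace illustration; `struct rcv_rtt_state` and `rcv_rtt_should_sample` are hypothetical names, and the actual RTT arithmetic is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-connection state: the last ts ecr value that was
 * used for a receiver-side RTT sample. */
struct rcv_rtt_state {
    uint32_t last_tsecr;
    bool have_tsecr;
};

/* Return true if a packet carrying `tsecr` should produce a sample.
 * Only the earliest packet with a given ts ecr value qualifies; later
 * packets echoing the same value may include idle time or other delay
 * and are skipped. */
static bool rcv_rtt_should_sample(struct rcv_rtt_state *st, uint32_t tsecr)
{
    if (st->have_tsecr && st->last_tsecr == tsecr)
        return false;

    st->last_tsecr = tsecr;
    st->have_tsecr = true;
    return true;
}
```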

[PATCH bpf-next] bpf: prevent non-IPv4 socket to be added into sock hash

2018-05-30 Thread Wei Wang
From: Wei Wang Sock hash only supports IPv4 socket proto right now. If a non-IPv4 socket gets stored in the BPF map, sk->sk_prot gets overwritten with the v4 tcp prot. Syzkaller reported the following related issue on an IPv6 socket: BUG: KASAN: slab-out-of-bounds in ip6_dst_idev include/

[PATCH bpf] bpf: prevent non-ipv4 socket to be added into sock map

2018-05-30 Thread Wei Wang
From: Wei Wang Sock map only supports IPv4 socket proto right now. If a non-IPv4 socket gets stored in the BPF map, sk->sk_prot gets overwritten with the v4 tcp prot. It could potentially cause issues when invoking functions from sk->sk_prot later in the stack. Fixes: 174a79ff9515

[PATCH net-next] tcp: remove mss check in tcp_select_initial_window()

2018-04-26 Thread Wei Wang
From: Wei Wang In tcp_select_initial_window(), we only set rcv_wnd to tcp_default_init_rwnd() if current mss > (1 << wscale). Otherwise, rcv_wnd is kept at the full receive space of the socket which is a value way larger than tcp_default_init_rwnd(). With larger initial rcv_wnd value

Re: [PATCH net] ipv6: fix possible deadlock in rt6_age_examine_exception()

2018-03-23 Thread Wei Wang
_call_chain+0x2d/0x40 kernel/notifier.c:401 > call_netdevice_notifiers_info+0x32/0x70 net/core/dev.c:1707 > call_netdevice_notifiers net/core/dev.c:1725 [inline] > __dev_notify_flags+0x262/0x430 net/core/dev.c:6960 > dev_change_flags+0xf5/0x140 net/core/dev.c:6994 > devinet_ioc

Re: [PATCH RFC net-next 16/20] net/ipv6: Cleanup exception route handling

2018-02-26 Thread Wei Wang
On Mon, Feb 26, 2018 at 3:02 PM, David Ahern wrote: > On 2/26/18 3:29 PM, Wei Wang wrote: >> On Sun, Feb 25, 2018 at 11:47 AM, David Ahern wrote: >>> IPv6 FIB will only contain FIB entries with exception routes added to >>> the FIB entry. Remove CACHE and dst chec

Re: [PATCH RFC net-next 10/20] net/ipv6: move expires into rt6_info

2018-02-26 Thread Wei Wang
On Mon, Feb 26, 2018 at 2:55 PM, David Ahern wrote: > On 2/26/18 3:28 PM, Wei Wang wrote: >>> @@ -213,11 +234,6 @@ static inline void rt6_set_expires(struct rt6_info >>> *rt, unsigned long expires) >>> >>> static inline void rt6_update_ex

Re: [PATCH RFC net-next 07/20] net/ipv6: Move nexthop data to fib6_nh

2018-02-26 Thread Wei Wang
On Mon, Feb 26, 2018 at 2:47 PM, David Ahern wrote: > On 2/26/18 3:28 PM, Wei Wang wrote: >> On Sun, Feb 25, 2018 at 11:47 AM, David Ahern wrote: >>> Introduce fib6_nh structure and move nexthop related data from >>> rt6_info and rt6_info.dst to fib6_nh. Re

Re: [PATCH RFC net-next 16/20] net/ipv6: Cleanup exception route handling

2018-02-26 Thread Wei Wang
On Sun, Feb 25, 2018 at 11:47 AM, David Ahern wrote: > IPv6 FIB will only contain FIB entries with exception routes added to > the FIB entry. Remove CACHE and dst checks from fib6 add and delete since > they can never happen once the data type changes. > > Fixup the lookup functions to use a f6i n

Re: [PATCH RFC net-next 10/20] net/ipv6: move expires into rt6_info

2018-02-26 Thread Wei Wang
On Sun, Feb 25, 2018 at 11:47 AM, David Ahern wrote: > Add expires to rt6_info for FIB entries, and add fib6 helpers to > manage it. Data path use of dst.expires remains. > > Signed-off-by: David Ahern > --- > include/net/ip6_fib.h | 26 +- > net/ipv6/addrconf.c | 6 ++

Re: [PATCH RFC net-next 07/20] net/ipv6: Move nexthop data to fib6_nh

2018-02-26 Thread Wei Wang
On Sun, Feb 25, 2018 at 11:47 AM, David Ahern wrote: > Introduce fib6_nh structure and move nexthop related data from > rt6_info and rt6_info.dst to fib6_nh. References to dev, gateway or > lwtstate from a FIB lookup perspective are converted to use fib6_nh; > datapath references to dst version ar
