On Thu, Jul 5, 2018 at 9:18 PM, David Ahern <dsah...@gmail.com> wrote:
> On 7/5/18 1:57 AM, Xin Long wrote:
>> On Thu, Jul 5, 2018 at 2:36 AM, David Ahern <dsah...@gmail.com> wrote:
>>> On 7/4/18 11:56 AM, Xin Long wrote:
>>>
>>>>> your commands are not a proper test. The test should succeed and fail
>>>>> based on the routing lookup, not iptables rules.
>>>> A proper test can be done easily with netns, as vrf can't isolate much.
>>>> I don't want to bother forwarding/ directory with netns, so I will probably
>>>> just drop this selftest, and let the feature patch go first.
>>>>
>>>
>>> BTW, VRF isolates at the routing layer and this is a routing change. We
>>> need to understand why it does not work with VRF. Perhaps another tweak
>>> is needed for VRF.
>> One problem was that the peer may not use the address on the dev
>> that echo_request comes from as the src IP of echo_reply when the
>> echo_request's dst IP is broadcast, but try to get another one by
>> looking up a route without ".flowi4_oif" set. See:
>>
>> icmp_reply()->fib_compute_spec_dst():
>>         struct flowi4 fl4 = {
>>                 .flowi4_iif = LOOPBACK_IFINDEX,
>>                 .daddr = ip_hdr(skb)->saddr,
>>                 .flowi4_tos = RT_TOS(ip_hdr(skb)->tos),
>>                 .flowi4_scope = scope,
>>                 .flowi4_mark = IN_DEV_SRC_VMARK(in_dev) ? skb->mark
>> : 0,
>>         };
>>         if (!fib_lookup(net, &fl4, &res, 0))
>>                 return FIB_RES_PREFSRC(net, res);
>>
>>
>> Without ".flowi4_oif" set, it won't match the vrf route. That's why
>> I had to make h2 NOT into a vrf so that h1 can get the echo_reply.
>> But it can't tell if this echo_reply is from h2 or r1, as r1's echo_reply
>> will also use the same src IP which is actually got from main route
>> space as ".flowi4_oif" is not set.
>> (hope this description is clear to you) :)
>>
>> So I'm not sure if we can do any tweak for VRF.
>>
>
> Try this:
>
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index b21833651394..e46cdd310e5f 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -300,6 +300,7 @@ __be32 fib_compute_spec_dst(struct sk_buff *skb)
>         if (!ipv4_is_zeronet(ip_hdr(skb)->saddr)) {
>                 struct flowi4 fl4 = {
>                         .flowi4_iif = LOOPBACK_IFINDEX,
> +                       .flowi4_oif = l3mdev_master_ifindex_rcu(dev),
>                         .daddr = ip_hdr(skb)->saddr,
>                         .flowi4_tos = RT_TOS(ip_hdr(skb)->tos),
>                         .flowi4_scope = scope,

Great, with your fix I can extend this selftest further, though I hope
it does not cause any side effects.
Thank you.
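
For illustration only, here is a toy userspace model of why the missing
".flowi4_oif" matters. It is not kernel code: the table IDs, the ifindex,
the addresses and every toy_* name are made-up assumptions, and the real
lookup goes through FIB rules and longest-prefix matching rather than this
first-match loop. The sketch only shows that a lookup with no oif lands in
the main table, while a lookup whose oif is the VRF master is steered to
the VRF table and so returns the VRF route's preferred source.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model, NOT kernel code. All names and values are hypothetical. */
#define TOY_TABLE_MAIN  254
#define TOY_TABLE_VRF   10
#define TOY_VRF_IFINDEX 5   /* assumed ifindex of the VRF master device */

struct toy_route {
	int      table;    /* routing table the route lives in */
	uint32_t prefix;   /* network address, host byte order */
	uint32_t mask;     /* netmask, host byte order */
	uint32_t prefsrc;  /* preferred source address of the route */
};

static const struct toy_route toy_routes[] = {
	/* 198.51.100.0/24 in the VRF table, prefsrc 198.51.100.2 */
	{ TOY_TABLE_VRF,  0xC6336400u, 0xFFFFFF00u, 0xC6336402u },
	/* default route in the main table, prefsrc 192.0.2.1 */
	{ TOY_TABLE_MAIN, 0x00000000u, 0x00000000u, 0xC0000201u },
};

/* Stand-in for an l3mdev-style rule: only a lookup whose oif is the
 * VRF master is directed to the VRF table; oif == 0 falls to main. */
static int toy_select_table(int oif)
{
	return oif == TOY_VRF_IFINDEX ? TOY_TABLE_VRF : TOY_TABLE_MAIN;
}

/* Return the preferred source for a reply to daddr, or 0 if no route
 * matches. First match wins; each toy table holds a single route. */
static uint32_t toy_spec_dst(uint32_t daddr, int oif)
{
	int table = toy_select_table(oif);
	size_t i;

	for (i = 0; i < sizeof(toy_routes) / sizeof(toy_routes[0]); i++) {
		const struct toy_route *r = &toy_routes[i];

		if (r->table == table && (daddr & r->mask) == r->prefix)
			return r->prefsrc;
	}
	return 0;
}
```

Calling toy_spec_dst(0xC6336401, 0) picks the main-table source, which
corresponds to the behavior described above; passing TOY_VRF_IFINDEX as
the oif picks the VRF route's source instead, which is the effect the
one-line change is aiming for.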