On 7/5/18 1:57 AM, Xin Long wrote:
> On Thu, Jul 5, 2018 at 2:36 AM, David Ahern <dsah...@gmail.com> wrote:
>> On 7/4/18 11:56 AM, Xin Long wrote:
>>
>>>> your commands are not a proper test. The test should succeed and fail
>>>> based on the routing lookup, not iptables rules.
>>> A proper test can be done easily with netns, as vrf can't isolate much.
>>> I don't want to bother forwarding/ directory with netns, so I will probably
>>> just drop this selftest, and let the feature patch go first.
>>>
>>
>> BTW, VRF isolates at the routing layer and this is a routing change. We
>> need to understand why it does not work with VRF. Perhaps another tweak
>> is needed for VRF.
> One problem was that the peer may not use the address on the dev
> that echo_request comes from as the src IP of echo_reply when the
> echo_request's dst IP is broadcast, but try to get another one by
> looking up a route without ".flowi4_oif" set. See:
>
> icmp_reply()->fib_compute_spec_dst():
>         struct flowi4 fl4 = {
>                 .flowi4_iif = LOOPBACK_IFINDEX,
>                 .daddr = ip_hdr(skb)->saddr,
>                 .flowi4_tos = RT_TOS(ip_hdr(skb)->tos),
>                 .flowi4_scope = scope,
>                 .flowi4_mark = IN_DEV_SRC_VMARK(in_dev) ? skb->mark : 0,
>         };
>         if (!fib_lookup(net, &fl4, &res, 0))
>                 return FIB_RES_PREFSRC(net, res);
>
>
> Without ".flowi4_oif" set, it won't match the vrf route. That's why
> I had to make h2 NOT into a vrf so that h1 can get the echo_reply.
> But it can't tell if this echo_reply is from h2 or r1, as r1's echo_reply
> will also use the same src IP which is actually got from main route
> space as ".flowi4_oif" is not set.
> (hope I this description is clear to you) :)
>
> So i'm not sure if we can do any tweak for VRF.
>
Try this:

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index b21833651394..e46cdd310e5f 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -300,6 +300,7 @@ __be32 fib_compute_spec_dst(struct sk_buff *skb)
 	if (!ipv4_is_zeronet(ip_hdr(skb)->saddr)) {
 		struct flowi4 fl4 = {
 			.flowi4_iif = LOOPBACK_IFINDEX,
+			.flowi4_oif = l3mdev_master_ifindex_rcu(dev),
 			.daddr = ip_hdr(skb)->saddr,
 			.flowi4_tos = RT_TOS(ip_hdr(skb)->tos),
 			.flowi4_scope = scope,
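
To illustrate why the oif matters for this lookup, here is a toy user-space
model. It is not kernel code: every type and function below (toy_route,
toy_flow, toy_lookup, toy_spec_dst) is hypothetical, routes match on an exact
destination only, and table selection is reduced to "oif 0 means the main
table, any other oif means the table of that VRF master". It is only meant to
show the shape of the problem described above, assuming the real lookup picks
the VRF table when flowi4_oif carries the master ifindex.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
#define TOY_MAIN_TABLE 0	/* stands in for the main routing table */

struct toy_route {
	int table;		/* TOY_MAIN_TABLE, or a VRF master ifindex */
	unsigned int dst;	/* destination, exact match for simplicity */
	unsigned int prefsrc;	/* preferred source address of the route */
};

struct toy_flow {
	int oif;		/* 0 = unset, otherwise a VRF master ifindex */
	unsigned int daddr;
};

/* Return the first route in the table selected by fl->oif that matches daddr. */
static const struct toy_route *toy_lookup(const struct toy_route *routes,
					  size_t n, const struct toy_flow *fl)
{
	size_t i;

	assert(routes != NULL && fl != NULL);
	for (i = 0; i < n; i++) {
		if (routes[i].table == fl->oif && routes[i].dst == fl->daddr)
			return &routes[i];
	}
	return NULL;
}

/*
 * Pick the reply source address for a packet that came from saddr.
 * master_ifindex is 0 when the receiving device is not in a VRF.
 * Returns 0 when no route matches.
 */
static unsigned int toy_spec_dst(const struct toy_route *routes, size_t n,
				 int master_ifindex, unsigned int saddr)
{
	struct toy_flow fl = { .oif = master_ifindex, .daddr = saddr };
	const struct toy_route *rt = toy_lookup(routes, n, &fl);

	return rt ? rt->prefsrc : 0;
}
```

With one route to the same destination in the main table and another in a VRF
table, calling toy_spec_dst() with master_ifindex 0 returns the main table's
prefsrc, while passing the VRF master ifindex returns the VRF route's prefsrc.
That is the difference the one-line change above is after.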