On Thu, Jul 5, 2018 at 10:07 PM, Xin Long <lucien....@gmail.com> wrote:
> On Thu, Jul 5, 2018 at 9:18 PM, David Ahern <dsah...@gmail.com> wrote:
>> On 7/5/18 1:57 AM, Xin Long wrote:
>>> On Thu, Jul 5, 2018 at 2:36 AM, David Ahern <dsah...@gmail.com> wrote:
>>>> On 7/4/18 11:56 AM, Xin Long wrote:
>>>>
>>>>>> your commands are not a proper test. The test should succeed and fail
>>>>>> based on the routing lookup, not iptables rules.
>>>>> A proper test can be done easily with netns, as vrf can't isolate much.
>>>>> I don't want to bother forwarding/ directory with netns, so I will probably
>>>>> just drop this selftest, and let the feature patch go first.
>>>>>
>>>>
>>>> BTW, VRF isolates at the routing layer and this is a routing change. We
>>>> need to understand why it does not work with VRF. Perhaps another tweak
>>>> is needed for VRF.
>>> One problem was that the peer may not use the address on the dev
>>> that echo_request comes from as the src IP of echo_reply when the
>>> echo_request's dst IP is broadcast, but try to get another one by
>>> looking up a route without ".flowi4_oif" set. See:
>>>
>>> icmp_reply()->fib_compute_spec_dst():
>>>         struct flowi4 fl4 = {
>>>                 .flowi4_iif = LOOPBACK_IFINDEX,
>>>                 .daddr = ip_hdr(skb)->saddr,
>>>                 .flowi4_tos = RT_TOS(ip_hdr(skb)->tos),
>>>                 .flowi4_scope = scope,
>>>                 .flowi4_mark = IN_DEV_SRC_VMARK(in_dev) ? skb->mark : 0,
>>>         };
>>>         if (!fib_lookup(net, &fl4, &res, 0))
>>>                 return FIB_RES_PREFSRC(net, res);
>>>
>>>
>>> Without ".flowi4_oif" set, it won't match the vrf route. That's why
>>> I had to make h2 NOT into a vrf so that h1 can get the echo_reply.
>>> But it can't tell if this echo_reply is from h2 or r1, as r1's echo_reply
>>> will also use the same src IP which is actually got from main route
>>> space as ".flowi4_oif" is not set.
>>> (hope this description is clear to you) :)
>>>
>>> So I'm not sure if we can do any tweak for VRF.
>>>
>>
>> Try this:
>>
>> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
>> index b21833651394..e46cdd310e5f 100644
>> --- a/net/ipv4/fib_frontend.c
>> +++ b/net/ipv4/fib_frontend.c
>> @@ -300,6 +300,7 @@ __be32 fib_compute_spec_dst(struct sk_buff *skb)
>>         if (!ipv4_is_zeronet(ip_hdr(skb)->saddr)) {
>>                 struct flowi4 fl4 = {
>>                         .flowi4_iif = LOOPBACK_IFINDEX,
>> +                       .flowi4_oif = l3mdev_master_ifindex_rcu(dev),
>>                         .daddr = ip_hdr(skb)->saddr,
>>                         .flowi4_tos = RT_TOS(ip_hdr(skb)->tos),
>>                         .flowi4_scope = scope,

If this patch can be applied, I would be able to make a proper selftest like:
...
ping_test_from()
{
	local oif=$1
	local dip=$2
	local from=$3
	local fail=${4:-0}

	RET=0

	ip vrf exec $(master_name_get $oif) \
		$PING -I $oif $dip -c 10 -i 0.1 -w 2 -b 2>&1 | grep $from &> /dev/null
	check_err_fail $fail $?
	log_test "ping_test_from"
}

ping_ipv4()
{
	sysctl_set net.ipv4.icmp_echo_ignore_broadcasts 0

	bc_forwarding_disable
	ping_test_from $h1 198.51.100.255 192.0.2.1
	ping_test_from $h1 198.51.200.255 192.0.2.1
	ping_test_from $h1 192.0.2.255 192.0.2.1
	ping_test_from $h1 255.255.255.255 192.0.2.1

	ping_test_from $h2 192.0.2.255 198.51.100.1
	ping_test_from $h2 198.51.200.255 198.51.100.1
	ping_test_from $h2 198.51.100.255 198.51.100.1
	ping_test_from $h2 255.255.255.255 198.51.100.1
	bc_forwarding_restore

	bc_forwarding_enable
	ping_test_from $h1 198.51.100.255 198.51.100.2
	ping_test_from $h1 198.51.200.255 198.51.200.2
	ping_test_from $h1 192.0.2.255 192.0.2.1 1
	ping_test_from $h1 255.255.255.255 192.0.2.1

	ping_test_from $h2 192.0.2.255 192.0.2.2
	ping_test_from $h2 198.51.200.255 198.51.200.2
	ping_test_from $h2 198.51.100.255 198.51.100.1 1
	ping_test_from $h2 255.255.255.255 198.51.100.1
	bc_forwarding_restore

	sysctl_restore net.ipv4.icmp_echo_ignore_broadcasts
}

> Great, with your fix, I can extend more for this selftest.
> but I hope no side effects would be caused.
>
> Thank you.
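
P.S. In case the optional fourth argument of ping_test_from is not obvious: it inverts the expectation, so the two calls ending in "1" pass only when no reply from the given address is seen. A minimal standalone sketch of that convention follows. Note that expect_result and reply_from are hypothetical names made up for this illustration, expect_result is only a stand-in for check_err_fail (whose real definition in the selftest library may differ), and the sample ping output line is illustrative rather than captured from a real run.

```shell
#!/bin/bash
# Hypothetical stand-in for check_err_fail; the real helper may differ.
# expect_result SHOULD_FAIL STATUS: succeed when STATUS matches the expectation.
expect_result()
{
	local should_fail=${1:-0}
	local status=$2

	if ((should_fail)); then
		[ "$status" -ne 0 ]	# pass only if the command failed
	else
		[ "$status" -eq 0 ]	# pass only if the command succeeded
	fi
}

# reply_from OUTPUT FROM [FAIL]: check whether OUTPUT contains a reply
# from address FROM; FAIL=1 means no such reply is expected.
reply_from()
{
	local output=$1
	local from=$2
	local fail=${3:-0}

	# -F matches the address literally; -q only sets the exit status.
	echo "$output" | grep -qF "from $from"
	expect_result "$fail" $?
}

# Illustrative ping output line (assumption, not from a real run).
sample="64 bytes from 198.51.100.2: icmp_seq=1 ttl=64"

reply_from "$sample" 198.51.100.2 && echo "pass"
reply_from "$sample" 192.0.2.1 1 && echo "pass (no reply expected)"
```

Defaulting the flag to 0 keeps the argument positions stable when the caller omits it, which is why the sketch uses ${3:-0} rather than a bare $3.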