On 9/24/20 7:48 AM, Stephen Suryaputra wrote:
> On Wed, Sep 23, 2020 at 07:47:16PM -0600, David Ahern wrote:
>> If I remove the fib rules and add VRF route leaking from core to tenant
>> it works. Why is that not an option? Overlapping tenant addresses?
>
> Exactly.
>
>> One thought to get around it is adding support for a new FIB rule type
>> -- say l3mdev_port. That rule can look at the real ingress device which
>> is saved in the skb->cb as IPCB(skb)->iif.
>
> OK. Just to ensure that the existing ip rule behavior isn't considered a
> bug.
>
> We have multiple options on the table right now. One that can be done
> without writing any code is to use an nft prerouting rule to mark
> the packet with iif equals the tunnel and use ip rule fwmark to lookup
> the right table.
>
> ip netns exec r0 nft add table ip c2t
> ip netns exec r0 nft add chain ip c2t prerouting '{ type filter hook prerouting priority 0; policy accept; }'
> ip netns exec r0 nft rule ip c2t prerouting iif gre01 mark set 101 counter
> ip netns exec r0 ip rule add fwmark 101 table 10 pref 999
>
> ip netns exec r1 nft add table ip c2t
> ip netns exec r1 nft add chain ip c2t prerouting '{ type filter hook prerouting priority 0; policy accept; }'
> ip netns exec r1 nft rule ip c2t prerouting iif gre10 mark set 101 counter
> ip netns exec r1 ip rule add fwmark 101 table 10 pref 999
>
> But this doesn't seem to work on my Ubuntu VM with the namespaces
> script, i.e. pinging from h0 to h1. The packet doesn't egress r1_v11. It
> does work on our target, based on 4.14 kernel.
Add debugging to fib_rule_match() in net/core/fib_rules.c to see whether
flowi_mark is being set properly. There could easily be places that were
missed.

Or, if it works on one setup but not another, compare the sysctl settings
for net.core and net.ipv4.

>
> We also notice, though, on the target platform that the ip rule fwmark
> doesn't seem to change the skb->dev to the vrf of the lookup table.

I'm not following that statement. fwmark should be marking the skb, not
changing skb->dev.

> E.g., ping from 10.0.0.1 to 11.0.0.1. With net.ipv4.fwmark_reflect set,
> the reply is generated but the originating ping application doesn't get
> the packet. I suspect it is caused by the socket being bound to the tenant
> vrf. I haven't been able to repro this because of the problem with the
> nft approach above.
>
> Thanks,
> Stephen.
>
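
For the sysctl comparison between the Ubuntu VM and the 4.14 target, something along these lines might do as a rough sketch. The function name sysctl_diff and the dump file names are made up for illustration; it assumes each dump was captured with `sysctl -a` on the respective system (inside the relevant namespace if the settings are per-netns).

```shell
#!/bin/sh
# Hypothetical helper, not part of any existing tool.
# Capture a dump on each system first, e.g.:
#   sysctl -a 2>/dev/null > vm.txt       (file name is an assumption)
#   sysctl -a 2>/dev/null > target.txt   (file name is an assumption)
#
# sysctl_diff FILE_A FILE_B
# Prints the net.core and net.ipv4 lines that differ between the dumps.
# Lines only in FILE_A are unindented; lines only in FILE_B are
# prefixed with a tab (standard comm(1) column layout).
sysctl_diff() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || { rm -f "$a_sorted"; return 1; }
    # Keep only the two subtrees of interest and sort them with the same
    # collation, since comm(1) requires sorted input.
    grep -E '^net\.(core|ipv4)\.' "$1" | LC_ALL=C sort > "$a_sorted"
    grep -E '^net\.(core|ipv4)\.' "$2" | LC_ALL=C sort > "$b_sorted"
    # -3 suppresses lines common to both files.
    LC_ALL=C comm -3 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

Running `sysctl_diff vm.txt target.txt` after sourcing the file then lists only the settings whose values differ or that exist on one side only.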