On 6/6/2019 4:42 AM, Cong Wang wrote: > On Tue, Jun 4, 2019 at 11:19 AM Eli Britstein <el...@mellanox.com> wrote: >> >> On 6/4/2019 8:55 PM, Cong Wang wrote: >>> On Sat, Jun 1, 2019 at 9:22 PM Eli Britstein <el...@mellanox.com> wrote: >>>> I think that's because QinQ, or VLAN is not an encapsulation. There is >>>> no outer/inner packets, and if you want to mangle fields in the packet >>>> you can do it and the result is well-defined. >>> Sort of, perhaps VLAN tags are too short to be called as an >>> encapsulation, my point is that it still needs some endpoints to push >>> or pop the tags, in a similar way we do encap/decap. >>> >>> >>>> BTW, the motivation for my fix was a use case were 2 VGT VMs >>>> communicating by OVS failed. Since OVS sees the same VLAN tag, it >>>> doesn't add explicit VLAN pop/push actions (i.e pop, mangle, push). If >>>> you force explicit pop/mangle/push you will break such applications. >>> From what you said, it seems act_csum is in the middle of packet >>> receive/transmit path. So, which is the one pops the VLAN tags in >>> this scenario? If the VM's are the endpoints, why not use act_csum >>> there? >> In a switchdev mode, we can passthru the VFs to VMs, and have their >> representors in the host, enabling us to manipulate the HW eswitch >> without knowledge of the VMs. >> >> To simplify it, consider the following setup: >> >> v1a <-> v1b and v2a <-> v2b are veth pairs. >> >> Now, we configure v1a.20 and v2a.20 as VLAN devices over v1a/v2a >> respectively (and put the "a" devs in separate namespaces). >> >> The TC rules are on the "b" devs, for example: >> >> tc filter add dev v1b ... action pedit ... action csum ... action >> redirect dev v2b >> >> Now, ping from v1a.20 to v1b.20. The namespaces transmit/receive tagged >> packets, and are not aware of the packet manipulation (and the required >> act_csum). > This is what I said, v1b is not the endpoint which pops the vlan tag, > v1b.20 is. So, why not simply move at least the csum action to > v1b.20? With that, you can still filter and redirect packets on v1b, > you still even modify it too, just defer the checksum fixup to the > endpoint.
There are no vxb.20 ports: ns0: v1a.20 ----(VLAN)---- v1a ns1: v2a ---- (VLAN) ---- v2a.20 |----(veth)---- v1b <---- (TC) ----> v2b ----(veth)----| > > And to be fair, if this case is a valid concern, so is VXLAN case, > just replace v1a.20 and v2a.20 with VXLAN tunnels. If you modify > the inner header, you have to fixup the checksum in the outer > UDP header. > > Thanks.