Cc: netdev On 11/13/20 3:28 PM, Eelco Chaudron wrote: > > > On 13 Nov 2020, at 13:06, Ilya Maximets wrote: > >> On 11/13/20 11:04 AM, Eelco Chaudron wrote: >>> Add support for the dec_ttl action. Instead of programming the datapath with >>> a flow that matches the packet TTL and an IP set, use a single dec_ttl >>> action. >>> >>> The old behavior is kept if the new action is not supported by the datapath. >>> >>> # ovs-ofctl dump-flows br0 >>> cookie=0x0, duration=12.538s, table=0, n_packets=4, n_bytes=392, ip >>> actions=dec_ttl,NORMAL >>> cookie=0x0, duration=12.536s, table=0, n_packets=4, n_bytes=168, >>> actions=NORMAL >>> >>> # ping -c1 -t 20 192.168.0.2 >>> PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data. >>> IP (tos 0x0, ttl 19, id 45336, offset 0, flags [DF], proto ICMP (1), >>> length 84) >>> 192.168.0.1 > 192.168.0.2: ICMP echo request, id 8865, seq 1, length >>> 64 >>> >>> Linux netlink datapath support depends on upstream Linux commit: >>> 744676e77720 ("openvswitch: add TTL decrement action") >>> >>> >>> Note that in the Linux kernel tree the OVS_ACTION_ATTR_ADD_MPLS has been >>> defined, and to make sure the IDs are in sync, it had to be added to the >>> OVS source tree. This required some additional case statements, which >>> should be revisited once the OVS implementation is added. >>> >>> >>> Co-developed-by: Matteo Croce <mcr...@linux.microsoft.com> >>> Co-developed-by: Bindiya Kurle <bindiyaku...@gmail.com> >>> Signed-off-by: Eelco Chaudron <echau...@redhat.com> >>> >>> --- >>> v2: - Used definition instead of numeric value in format_dec_ttl_action() >>> - Changed format from "dec_ttl(ttl<=1(<actions>)) to >>> "dec_ttl(le_1(<actions>))" to be more in line with the check_pkt_len >>> action. >>> - Fixed parsing of "dec_ttl()" action for adding a dp flow. >>> - Cleaned up format_dec_ttl_action() >>> >>> datapath/linux/compat/include/linux/openvswitch.h | 8 ++ >>> lib/dpif-netdev.c | 4 + >>> lib/dpif.c | 4 + >>> lib/odp-execute.c | 102 >>> ++++++++++++++++++++- >>> lib/odp-execute.h | 2 >>> lib/odp-util.c | 42 +++++++++ >>> lib/packets.h | 13 ++- >>> ofproto/ofproto-dpif-ipfix.c | 2 >>> ofproto/ofproto-dpif-sflow.c | 2 >>> ofproto/ofproto-dpif-xlate.c | 54 +++++++++-- >>> ofproto/ofproto-dpif.c | 37 ++++++++ >>> ofproto/ofproto-dpif.h | 6 + >>> 12 files changed, 253 insertions(+), 23 deletions(-) >>> >> >> <snip> >> >>> +static void >>> +format_dec_ttl_action(struct ds *ds,const struct nlattr *attr, >>> + const struct hmap *portno_names) >>> +{ >>> + const struct nlattr *nla_acts = nl_attr_get(attr); >>> + int len = nl_attr_get_size(attr); >>> + >>> + ds_put_cstr(ds,"dec_ttl(le_1("); >>> + >>> + if (len > 4 && nla_acts->nla_type == OVS_DEC_TTL_ATTR_ACTION) { >>> + /* Linux kernel add an additional envelope we should strip. */ >>> + len -= nl_attr_len_pad(nla_acts, len); >>> + nla_acts = nl_attr_next(nla_acts); >> >> CC: Pravin >> >> I looked at the kernel and I agree that there is a clear bug in kernel's >> implementaion of this action. It receives messages on format: >> OVS_ACTION_ATTR_DEC_TTL(<list of actions>), >> but reports back in format: >> OVS_ACTION_ATTR_DEC_TTL(OVS_DEC_TTL_ATTR_ACTION(<list of actions>)). >> >> Since 'OVS_DEC_TTL_ATTR_ACTION' exists, it's clear that original design >> was to have it, i.e. the correct format should be the form that >> kernel reports back to userspace. I'd guess that there was a plan to >> add more features to OVS_ACTION_ATTR_DEC_TTL in the future, e.g. set >> actions execution threshold to something different than 1, so it make >> some sense. >> >> Anyway, the bug is in the kernel part of parsing the netlink message and >> it should be fixed. > > It is already in the mainline kernel, so changing it now would break the UAPI. > Don't think this is allowed from the kernel side.
Well, UAPI is what specified in include/uapi/linux/openvswitch.h. And it says: OVS_ACTION_ATTR_DEC_TTL, /* Nested OVS_DEC_TTL_ATTR_*. */ So, the action must have nested OVS_DEC_TTL_ATTR_ACTION, otherwise it's malformed. This means that UAPI is broken now in terms that kernel doesn't respect it's own UAPI. And that's a bug that should be fixed. > >> What I'm suggesting is to send a bugfix to kernel >> to accept only format with nested OVS_DEC_TTL_ATTR_ACTION. Since this >> feature was never implemented in userspace OVS, this change will not >> break anything. On the OVS side we should always format netlink messages >> in a correct way. We have a code that checks feature existence in kernel >> and it should fail if kernel is broken (as it is right now). In this case >> OVS will continue to use old implementation with setting the ttl field. >> >> Since the patch for a kernel is a bug fix, it should be likely backported >> to stable versions, and distributions will pick it up too. > > If the kernel UAPI breakage is not a problem, I think this would work. > >> Thoughts? >> >> Best regards, Ilya Maximets. >