On 05/20/2018 03:58 PM, Mathieu Xhonneux wrote:
> The BPF seg6local hook should be powerful enough to enable users to
> implement most of the use-cases one could think of. After some thinking,
> we figured out that the following actions should be possible on a SRv6
> packet, requiring 3 specific helpers :
>     - bpf_lwt_seg6_store_bytes: Modify non-sensitive fields of the SRH
>     - bpf_lwt_seg6_adjust_srh: Allow to grow or shrink a SRH
>                                (to add/delete TLVs)
>     - bpf_lwt_seg6_action: Apply some SRv6 network programming actions
>                            (specifically End.X, End.T, End.B6 and
>                             End.B6.Encap)
>
> The specifications of these helpers are provided in the patch (see
> include/uapi/linux/bpf.h).
>
> The non-sensitive fields of the SRH are the following : flags, tag and
> TLVs. The other fields can not be modified, to maintain the SRH
> integrity. Flags, tag and TLVs can easily be modified as their validity
> can be checked afterwards via seg6_validate_srh. It is not allowed to
> modify the segments directly. If one wants to add segments on the path,
> he should stack a new SRH using the End.B6 action via
> bpf_lwt_seg6_action.
>
> Growing, shrinking or editing TLVs via the helpers will flag the SRH as
> invalid, and it will have to be re-validated before re-entering the IPv6
> layer. This flag is stored in a per-CPU buffer, along with the current
> header length in bytes.
>
> Storing the SRH len in bytes in the control block is mandatory when using
> bpf_lwt_seg6_adjust_srh. The Header Ext. Length field contains the SRH
> len rounded to 8 bytes (a padding TLV can be inserted to ensure the 8-bytes
> boundary). When adding/deleting TLVs within the BPF program, the SRH may
> temporary be in an invalid state where its length cannot be rounded to 8
> bytes without remainder, hence the need to store the length in bytes
> separately. The caller of the BPF program can then ensure that the SRH's
> final length is valid using this value. Again, a final SRH modified by a
> BPF program which doesn’t respect the 8-bytes boundary will be discarded
> as it will be considered as invalid.
>
> Finally, a fourth helper is provided, bpf_lwt_push_encap, which is
> available from the LWT BPF IN hook, but not from the seg6local BPF one.
> This helper allows to encapsulate a Segment Routing Header (either with
> a new outer IPv6 header, or by inlining it directly in the existing IPv6
> header) into a non-SRv6 packet. This helper is required if we want to
> offer the possibility to dynamically encapsulate a SRH for non-SRv6 packet,
> as the BPF seg6local hook only works on traffic already containing a SRH.
> This is the BPF equivalent of the seg6 LWT infrastructure, which achieves
> the same purpose but with a static SRH per route.
>
> These helpers require CONFIG_IPV6=y (and not =m).
>
> Signed-off-by: Mathieu Xhonneux <m.xhonn...@gmail.com>
> Acked-by: David Lebrun <dleb...@google.com>
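
For readers following the length bookkeeping described in the commit
message, here is a minimal userspace sketch of the arithmetic, not the
code from the patch. It assumes the tracked byte length excludes the
fixed first 8 bytes of the SRH (as the IPv6 Hdr Ext Len field does); the
names srh_pad_needed() and srh_finalize_hdrlen() are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helpers, for illustration only.
 * len_bytes: SRH length in bytes, NOT counting the fixed first 8 bytes
 * (assumption: same convention as the IPv6 Hdr Ext Len field).
 */

/* Padding bytes needed to reach the next 8-byte boundary (0..7). */
static uint32_t srh_pad_needed(uint32_t len_bytes)
{
	return (8 - (len_bytes & 7)) & 7;
}

/*
 * Convert a byte length back into Hdr Ext Len (8-byte units).
 * Returns 0 on success, -1 if the length is not a multiple of 8 or
 * does not fit in the 8-bit field, in which case the caller would
 * treat the SRH as invalid.
 */
static int srh_finalize_hdrlen(uint32_t len_bytes, uint8_t *hdr_ext_len)
{
	if (len_bytes & 7)
		return -1;
	if ((len_bytes >> 3) > UINT8_MAX)
		return -1;
	*hdr_ext_len = (uint8_t)(len_bytes >> 3);
	return 0;
}
```

A program that adds a 5-byte TLV to a 40-byte SRH would leave it at 45
bytes; srh_pad_needed(45) gives the 3 bytes of padding required before
srh_finalize_hdrlen() can succeed.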

One minor comment for follow-ups below.

> +BPF_CALL_4(bpf_lwt_seg6_store_bytes, struct sk_buff *, skb, u32, offset,
> +	   const void *, from, u32, len)
> +{
> +#if IS_ENABLED(CONFIG_IPV6_SEG6_BPF)
> +	struct seg6_bpf_srh_state *srh_state =
> +		this_cpu_ptr(&seg6_bpf_srh_states);
> +	void *srh_tlvs, *srh_end, *ptr;
> +	struct ipv6_sr_hdr *srh;
> +	int srhoff = 0;
> +
> +	if (ipv6_find_hdr(skb, &srhoff, IPPROTO_ROUTING, NULL, NULL) < 0)
> +		return -EINVAL;
> +
> +	srh = (struct ipv6_sr_hdr *)(skb->data + srhoff);
> +	srh_tlvs = (void *)((char *)srh + ((srh->first_segment + 1) << 4));
> +	srh_end = (void *)((char *)srh + sizeof(*srh) + srh_state->hdrlen);
> +
> +	ptr = skb->data + offset;
> +	if (ptr >= srh_tlvs && ptr + len <= srh_end)
> +		srh_state->valid = 0;
> +	else if (ptr < (void *)&srh->flags ||
> +		 ptr + len > (void *)&srh->segments)
> +		return -EFAULT;
> +
> +	if (unlikely(bpf_try_make_writable(skb, offset + len)))
> +		return -EFAULT;
> +
> +	memcpy(skb->data + offset, from, len);
> +	return 0;
> +#else /* CONFIG_IPV6_SEG6_BPF */
> +	return -EOPNOTSUPP;
> +#endif
> +}

Instead of handling this inside the helper, you can already reject the
program in lwt_*_func_proto() by returning NULL when
!CONFIG_IPV6_SEG6_BPF. That way programs are rejected at verification
time rather than failing at runtime, which makes it easier for the user
to probe whether the helper is available.
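
To illustrate the idea, a self-contained userspace mock (not kernel
code) could look roughly like this. Every identifier prefixed with
mock_ or MOCK_ is hypothetical and only stands in for the real
func_proto machinery; MOCK_IPV6_SEG6_BPF plays the role of the config
option.

```c
#include <stddef.h>

/* Stand-in for the kernel config option; 0 means "not built in". */
#ifndef MOCK_IPV6_SEG6_BPF
#define MOCK_IPV6_SEG6_BPF 0
#endif

/* Hypothetical helper IDs and prototype descriptor. */
enum mock_func_id {
	MOCK_FUNC_LWT_SEG6_STORE_BYTES,
	MOCK_FUNC_UNKNOWN,
};

struct mock_func_proto {
	const char *name;
};

#if MOCK_IPV6_SEG6_BPF
static const struct mock_func_proto mock_lwt_seg6_store_bytes_proto = {
	.name = "bpf_lwt_seg6_store_bytes",
};
#endif

/*
 * Returning NULL tells the (mock) verifier the helper does not exist,
 * so a program calling it is rejected at load time. The helper body
 * then needs no #else branch returning -EOPNOTSUPP at runtime.
 */
static const struct mock_func_proto *
mock_lwt_seg6local_func_proto(enum mock_func_id func_id)
{
	switch (func_id) {
#if MOCK_IPV6_SEG6_BPF
	case MOCK_FUNC_LWT_SEG6_STORE_BYTES:
		return &mock_lwt_seg6_store_bytes_proto;
#endif
	default:
		return NULL;
	}
}
```

With this shape, user space can probe for the helper simply by trying to
load a tiny program that calls it and checking whether verification
succeeds.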