On 02/13, Alexei Starovoitov wrote: > On Wed, Feb 13, 2019 at 09:57:25PM -0800, Stanislav Fomichev wrote: > > > > > That 'stuck with __sk_buff' is what bothers me. > > I might have use the wrong word here. I don't think there is another > > option to be honest. Using __sk_buff makes flow dissector programs work > > with fragmented packets; > > good point. indeed real skb is essential. > > > > It's an indication that api wasn't thought through if first thing > > > it needs is this fake skb hack. > > > If bpf_flow.c is a realistic example of such flow dissector prog > > > it means that real skb fields are accessed. > > > In particular skb->vlan_proto, skb->protocol. > > I do manually set skb->protocol to eth->h_proto in my proposal. This is > > later > > correctly handled by bpf_flow.c: parse_eth_proto() is called on > > skb->protocol > > and we correctly handle bpf_htons(ETH_P_8021Q) there. So existing > > bpf_flow.c works as expected. > ... > > The goal of this patch series was to essentially make this skb/no-skb > > context transparent to the bpf_flow.c (i.e. no changes from the user > > flow programs). Adding another flow dissector for eth_get_headlen case > > also seems as a no go. > > The problem with this thinking is assumption that bpf_flow.c is the only > program. I agree, it's a bad assumption, but it is sort of a reference implementation, I don't expect other users to do something wildly different. Hopefully :-)
> Since ctx of flow_dissector prog type is 'struct __sk_buff' > all fields should be valid or the verifier has to reject access > to fields that were not set. > You cannot "manually set skb->protocol to eth->h_proto" in fake skb > and ignore the rest. Ugh, I did expect that we only allow a minimal set of __sk_buff fields to be allowed from the flow dissector program type, but that's not the case. We explicitly prohibit access only to family/ips/ports/tc_classid/tstamp/wire_len, everything else is readable :-/ Any idea why? Stuff like ingress_ifindex/ifindex/hash/mark/queue_mapping, does flow dissector programs really need to know that? For the most part, using zero-initialized fake skb looks fine, except: * infindex, where we do skb->dev->ifndex (skb->dev is NULL) * gso_segs, where we do skb_shinfo(skb)->gso_segs (we are missing shinfo) So there is indeed a couple of problems. How do you feel about tightening down the access to sk_buff fields from the flow dissector program type? That is an API change, but I don't see why existing users should use those fields. Let's allow access only to len/data/data_end, protocol, vlan_{present,tci,proto}, cb, flow_keys, that should be enough to dissect the packet (I also looked at C-based implementation, it doesn't use anything besides that). We can always rollback if somebody complains about.