On 02/13, Alexei Starovoitov wrote:
> On Wed, Feb 13, 2019 at 09:57:25PM -0800, Stanislav Fomichev wrote:
> > 
> > > That 'stuck with __sk_buff' is what bothers me.
> > I might have use the wrong word here. I don't think there is another
> > option to be honest. Using __sk_buff makes flow dissector programs work
> > with fragmented packets;
> 
> good point. indeed real skb is essential.
> 
> > > It's an indication that api wasn't thought through if first thing
> > > it needs is this fake skb hack.
> > > If bpf_flow.c is a realistic example of such flow dissector prog
> > > it means that real skb fields are accessed.
> > > In particular skb->vlan_proto, skb->protocol.
> > I do manually set skb->protocol to eth->h_proto in my proposal. This is 
> > later
> > correctly handled by bpf_flow.c: parse_eth_proto() is called on 
> > skb->protocol
> > and we correctly handle bpf_htons(ETH_P_8021Q) there. So existing
> > bpf_flow.c works as expected.
> ...
> > The goal of this patch series was to essentially make this skb/no-skb
> > context transparent to the bpf_flow.c (i.e. no changes from the user
> > flow programs). Adding another flow dissector for eth_get_headlen case
> > also seems as a no go.
> 
> The problem with this thinking is assumption that bpf_flow.c is the only 
> program.
I agree, it's a bad assumption, but it is sort of a reference implementation,
I don't expect other users to do something wildly different. Hopefully :-)

> Since ctx of flow_dissector prog type is 'struct __sk_buff'
> all fields should be valid or the verifier has to reject access
> to fields that were not set.
> You cannot "manually set skb->protocol to eth->h_proto" in fake skb
> and ignore the rest.
Ugh, I did expect that we only allow a minimal set of __sk_buff fields
to be allowed from the flow dissector program type, but that's not the
case. We explicitly prohibit access only to
family/ips/ports/tc_classid/tstamp/wire_len, everything else is readable :-/
Any idea why?
Stuff like ingress_ifindex/ifindex/hash/mark/queue_mapping, does flow dissector
programs really need to know that?

For the most part, using zero-initialized fake skb looks fine, except:
* infindex, where we do skb->dev->ifndex (skb->dev is NULL)
* gso_segs, where we do skb_shinfo(skb)->gso_segs (we are missing
  shinfo)

So there is indeed a couple of problems.

How do you feel about tightening down the access to sk_buff fields from
the flow dissector program type? That is an API change, but I don't see why
existing users should use those fields. Let's allow access only to
len/data/data_end, protocol, vlan_{present,tci,proto}, cb, flow_keys,
that should be enough to dissect the packet (I also looked at C-based
implementation, it doesn't use anything besides that).
We can always rollback if somebody complains about.

Reply via email to