On 2020-06-26 8:52 a.m., Toke Høiland-Jørgensen wrote:
Davide Caratti <dcara...@redhat.com> writes:

hello,

my 2 cents:

On Thu, 2020-06-25 at 21:53 +0200, Toke Høiland-Jørgensen wrote:
I think it depends a little on the use case; some callers actually care
about the VLAN tags themselves and handle that specially (e.g.,
act_csum).

I remember taht something similar was discussed about 1 year ago [1].

Ah, thank you for the pointer!

Whereas others (e.g., sch_dsmark) probably will have the same
issue.

I'd say that the issue "propagates" to all qdiscs that mangle the ECN-CE
bit (i.e., calling INET_ECN_set_ce() [2]), most notably all the RED
variants and "codel/fq_codel".

Yeah, I think we should fix INET_ECN_set_ce() instead of re-implementing
it in CAKE. See below, though.

  I guess I can trying going through them all and figuring out if
there's a more generic solution.

For sch_cake, I think that the qdisc shouldn't look at the IP header when
it schedules packets having a VLAN tag.

Probably, when tc_skb_protocol() returns ETH_P_8021Q or ETH_P_8021AD, we
should look at the VLAN priority (PCP) bits (and that's something that
cake doesn't do currently - but I have a small patch in my stash that
implements this: please let me know if you are interested in seeing it :)
).

Then, to ensure that the IP precedence is respected, even with different
VLAN tags, users should explicitly configure TC filters that "map" the
DSCP value to a PCP value. This would ensure that configured priority is
respected by the scheduler, and would also be flexible enough to allow
different "mappings".

I think you have this the wrong way around :)

I.e., classifying based on VLAN priority is even more esoteric than
using diffserv markings, so that should not be the default. Making it
the default would also make the behaviour change for the same traffic if
there's a VLAN tag present, which is bound to confuse people. I suppose
it could be an option, but not really sure that's needed, since as you
say you could just implement it with your own TC filters...

Sure, my proposal does not cover the problem of mangling the CE bit
inside VLAN-tagged packets, i.e. if we should understand if qdiscs
should allow it or not.

Hmm, yeah, that's the rub, isn't it? I think this is related to this
commit, which first introduced tc_skb_protocol():

d8b9605d2697 ("net: sched: fix skb->protocol use in case of accelerated vlan 
path")


I didnt quiet follow the discussion - but the patch you are referencing
was to fix an earlier commit which had broken things (we didnt
have the "fixes" tag back then).

That commit at least made the behaviour consistent across
accelerated/non-accelerated VLANs. However, the patch description
asserts that 'tc code .. expects vlan protocol type [in skb->protocol]'.
Looking at the various callers, I'm not actually sure that's true, in
the sense that most of the callers don't handle VLAN ethertypes at all,
but expects to find an IP header. This is the case for at least:

- act_ctinfo
- act_skbedit
- cls_flow
- em_ipset
- em_ipt
- sch_cake
- sch_dsmark

In fact the only caller that explicitly handles a VLAN ethertype seems
to be act_csum; and that handles it in a way that also just skips the
VLAN headers, albeit by skb_pull()'ing the header.


The earlier change broke a few things unfortunately. There was a more
recent discussion with Simon Horman that i cant find now on breakage
with some classifiers in presence of double vlans.
+cc Simon maybe he can find the discussion.

cls_api, em_meta and sch_teql don't explicitly handle it; but returning
the VLAN ethertypes to those does seem to make sense, since they just
pass the value somewhere else.

So my suggestion would be to add a new helper that skips the VLAN tags
and finds the L3 ethertype (i.e., basically cake_skb_proto() as
introduced in this patch), then switch all the callers listed above, as
well as the INET_ECN_set_ce() over to using that. Maybe something like
'skb_l3_protocol()' which could go into skbuff.h itself, so the ECN code
can also find it?

Any objections to this? It's not actually clear to me how the discussion
you quoted above landed; but this will at least make things consistent
across all the different actions/etc.


I didnt follow the original discussion - will try to read in order
to form an opinion.

cheers,
jamal

Reply via email to