On 7/26/2019 5:01 PM, Marcelo Ricardo Leitner wrote:
> On Fri, Jul 26, 2019 at 08:39:43PM +0800, wenxu wrote:
>>
>> On 2019/7/26 20:19, Or Gerlitz wrote:
>>> On Fri, Jul 26, 2019 at 12:24 AM Saeed Mahameed <sae...@mellanox.com> wrote:
>>>> On Thu, 2019-07-25 at 19:24 +0800, we...@ucloud.cn wrote:
>>>>> From: wenxu <we...@ucloud.cn>
>>>>>
>>>>> The flow_cls_common_offload prio is zero
>>>>>
>>>>> It leads the invalid table prio in hw.
>>>>>
>>>>> Error: Could not process rule: Invalid argument
>>>>>
>>>>> kernel log:
>>>>> mlx5_core 0000:81:00.0: E-Switch: Failed to create FDB Table err -22
>>>>> (table prio: 65535, level: 0, size: 4194304)
>>>>>
>>>>> table_prio = (chain * FDB_MAX_PRIO) + prio - 1;
>>>>> should check (chain * FDB_MAX_PRIO) + prio is not 0
>>>>>
>>>>> Signed-off-by: wenxu <we...@ucloud.cn>
>>>>> ---
>>>>> drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 4 +++-
>>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git
>>>>> a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>>>>> b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>>>>> index 089ae4d..64ca90f 100644
>>>>> --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>>>>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>>>>> @@ -970,7 +970,9 @@ static int esw_add_fdb_miss_rule(struct
>>>> this piece of code isn't in this function, weird how it got to the
>>>> diff, patch applies correctly though !
>>>>
>>>>> mlx5_eswitch *esw)
>>>>> flags |= (MLX5_FLOW_TABLE_TUNNEL_EN_REFORMAT |
>>>>> MLX5_FLOW_TABLE_TUNNEL_EN_DECAP);
>>>>>
>>>>> - table_prio = (chain * FDB_MAX_PRIO) + prio - 1;
>>>>> + table_prio = (chain * FDB_MAX_PRIO) + prio;
>>>>> + if (table_prio)
>>>>> + table_prio = table_prio - 1;
>>>>>
>>>> This is black magic, even before this fix.
>>>> this -1 seems to be needed in order to call
>>>> create_next_size_table(table_prio) with the previous "table prio" ?
>>>> (table_prio - 1) ?
>>>>
>>>> The whole thing looks wrong to me since when prio is 0 and chain is 0,
>>>> there is not such thing table_prio - 1.
>>>>
>>>> mlnx eswitch guys in the cc, please advise.
>>> basically, prio 0 is not something we ever get in the driver, since if
>>> user space
>>> specifies 0, the kernel generates some random non-zero prio, and we support
>>> only prios 1-16 -- Wenxu -- what do you run to get this error?
>>>
>>>
>> I run offload with nftables(but not tc), there is no prio for each rule.
>>
>> prio of flow_cls_common_offload init as 0.
>>
>> static void nft_flow_offload_common_init(struct flow_cls_common_offload
>> *common,
>>
>> __be16 proto,
>> struct netlink_ext_ack *extack)
>> {
>> common->protocol = proto;
>> common->extack = extack;
>> }
>>
>>
>> flow_cls_common_offload
>
> Note that on
> [PATCH net-next] netfilter: nf_table_offload: Fix zero prio of
> flow_cls_common_offload
> I asked Pablo on how nftables should behave on this situation.
>
> It's the same issue as in the patch above but being fixed at a
> different level.

That's better. The original code relied on prio 0 never being valid, and the suggested fix (net/mlx5e: Fix zero table prio set by user) maps NFT offload prio 0 and tc prio 1 to the same hardware table. This is wrong and can cause issues.
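
To make the collision concrete, here is a minimal standalone sketch of the two computations. It is not the driver code: the helper names are made up for illustration, FDB_MAX_PRIO is assumed to be 16 (from the "prios 1-16" remark earlier in the thread), and table_prio is assumed to be 16 bits wide, which is what the 65535 in the kernel log suggests.

```c
#include <stdint.h>

/* Assumption: 16 priorities per chain ("we support only prios 1-16"). */
#define FDB_MAX_PRIO 16

/*
 * Hypothetical helper mirroring the original expression.
 * It assumes prio >= 1; with chain 0 and prio 0 the subtraction wraps
 * and the 16-bit result is 65535, as seen in the kernel log.
 */
uint16_t table_prio_original(uint32_t chain, uint32_t prio)
{
	return (uint16_t)((chain * FDB_MAX_PRIO) + prio - 1);
}

/*
 * Hypothetical helper mirroring the expression in the proposed patch:
 * subtract 1 only when the sum is non-zero.
 * Chain 0 with prio 0 and chain 0 with prio 1 both yield 0.
 */
uint16_t table_prio_patched(uint32_t chain, uint32_t prio)
{
	uint16_t table_prio = (uint16_t)((chain * FDB_MAX_PRIO) + prio);

	if (table_prio)
		table_prio = table_prio - 1;
	return table_prio;
}
```

Under these assumptions, table_prio_patched(0, 0) and table_prio_patched(0, 1) return the same value, so an NFT offload rule with prio 0 and a tc rule with prio 1 would land in the same table.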