{mpellizzer, joseogando}

Speaking of verification, the issue only appears after a significant
amount of uptime (i.e. the kernels started crashing months after the
initial deployment).

The BF3s that were crashing did not crash again immediately after power
cycling, even though they saw the same traffic flows, which could
generate the same TC filters.

Reproducing it has a significant timing aspect involved and we cannot
just trigger it at a random point in time.

So, w.r.t. the above bot message, in all honesty, I cannot verify it in
5 working days because there isn't a set of actions identified yet to
trigger a 100% reproducer.

The best I can do is run scenarios with the built kernel 5.15.0-1069.71
on one of the machines I have access to, which involve creating a lot of
tc filters, and just make sure nothing breaks in normal conditions.

And once there is a signed build that can be rolled out onto
SecureBoot-enabled systems, I can roll it out on a large estate.

Would that work for you?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2109993

Title:
  linux-bluefield is vulnerable to CVE-2025-21857

Status in linux-bluefield package in Ubuntu:
  Confirmed
Status in linux-bluefield source package in Jammy:
  In Progress
Status in linux-bluefield source package in Noble:
  In Progress

Bug description:
  [ Impact ]

  net/sched: cls_api: fix error handling causing NULL dereference

  tcf_exts_miss_cookie_base_alloc() calls xa_alloc_cyclic() which can
  return 1 if the allocation succeeded after wrapping. This was treated as
  an error, with value 1 returned to caller tcf_exts_init_ex() which sets
  exts->actions to NULL and returns 1 to caller fl_change().

  fl_change() treats err == 1 as success, calling tcf_exts_validate_ex()
  which calls tcf_action_init() with exts->actions as argument, where it
  is dereferenced.

  [ Fix ]

  Cherry pick the fix commit from mainline:
  - 071ed42cff4f net/sched: cls_api: fix error handling causing NULL dereference

  [ Test Plan ]

  Compile tested.

  [ Where Problems Could Occur ]

  A regression here is unlikely due to the very limited scope
  of the patch.

  ---

  Currently linux-bluefield is vulnerable to
  https://ubuntu.com/security/CVE-2025-21857.

  I encountered instances of this on several hundred BF3 cards that
  crashed over time with a null pointer dereference causing outages.

  The latest Bluefield image builds are affected
  https://github.com/Mellanox/bfb-build/blob/9e80eb358e7bb9e62328039745cc43d69eefc64a/ubuntu/22.04/Dockerfile#L33-L46
  (bf-bundle-2.10.0-147_25.01_ubuntu-22.04)

  The unpatched function in linux-bluefield:
  https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-bluefield/+git/jammy/tree/net/sched/cls_api.c?h=master-next#n99

  static int
  tcf_exts_miss_cookie_base_alloc(struct tcf_exts *exts, struct tcf_proto *tp,
                                  u32 handle)
  {
          // ...
          if (err)
                  goto err_xa_alloc;

  The upstream one-liner:
  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3c74b5787caf59bb1e9c5fe0a360643a71eb1e8a

  diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
  index 8e47e5355be613..4f648af8cfaafe 100644
  --- a/net/sched/cls_api.c
  +++ b/net/sched/cls_api.c
  @@ -97,7 +97,7 @@ tcf_exts_miss_cookie_base_alloc(struct tcf_exts *exts, struct tcf_proto *tp,

          err = xa_alloc_cyclic(&tcf_exts_miss_cookies_xa, &n->miss_cookie_base,
                                n, xa_limit_32b, &next, GFP_KERNEL);
  -       if (err)
  +       if (err < 0)
                  goto err_xa_alloc;

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-bluefield/+bug/2109993/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
