{mpellizzer, joseogando} Speaking of verification, the issue only appears after a significant amount of uptime (i.e. the kernels started crashing months after the initial deployment).
The BF3s that were crashing did not crash again immediately after power cycling, despite carrying the same traffic flows that could generate the same TC filters. Reproducing the crash has a significant timing aspect, and we cannot just trigger it at a random point in time.

So, w.r.t. the above bot message, in all honesty, I cannot verify it in 5 working days because no set of actions has been identified yet that gives a 100% reproducer.

The best I can do is run scenarios with the built kernel 5.15.0-1069.71 on one of the machines I have access to. These involve creating a lot of tc filters and just making sure nothing breaks under normal conditions. Once there is a signed build that can be rolled out onto SecureBoot-enabled systems, I can roll it out on a large estate.

Would that work for you?

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-bluefield in Ubuntu.
https://bugs.launchpad.net/bugs/2109993

Title:
  linux-bluefield is vulnerable to CVE-2025-21857

Status in linux-bluefield package in Ubuntu:
  Confirmed
Status in linux-bluefield source package in Jammy:
  In Progress
Status in linux-bluefield source package in Noble:
  In Progress

Bug description:
  [ Impact ]

  net/sched: cls_api: fix error handling causing NULL dereference

  tcf_exts_miss_cookie_base_alloc() calls xa_alloc_cyclic() which can
  return 1 if the allocation succeeded after wrapping. This was treated
  as an error, with value 1 returned to caller tcf_exts_init_ex() which
  sets exts->actions to NULL and returns 1 to caller fl_change().

  fl_change() treats err == 1 as success, calling
  tcf_exts_validate_ex() which calls tcf_action_init() with
  exts->actions as argument, where it is dereferenced.

  [ Fix ]

  Cherry pick the fix commit from mainline:
  - 071ed42cff4f net/sched: cls_api: fix error handling causing NULL dereference

  [ Test Plan ]

  Compile tested.

  [ Where Problems Could Occur ]

  A regression here is unlikely due to the very limited scope of the
  patch.
  ---

  Currently linux-bluefield is vulnerable to
  https://ubuntu.com/security/CVE-2025-21857.

  I encountered instances of this on several hundred BF3 cards that
  crashed over time with a null pointer dereference causing outages.

  The latest Bluefield image builds are affected:
  https://github.com/Mellanox/bfb-build/blob/9e80eb358e7bb9e62328039745cc43d69eefc64a/ubuntu/22.04/Dockerfile#L33-L46
  (bf-bundle-2.10.0-147_25.01_ubuntu-22.04)

  The unpatched function in linux-bluefield:
  https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-bluefield/+git/jammy/tree/net/sched/cls_api.c?h=master-next#n99

  static int
  tcf_exts_miss_cookie_base_alloc(struct tcf_exts *exts, struct tcf_proto *tp,
                                  u32 handle)
  {
  // ...
          if (err)
                  goto err_xa_alloc;

  The upstream one-liner:
  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=3c74b5787caf59bb1e9c5fe0a360643a71eb1e8a

  diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
  index 8e47e5355be613..4f648af8cfaafe 100644
  --- a/net/sched/cls_api.c
  +++ b/net/sched/cls_api.c
  @@ -97,7 +97,7 @@ tcf_exts_miss_cookie_base_alloc(struct tcf_exts *exts, struct tcf_proto *tp,

          err = xa_alloc_cyclic(&tcf_exts_miss_cookies_xa, &n->miss_cookie_base,
                                n, xa_limit_32b, &next, GFP_KERNEL);
  -       if (err)
  +       if (err < 0)
                  goto err_xa_alloc;
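To illustrate why the one-line change matters, here is a minimal user-space sketch of the return-value convention described in this bug. It is not kernel code. Every name in it (model_ids, model_alloc_cyclic, model_exts, model_exts_init, model_caller_hits_null) is hypothetical and introduced only for illustration. The allocator model copies just the convention stated above for xa_alloc_cyclic() (0 on success, 1 on success after wrapping, negative errno on failure) and ignores everything else the real xarray does, such as skipping IDs that are still in use.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical user-space model, not kernel code. It mimics only the
 * return convention described in the bug: 0 on success, 1 on success
 * after the ID counter wrapped, negative errno on failure.
 */
struct model_ids {
	unsigned int next;  /* next ID to hand out */
	unsigned int max;   /* largest valid ID before wrapping back to 1 */
	int fail;           /* when non-zero, simulate an allocation failure */
};

int model_alloc_cyclic(struct model_ids *ids, unsigned int *id)
{
	int wrapped = 0;

	if (ids->fail)
		return -ENOMEM;
	if (ids->next > ids->max) {
		ids->next = 1;
		wrapped = 1;
	}
	*id = ids->next++;
	return wrapped;
}

struct model_exts {
	int *actions;         /* stands in for exts->actions */
	unsigned int cookie;  /* stands in for the miss cookie base */
};

static int model_actions[1];

/* fixed == 0 models the `if (err)` check, fixed != 0 models `if (err < 0)`. */
int model_exts_init(struct model_ids *ids, struct model_exts *exts, int fixed)
{
	int err;

	exts->actions = model_actions;
	err = model_alloc_cyclic(ids, &exts->cookie);
	if (fixed ? err < 0 : err != 0) {
		exts->actions = NULL;  /* error path clears the pointer */
		return err;            /* with the pre-fix check this can be 1 */
	}
	return 0;
}

/*
 * Models a caller that, like fl_change() in the description, treats only
 * negative values as errors. Returns 1 if the caller would carry on and
 * use a NULL actions pointer, 0 otherwise.
 */
int model_caller_hits_null(int err, const struct model_exts *exts)
{
	assert(exts != NULL);
	if (err < 0)
		return 0;  /* caller bails out cleanly */
	return exts->actions == NULL;
}
```

With max set to a small value the wrap can be forced after a handful of allocations. In this model the pre-fix check then returns 1 with actions cleared, which the caller takes for success, while the fixed check returns 0 with actions intact. On real systems the wrap would presumably only be reached after a very large number of allocations, which would be consistent with the crashes showing up only after months of uptime, but that link is an assumption and not something verified in this bug.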