Public bug reported:

We've discovered an issue on Ubuntu 20.04 when it is used with Kubernetes CNIs that perform offloading over Geneve tunnels. On Azure instances with accelerated networking, the kernel panics with the following errors:
[ 307.561223] mlx5_core 0001:00:02.0 enP1s1: Error cqe on cqn 0x200, ci 0x3d4, sqn 0x2c5, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
[ 307.573864] mlx5_core 0001:00:02.0 enP1s1: ERR CQE on SQ: 0x2c5
[ 307.764902] mlx5_core 0001:00:02.0 enP1s1: Error cqe on cqn 0x200, ci 0x3d7, sqn 0x2c5, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
[ 307.777332] mlx5_core 0001:00:02.0 enP1s1: ERR CQE on SQ: 0x2c5
[ 322.814393] mlx5_core 0001:00:02.0 enP1s1: Error cqe on cqn 0x218, ci 0x1a7, sqn 0x2bd, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
[ 322.826685] mlx5_core 0001:00:02.0 enP1s1: ERR CQE on SQ: 0x2bd

NVIDIA fixed this issue in
https://github.com/torvalds/linux/commit/5ccc0ecda9e8a67add654d93d7e0ac4346c0fa22 ,
so we're looking to have this backported to at least the linux-azure package.

** Affects: linux-azure (Ubuntu)
     Importance: Undecided
         Status: New

https://bugs.launchpad.net/bugs/1921769

Title:
  Backport mlx5e fix for tunnel offload