This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-zesty' to 'verification-done-zesty'. If the problem still exists, change the tag 'verification-needed-zesty' to 'verification-failed-zesty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

** Tags added: verification-needed-zesty

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1690914

Title:
  [Regression] NUMA_BALANCING disabled on arm64

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Zesty:
  Fix Committed

Bug description:
  [Impact]
  CONFIG_NUMA_BALANCING and CONFIG_NUMA_BALANCING_DEFAULT_ENABLED were both set to =y in hwe-x/hwe-y. This changed to =n in hwe-z, unintentionally as far as I can tell. This can degrade performance on NUMA-based arm64 systems: when threads migrate, their memory accesses incur additional latency.

  [Test Case]
  At a functional level:
    $ test -f /proc/sys/kernel/numa_balancing

  Performance:
    $ perf bench numa -a

  I didn't see any significant changes in the RAM-bw tests (expected).
  For the convergence tests, I observed the following results, which all appear to be within reasonable variance.

  Test      | Balancing=n | Balancing=y
  -------------------------------------
  1x3       | No-Converge | No-Converge
  1x4       | No-Converge | 0.576s
  1x6       | No-Converge | No-Converge
  2x3       | No-Converge | No-Converge
  3x3       | No-Converge | No-Converge
  4x4       | No-Converge | No-Converge
  4x4-NOTHP | No-Converge | No-Converge
  4x6       | No-Converge | No-Converge
  4x8       | No-Converge | No-Converge
  8x4       | No-Converge | No-Converge
  8x4-NOTHP | No-Converge | No-Converge
  3x1       | 0.848s      | 1.212s
  4x1       | 0.832s      | 0.712s
  8x1       | 0.792s      | 0.649s
  16x1      | 1.511s      | 1.485s
  32x1      | 0.750s      | 0.899s

  Finally, for the bw tests, I see significant improvements across the board:

  Test      | BW Improvement
  -------------------------
  ======= Process =========
  2x1       | 2.2%
  3x1       | 61.4%
  4x1       | 25.0%
  8x1       | 104.6%
  8x1-NOTHP | 107.6%
  16x1      | 200.9%
  ======= Thread ==========
  4x1       | 10.9%
  8x1       | 107.4%
  16x1      | 230.7%
  32x1      | 239.7%
  2x3       | 13.5%
  4x4       | 69.2%
  4x6       | 84.4%
  4x8       | 79.7%
  4x8-NOTHP | 152.5%
  3x3       | 96.1%
  5x5       | 150.2%
  2x16      | 122.6%
  1x32      | 40.5%

  [Regression Risk]
  This changes a config option only on arm64, so the regression risk is limited to those platforms. The code we will be enabling on arm64 is already enabled on other architectures (all but s390x), so it has already been tested within Ubuntu zesty. It was also previously enabled on arm64 in hwe-x/hwe-y, so we can gain some confidence from that.

  There is certainly a possibility that this negatively impacts performance for certain workloads on NUMA/arm64 systems. If that occurs, there is a sysctl that can be used to disable this feature.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1690914/+subscriptions
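
For anyone verifying this, the functional check from [Test Case] can be wrapped in a small helper. This is only a sketch, not part of the fix: the function name numa_balancing_state is made up for illustration, and it assumes the sysctl file is /proc/sys/kernel/numa_balancing, which is absent when the kernel is built without CONFIG_NUMA_BALANCING and contains 0 when the feature is turned off.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): report the state of
# automatic NUMA balancing.
# $1: path to the sysctl file; defaults to the procfs location.
numa_balancing_state() {
    path="${1:-/proc/sys/kernel/numa_balancing}"
    if [ ! -f "$path" ]; then
        # Assumption: the file does not exist when CONFIG_NUMA_BALANCING=n.
        echo "unsupported"
        return 0
    fi
    value=$(cat "$path")
    if [ "$value" = "0" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}
```

Calling numa_balancing_state with no argument on a kernel from -proposed should print "enabled" if the fix works as described. If a workload regresses, the sysctl mentioned in [Regression Risk] is presumably kernel.numa_balancing, which can be cleared at runtime as root with: sysctl -w kernel.numa_balancing=0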