Public bug reported:

SRU Justification:

[Impact]
In Emerald Rapids VMs, a stack trace is printed during boot on 6.8-based kernels, but not on 6.5-based ones. The stack traces look like:

[ 1.206658] intel_pstate: Intel P-state driver initializing
[ 1.207453] unchecked MSR access error: WRMSR to 0x199 (tried to write 0x0000000000000800) at rIP: 0xffffffffb94c3b24 (native_write_msr+0x4/0x40)
[ 1.208422] Call Trace:
[ 1.208422] <TASK>
[ 1.208422] ? show_stack_regs+0x23/0x40
[ 1.208422] ? ex_handler_msr+0x10a/0x180
[ 1.208422] ? fixup_exception+0x183/0x390
[ 1.208422] ? gp_try_fixup_and_notify+0x23/0xc0
[ 1.208422] ? exc_general_protection+0x15e/0x480
[ 1.208422] ? asm_exc_general_protection+0x27/0x30
[ 1.208422] ? __pfx___wrmsr_on_cpu+0x10/0x10
[ 1.208422] ? native_write_msr+0x4/0x40
[ 1.208422] ? __wrmsr_on_cpu+0x4b/0x90
[ 1.208422] ? __pfx___rdmsr_on_cpu+0x10/0x10
[ 1.208422] ? __pfx___wrmsr_on_cpu+0x10/0x10
[ 1.208422] generic_exec_single+0x7e/0x120
[ 1.208422] smp_call_function_single+0x103/0x140
[ 1.208422] ? __pfx___wrmsr_on_cpu+0x10/0x10
[ 1.208422] wrmsrl_on_cpu+0x57/0x80
[ 1.208422] intel_pstate_set_pstate+0x3e/0x80
[ 1.208422] intel_pstate_get_cpu_pstates.constprop.0+0xd7/0x190
[ 1.208422] intel_pstate_init_cpu+0x3f/0x140
[ 1.208422] intel_cpufreq_cpu_init+0x44/0x270
[ 1.208422] ? freq_qos_add_notifier+0x45/0x80
[ 1.208422] cpufreq_online+0x444/0xb80
[ 1.208422] cpufreq_add_dev+0x99/0xd0
[ 1.208422] subsys_interface_register+0x11c/0x140
[ 1.208422] cpufreq_register_driver+0x1b5/0x330
[ 1.208422] intel_pstate_register_driver+0x48/0xd0
[ 1.208422] intel_pstate_init+0x25c/0x810
[ 1.208422] ? __pfx_intel_pstate_init+0x10/0x10
[ 1.208422] do_one_initcall+0x5b/0x310
[ 1.208422] do_initcalls+0x104/0x210
[ 1.208422] ? __pfx_kernel_init+0x10/0x10
[ 1.208422] kernel_init_freeable+0x134/0x1f0
[ 1.208422] kernel_init+0x1b/0x200
[ 1.208422] ret_from_fork+0x44/0x70
[ 1.208422] ? __pfx_kernel_init+0x10/0x10
[ 1.208422] ret_from_fork_asm+0x1b/0x30
[ 1.208422] </TASK>

[Fix]
These traces appear on the 6.8 kernel but not on the 6.5 kernel because the 6.5 kernel does not yet have pstate support for the CPU in the instance (Emerald Rapids), while the 6.8 kernel does.

In the 6.11 kernel, the patch
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e1c3f584ee78b0d0210fc424420d9529f3ca952
was added to prevent the driver from trying to set the p-states while in OOB (Out-of-band) mode. So, rather than the VM trying to set the p-states, they are handled by the host OS/platform.

[Test Plan]
I have tested this.

[Where problems could occur]
In the 6.11 kernel tree, where this change comes from, there has been some significant refactoring of the code. If other changes need to be back-ported to the AWS 6.8 kernel, conflicts or other issues may arise.

[Other]
SF #00395865

** Affects: linux-aws (Ubuntu)
     Importance: Undecided
     Assignee: Philip Cox (philcox)
         Status: Fix Released

** Affects: linux-aws (Ubuntu Noble)
     Importance: High
     Assignee: Philip Cox (philcox)
         Status: In Progress

** Also affects: linux-aws (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Changed in: linux-aws (Ubuntu)
       Status: New => Fix Released

** Changed in: linux-aws (Ubuntu Noble)
     Assignee: (unassigned) => Philip Cox (philcox)

** Changed in: linux-aws (Ubuntu Noble)
       Status: New => Confirmed

** Changed in: linux-aws (Ubuntu Noble)
       Status: Confirmed => Triaged

** Changed in: linux-aws (Ubuntu Noble)
       Status: Triaged => In Progress

** Changed in: linux-aws (Ubuntu Noble)
   Importance: Undecided => High

-- 
You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/2080569

Title:
  AWS: support Out-Of-Band pstate mode for Emerald Rapids

Status in linux-aws package in Ubuntu:
  Fix Released
Status in linux-aws source package in Noble:
  In Progress
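
One way to check for the symptom described under [Impact] is to scan the kernel log for the rejected write to MSR 0x199. The following is only a minimal sketch, not part of the reported test plan: the function name `find_pstate_msr_errors` and the `sample` text are hypothetical, and the pattern assumes the message format shown in the trace above.

```python
import re

# Matches the error line quoted in [Impact]: a faulting WRMSR to MSR 0x199.
# The capture group holds the value the kernel tried to write.
MSR_ERROR = re.compile(
    r"unchecked MSR access error: WRMSR to 0x199 "
    r"\(tried to write (0x[0-9a-fA-F]+)\)"
)


def find_pstate_msr_errors(dmesg_text):
    """Return the attempted write values, one per logged MSR 0x199 error."""
    return [m.group(1) for m in MSR_ERROR.finditer(dmesg_text)]


# Hypothetical sample built from the first two log lines of the report.
sample = (
    "[ 1.206658] intel_pstate: Intel P-state driver initializing\n"
    "[ 1.207453] unchecked MSR access error: WRMSR to 0x199 "
    "(tried to write 0x0000000000000800) at rIP: 0xffffffffb94c3b24 "
    "(native_write_msr+0x4/0x40)\n"
)

print(find_pstate_msr_errors(sample))  # ['0x0000000000000800']
```

On a booted instance, the text passed in would be the output of `dmesg` (which may require root). An empty list on a kernel carrying the fix, and a non-empty list on an unpatched 6.8 kernel, would be consistent with the behavior described in this bug.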