Public bug reported:

SRU Justification:

[Impact]

In Emerald Rapids VMs, a stack trace is printed during boot on 6.8-based
kernels, but not on 6.5-based ones.  The stack trace looks like:

[    1.206658] intel_pstate: Intel P-state driver initializing
[    1.207453] unchecked MSR access error: WRMSR to 0x199 (tried to write 0x0000000000000800) at rIP: 0xffffffffb94c3b24 (native_write_msr+0x4/0x40)
[    1.208422] Call Trace:
[    1.208422]  <TASK>
[    1.208422]  ? show_stack_regs+0x23/0x40
[    1.208422]  ? ex_handler_msr+0x10a/0x180
[    1.208422]  ? fixup_exception+0x183/0x390
[    1.208422]  ? gp_try_fixup_and_notify+0x23/0xc0
[    1.208422]  ? exc_general_protection+0x15e/0x480
[    1.208422]  ? asm_exc_general_protection+0x27/0x30
[    1.208422]  ? __pfx___wrmsr_on_cpu+0x10/0x10
[    1.208422]  ? native_write_msr+0x4/0x40
[    1.208422]  ? __wrmsr_on_cpu+0x4b/0x90
[    1.208422]  ? __pfx___rdmsr_on_cpu+0x10/0x10
[    1.208422]  ? __pfx___wrmsr_on_cpu+0x10/0x10
[    1.208422]  generic_exec_single+0x7e/0x120
[    1.208422]  smp_call_function_single+0x103/0x140
[    1.208422]  ? __pfx___wrmsr_on_cpu+0x10/0x10
[    1.208422]  wrmsrl_on_cpu+0x57/0x80
[    1.208422]  intel_pstate_set_pstate+0x3e/0x80
[    1.208422]  intel_pstate_get_cpu_pstates.constprop.0+0xd7/0x190
[    1.208422]  intel_pstate_init_cpu+0x3f/0x140
[    1.208422]  intel_cpufreq_cpu_init+0x44/0x270
[    1.208422]  ? freq_qos_add_notifier+0x45/0x80
[    1.208422]  cpufreq_online+0x444/0xb80
[    1.208422]  cpufreq_add_dev+0x99/0xd0
[    1.208422]  subsys_interface_register+0x11c/0x140
[    1.208422]  cpufreq_register_driver+0x1b5/0x330
[    1.208422]  intel_pstate_register_driver+0x48/0xd0
[    1.208422]  intel_pstate_init+0x25c/0x810
[    1.208422]  ? __pfx_intel_pstate_init+0x10/0x10
[    1.208422]  do_one_initcall+0x5b/0x310
[    1.208422]  do_initcalls+0x104/0x210
[    1.208422]  ? __pfx_kernel_init+0x10/0x10
[    1.208422]  kernel_init_freeable+0x134/0x1f0
[    1.208422]  kernel_init+0x1b/0x200
[    1.208422]  ret_from_fork+0x44/0x70
[    1.208422]  ? __pfx_kernel_init+0x10/0x10
[    1.208422]  ret_from_fork_asm+0x1b/0x30
[    1.208422]  </TASK>
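
For anyone reading the trace, the key line is the WRMSR error. The sketch below pulls the MSR address and the written value out of that line. It assumes that 0x199 is IA32_PERF_CTL and that the requested ratio sits in bits 15:8 of the value, as I understand the Intel SDM; the helper name decode_msr_error is made up for illustration.

```python
import re

# The log line from the trace above, joined back into a single string.
LOG_LINE = (
    "[    1.207453] unchecked MSR access error: WRMSR to 0x199 "
    "(tried to write 0x0000000000000800) at rIP: 0xffffffffb94c3b24 "
    "(native_write_msr+0x4/0x40)"
)

# Assumption: 0x199 is IA32_PERF_CTL (per the Intel SDM MSR listing).
MSR_NAMES = {0x199: "IA32_PERF_CTL"}

# Matches only the WRMSR form of the message shown in this report.
PATTERN = re.compile(
    r"unchecked MSR access error: WRMSR to (?P<msr>0x[0-9a-fA-F]+) "
    r"\(tried to write (?P<value>0x[0-9a-fA-F]+)\)"
)


def decode_msr_error(line):
    """Return the MSR address, name, value and ratio field, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    msr = int(match.group("msr"), 16)
    value = int(match.group("value"), 16)
    return {
        "msr": msr,
        "name": MSR_NAMES.get(msr, "unknown"),
        "value": value,
        # Assumption: the target performance ratio is in bits 15:8.
        "ratio": (value >> 8) & 0xFF,
    }


if __name__ == "__main__":
    print(decode_msr_error(LOG_LINE))
```

Under those assumptions, the guest tried to write a ratio of 8 to the performance control register, and the write faulted.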

[Fix]

These traces appear with the 6.8 kernel but not with the 6.5 kernel
because the 6.5 kernel does not yet have pstate support for the CPU in
the instance (Emerald Rapids), while the 6.8 kernel does.

In the 6.11 kernel the patch
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7e1c3f584ee78b0d0210fc424420d9529f3ca952
was added to prevent trying to set the p-states while in OOB
(Out-of-band) mode.  With it, the VM no longer tries to set the
p-states; they are handled by the host OS/platform.
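
As a rough model of the OOB check, the sketch below shows the decision the driver makes at init. It is not the kernel code: the constants reflect my reading of drivers/cpufreq/intel_pstate.c (MSR_MISC_PWR_MGMT at 0x1aa, OOB flagged by bit 8 or bit 18) and should be treated as assumptions. The function name and the cpu_in_oob_list parameter are hypothetical.

```python
# Assumed constants, based on my reading of intel_pstate.c; verify against
# the kernel tree before relying on them.
MSR_MISC_PWR_MGMT = 0x1AA
BITMASK_OOB = (1 << 8) | (1 << 18)


def pstates_managed_out_of_band(misc_pwr_mgmt, cpu_in_oob_list):
    """Model of the init-time decision.

    misc_pwr_mgmt   -- value read from MSR_MISC_PWR_MGMT
    cpu_in_oob_list -- whether the CPU model is on the driver's OOB list
                       (hypothetical flag standing in for the ID table lookup)
    """
    # The MSR is only consulted for CPU models on the OOB list.
    return bool(cpu_in_oob_list and (misc_pwr_mgmt & BITMASK_OOB))


if __name__ == "__main__":
    oob_value = 1 << 8
    # Model not on the list: the driver goes on to set p-states, and the
    # write can fault as in the trace.
    print(pstates_managed_out_of_band(oob_value, cpu_in_oob_list=False))
    # Model on the list: the driver backs off and leaves p-states to the
    # platform.
    print(pstates_managed_out_of_band(oob_value, cpu_in_oob_list=True))
```

In this model, the effect of the patch is to flip cpu_in_oob_list to true for Emerald Rapids, so the driver stops before issuing the WRMSR.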


[Test Plan]

I have tested this.
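
One possible way to check a booted instance (not necessarily the procedure used for this SRU) is to scan the kernel log for the WRMSR error. The sketch below works on log text that has already been captured, for example from dmesg; the function name find_msr_errors is hypothetical.

```python
def find_msr_errors(log_text, msr=0x199):
    """Return kernel log lines reporting an unchecked WRMSR to `msr`."""
    needle = "unchecked MSR access error: WRMSR to {:#x}".format(msr)
    return [line for line in log_text.splitlines() if needle in line]


if __name__ == "__main__":
    # Sample text taken from the trace in this report.
    sample = (
        "[    1.206658] intel_pstate: Intel P-state driver initializing\n"
        "[    1.207453] unchecked MSR access error: WRMSR to 0x199 "
        "(tried to write 0x0000000000000800)\n"
    )
    hits = find_msr_errors(sample)
    print("trace present" if hits else "no trace found")
```

An affected 6.8 kernel should produce at least one hit; a kernel with the fix should produce none.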

[Where problems could occur]
The code was significantly refactored in the 6.11 kernel tree where this
change comes from.  If other changes need to be back-ported to the AWS
6.8 kernel, further conflicts or issues may arise.


[Other]
SF #00395865

** Affects: linux-aws (Ubuntu)
     Importance: Undecided
     Assignee: Philip Cox (philcox)
         Status: Fix Released

** Affects: linux-aws (Ubuntu Noble)
     Importance: High
     Assignee: Philip Cox (philcox)
         Status: In Progress

** Also affects: linux-aws (Ubuntu Noble)
   Importance: Undecided
       Status: New

** Changed in: linux-aws (Ubuntu)
       Status: New => Fix Released

** Changed in: linux-aws (Ubuntu Noble)
     Assignee: (unassigned) => Philip Cox (philcox)

** Changed in: linux-aws (Ubuntu Noble)
       Status: New => Confirmed

** Changed in: linux-aws (Ubuntu Noble)
       Status: Confirmed => Triaged

** Changed in: linux-aws (Ubuntu Noble)
       Status: Triaged => In Progress

** Changed in: linux-aws (Ubuntu Noble)
   Importance: Undecided => High

https://bugs.launchpad.net/bugs/2080569

Title:
  AWS: support Out-Of-Band pstate mode for Emerald Rapids

Status in linux-aws package in Ubuntu:
  Fix Released
Status in linux-aws source package in Noble:
  In Progress
