** Description changed:

  This is a public version of https://bugs.launchpad.net/bugs/2049792
  
  Backport "[SRF] performance: hwmon: (coretemp) Fix core count
  limitation" (merged upstream in 6.9) to Jammy.
  
- [Description]
-   coretemp driver supports at most 128 cores per package. Cores higher than 128 will lose their core temperature information.
-   Some SRF SKUs have more than 128 cores per package and triggers the issue.
+ [Impact]
+ 
+ In Linux 6.8, the coretemp driver supports at most 128 cores per package.
+ Cores beyond that limit lose their core temperature information.
+ 
+ An upstream patch set lifts the limit to support more than 128 cores
+ per package; it was applied to linux-next and then to Noble.
+ 
+ We should apply the patch set to the Jammy 5.15 kernel so that we can
+ properly support systems with a large number of cores per package.
+ 
+ [Test case]
+ 
+ Read temperature info from /sys/class/hwmon on a system with more than
+ 128 cores per package (this requires specific hardware, so we don't have
+ a proper test case to verify the fix at the moment).
  
  [Fix]
+ 
  The following series of patches implements this improvement:
+ 
  1a793caf6f69 hwmon: (coretemp) Use dynamic allocated memory for core temp_data
  18b24a5f9ca3 hwmon: (coretemp) Remove redundant temp_data->is_pkg_data
  326241f71f3d hwmon: (coretemp) Split package temp_data and core temp_data
  b0b01414a261 hwmon: (coretemp) Abstract core_temp helpers
  87eb801925a0 hwmon: (coretemp) Remove redundant pdata->cpu_map[]
  18d8f5583388 hwmon: (coretemp) Replace sensor_device_attribute with device_attribute
  25f8e01baa05 hwmon: (coretemp) Remove unnecessary dependency of array index
  c8c2074020a8 hwmon: (coretemp) Introduce enum for attr index
+ 
  Some additional patches are required to make the backport apply cleanly:
+ 
  34cf8c657cf03 hwmon: (coretemp) Enlarge per package core count limit
  fdaf0c8629d45 hwmon: (coretemp) Fix bogus core_id to attr name mapping
  4e440abc89458 hwmon: (coretemp) Fix out-of-bounds memory access
  a2930f6dc90f0 hwmon: (coretemp) Delete an obsolete comment
  6c2b659913ad9 hwmon: (coretemp) Delete tjmax debug message
  0f8b916bc5b5d hwmon: (coretemp) avoid RDMSR interrupts to isolated CPUs
  fae30e3c203e0 hwmon: (coretemp) Add support for dynamic ttarget
  c0c67f8761cec hwmon: (coretemp) Add support for dynamic tjmax
  2bc0e6d07ee50 hwmon: (coretemp) rearrange tjmax handing code
  5c0e64dde80ff hwmon: (coretemp) Remove obsolete temp_data->valid
  
- Only 5c0e64dde80ff has to be modified as it's delete a variable which changed type
+ Only 5c0e64dde80ff has to be modified, because it deletes a variable whose type changed
  because of a refactoring.
  
- [Test]
- Verify on specific hardware if we can read temperature accordingly.
+ There are a number of commits, but they all change only one file.
+ 
+ [Regression potential]
+ 
+ We may experience hwmon-related regressions: systems may read
+ incorrect temperature information, or even hit bugs or crashes when
+ accessing data from /sys/class/hwmon.

** Changed in: linux (Ubuntu Jammy)
       Status: New => In Progress

** Changed in: linux (Ubuntu Jammy)
     Assignee: (unassigned) => Thibf (thibf)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2058668

Title:
   [SRF] performance: hwmon: (coretemp) Fix core count limitation

Status in linux package in Ubuntu:
  New
Status in linux source package in Jammy:
  In Progress

Bug description:
  This is a public version of https://bugs.launchpad.net/bugs/2049792

  Backport "[SRF] performance: hwmon: (coretemp) Fix core count
  limitation" (merged upstream in 6.9) to Jammy.

  [Impact]

  In Linux 6.8, the coretemp driver supports at most 128 cores per package.
  Cores beyond that limit lose their core temperature information.

  An upstream patch set lifts the limit to support more than 128 cores
  per package; it was applied to linux-next and then to Noble.

  We should apply the patch set to the Jammy 5.15 kernel so that we can
  properly support systems with a large number of cores per package.

  [Test case]

  Read temperature info from /sys/class/hwmon on a system with more than
  128 cores per package (this requires specific hardware, so we don't have
  a proper test case to verify the fix at the moment).
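
  On suitable hardware, the check could be scripted along the following
  lines. This is only a sketch, not a validated test: the helper names
  (read_text, coretemp_readings, core_count) are hypothetical, and it
  assumes that coretemp devices report "coretemp" in their hwmon "name"
  file and label per-core sensors as "Core N" in tempN_label, with
  tempN_input in millidegrees Celsius.

```python
#!/usr/bin/env python3
"""List coretemp readings exposed under /sys/class/hwmon (illustrative sketch)."""
import glob
import os
import re


def read_text(path):
    """Return the stripped contents of a small sysfs-style text file."""
    with open(path) as f:
        return f.read().strip()


def coretemp_readings(root="/sys/class/hwmon"):
    """Return {device_dir: {label: millidegrees_celsius}} for coretemp devices."""
    readings = {}
    for dev in sorted(glob.glob(os.path.join(root, "hwmon*"))):
        try:
            if read_text(os.path.join(dev, "name")) != "coretemp":
                continue
        except OSError:
            continue
        temps = {}
        for label_path in glob.glob(os.path.join(dev, "temp*_label")):
            input_path = label_path[: -len("_label")] + "_input"
            try:
                temps[read_text(label_path)] = int(read_text(input_path))
            except (OSError, ValueError):
                # A sensor that cannot be read is skipped rather than fatal.
                continue
        readings[dev] = temps
    return readings


def core_count(temps):
    """Count per-core sensors, assuming they are labelled 'Core N'."""
    return sum(1 for label in temps if re.fullmatch(r"Core \d+", label))


if __name__ == "__main__":
    for dev, temps in coretemp_readings().items():
        print(f"{dev}: {core_count(temps)} core sensors")
        for label, value in sorted(temps.items()):
            print(f"  {label}: {value / 1000:.1f} C")
```

  The printed number of core sensors per coretemp device can then be
  compared by hand against the number of cores the package is known to
  have; a count capped below the real core count would point at the
  limitation described above.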

  [Fix]

  The following series of patches implements this improvement:

  1a793caf6f69 hwmon: (coretemp) Use dynamic allocated memory for core temp_data
  18b24a5f9ca3 hwmon: (coretemp) Remove redundant temp_data->is_pkg_data
  326241f71f3d hwmon: (coretemp) Split package temp_data and core temp_data
  b0b01414a261 hwmon: (coretemp) Abstract core_temp helpers
  87eb801925a0 hwmon: (coretemp) Remove redundant pdata->cpu_map[]
  18d8f5583388 hwmon: (coretemp) Replace sensor_device_attribute with device_attribute
  25f8e01baa05 hwmon: (coretemp) Remove unnecessary dependency of array index
  c8c2074020a8 hwmon: (coretemp) Introduce enum for attr index

  Some additional patches are required to make the backport apply cleanly:

  34cf8c657cf03 hwmon: (coretemp) Enlarge per package core count limit
  fdaf0c8629d45 hwmon: (coretemp) Fix bogus core_id to attr name mapping
  4e440abc89458 hwmon: (coretemp) Fix out-of-bounds memory access
  a2930f6dc90f0 hwmon: (coretemp) Delete an obsolete comment
  6c2b659913ad9 hwmon: (coretemp) Delete tjmax debug message
  0f8b916bc5b5d hwmon: (coretemp) avoid RDMSR interrupts to isolated CPUs
  fae30e3c203e0 hwmon: (coretemp) Add support for dynamic ttarget
  c0c67f8761cec hwmon: (coretemp) Add support for dynamic tjmax
  2bc0e6d07ee50 hwmon: (coretemp) rearrange tjmax handing code
  5c0e64dde80ff hwmon: (coretemp) Remove obsolete temp_data->valid

  Only 5c0e64dde80ff has to be modified, because it deletes a variable
  whose type changed because of a refactoring.

  There are a number of commits, but they all change only one file.

  [Regression potential]

  We may experience hwmon-related regressions: systems may read
  incorrect temperature information, or even hit bugs or crashes when
  accessing data from /sys/class/hwmon.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2058668/+subscriptions

