I believe we have been seeing this same issue on two systems with the
same drive model (Samsung SSD 990 PRO with Heatsink 1TB, firmware 4B2QJXD7).
I have just applied the "nvme_core.default_ps_max_latency_us=0"
workaround on one system and will report back on whether it has any effect.
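
For anyone else trying the workaround, here is a minimal Python sketch for
checking after a reboot that the parameters actually reached the kernel. The
parameter names come from the kernel message quoted in the bug description;
the helper name missing_workarounds is only illustrative, and the sysfs path
is an assumption that nvme_core exposes the parameter there on the running
kernel.

```python
from pathlib import Path

# Parameters suggested by the kernel message quoted in the bug description.
SUGGESTED = {
    "nvme_core.default_ps_max_latency_us": "0",
    "pcie_aspm": "off",
    "pcie_port_pm": "off",
}


def missing_workarounds(cmdline: str) -> list[str]:
    """Return the suggested parameters that are absent or set differently."""
    seen = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            # Assumption for this sketch: a repeated parameter keeps its last value.
            seen[key] = value
    return [key for key, value in SUGGESTED.items() if seen.get(key) != value]


def main() -> None:
    proc_cmdline = Path("/proc/cmdline")
    if proc_cmdline.exists():
        print("missing:", missing_workarounds(proc_cmdline.read_text()))
    # Assumed sysfs location of the runtime value; skipped if not present.
    param = Path("/sys/module/nvme_core/parameters/default_ps_max_latency_us")
    if param.exists():
        print("default_ps_max_latency_us =", param.read_text().strip())


if __name__ == "__main__":
    main()
```

An empty "missing" list only shows that the parameters were passed at boot;
it says nothing about whether they prevent the failure.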

I agree with Jon; it is hard to spot this issue without remote logging,
since the SSD stops accepting writes until the system is forcibly
restarted.
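
Once logs are being collected on another machine, the failure can be picked
out by matching the messages quoted in the bug description. This is a rough
Python sketch, assuming the remote log is a plain-text file with one kernel
message per line; the function name find_nvme_failures and the patterns
chosen are illustrative, not an exhaustive list of what the driver may print.

```python
import re

# Message fragments taken from the kernel log quoted in the bug description.
SIGNATURES = [
    re.compile(r"nvme \S+: controller is down; will reset"),
    re.compile(r"Unable to change power state from D3cold to D0"),
    re.compile(r"nvme \S+: Disabling device after reset failure"),
]


def find_nvme_failures(lines):
    """Yield (line_number, line) for every line matching a known signature."""
    for number, line in enumerate(lines, start=1):
        if any(signature.search(line) for signature in SIGNATURES):
            yield number, line.rstrip("\n")
```

It can be fed an open file object, for example
"for n, line in find_nvme_failures(open(path)): print(n, line)", where path
is wherever the remote syslog server stores the kernel log.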

I am not sure if this is a hardware issue or a software issue, but it
does seem to be increasingly well documented with this particular drive.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2097618

Title:
  NVME Unable to change power state from D3cold to D0, device
  inaccessible

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  Hi, all access to the NVMe SSD fails "randomly" (typically overnight).
  (SSD: Samsung SSD 990 PRO with Heatsink 1TB, latest firmware: 4B2QJXD7)

  Through remote logging I acquired the kernel.log, suggesting a problem
  related to power saving:

  Feb  7 01:56:23 jonry-NUC7i kernel: nvme nvme0: controller is down; will reset: CSTS=0xffffffff, PCI_STATUS=0xffff
  Feb  7 01:56:23 jonry-NUC7i kernel: nvme nvme0: Does your device have a faulty power saving mode enabled?
  Feb  7 01:56:23 jonry-NUC7i kernel: nvme nvme0: Try "nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off" and report a bug
  Feb  7 01:56:23 jonry-NUC7i kernel: nvme 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
  Feb  7 01:56:23 jonry-NUC7i kernel: nvme nvme0: Disabling device after reset failure: -19

  After this, the disk becomes inaccessible and the system keeps
  throwing I/O errors until I force a reboot.

  All the best,
  Jon Ivar

  ProblemType: Bug
  DistroRelease: Ubuntu 24.04
  Package: linux-image-6.8.0-52-generic 6.8.0-52.53
  ProcVersionSignature: Ubuntu 6.8.0-52.53-generic 6.8.12
  Uname: Linux 6.8.0-52-generic x86_64
  ApportVersion: 2.28.1-0ubuntu3.1
  Architecture: amd64
  AudioDevicesInUse:
   USER        PID ACCESS COMMAND
   /dev/snd/seq:        jon        4264 F.... pipewire
   /dev/snd/controlC0:  jon        4270 F.... wireplumber
  CasperMD5CheckResult: unknown
  CurrentDesktop: ubuntu:GNOME
  Date: Fri Feb  7 14:42:15 2025
  EcryptfsInUse: Yes
  HibernationDevice: RESUME=UUID=d4767a08-b64e-455e-ae17-5e9b0e7d40ae
  InstallationDate: Installed on 2018-04-25 (2480 days ago)
  InstallationMedia: Ubuntu 16.04.4 LTS "Xenial Xerus" - Release amd64 (20180228)
  MachineType: ASUSTeK COMPUTER INC. NUC14RVH-B
  ProcEnviron:
   LANG=nb_NO.UTF-8
   LANGUAGE=nb_NO:nb:no_NO:no:nn_NO:nn:en
   PATH=(custom, no user)
   SHELL=/bin/bash
   XDG_RUNTIME_DIR=<set>
  ProcFB:
   0 simpledrmdrmfb
   1 i915drmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-6.8.0-52-generic root=UUID=04365e12-2b3f-4616-8ecd-7df28b7a87c2 ro nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off quiet splash nvme_core.default_ps_max_latency_us=0 pcie_aspm=off pcie_port_pm=off vt.handoff=7
  PulseList: Error: command ['pacmd', 'list'] failed with exit code 1: No PulseAudio daemon running, or not running as session daemon.
  RelatedPackageVersions:
   linux-restricted-modules-6.8.0-52-generic N/A
   linux-backports-modules-6.8.0-52-generic  N/A
   linux-firmware                            20240318.git3b128b60-0ubuntu2.5
  SourcePackage: linux
  UpgradeStatus: Upgraded to noble on 2024-10-29 (101 days ago)
  dmi.bios.date: 08/09/2024
  dmi.bios.release: 5.32
  dmi.bios.vendor: ASUSTeK COMPUTER INC.
  dmi.bios.version: RVMTL357.0044.2024.0809.0954
  dmi.board.name: NUC14RVB
  dmi.board.vendor: ASUSTeK COMPUTER INC.
  dmi.board.version: 60AS0080-MB2A01
  dmi.chassis.type: 35
  dmi.chassis.vendor: ASUSTeK COMPUTER INC.
  dmi.chassis.version: 2.0
  dmi.ec.firmware.release: 3.5
  dmi.modalias: dmi:bvnASUSTeKCOMPUTERINC.:bvrRVMTL357.0044.2024.0809.0954:bd08/09/2024:br5.32:efr3.5:svnASUSTeKCOMPUTERINC.:pnNUC14RVH-B:pvr90AR0072-M001P0:rvnASUSTeKCOMPUTERINC.:rnNUC14RVB:rvr60AS0080-MB2A01:cvnASUSTeKCOMPUTERINC.:ct35:cvr2.0:skuNUC14RVH-B:
  dmi.product.family: RV
  dmi.product.name: NUC14RVH-B
  dmi.product.sku: NUC14RVH-B
  dmi.product.version: 90AR0072-M001P0
  dmi.sys.vendor: ASUSTeK COMPUTER INC.
  modified.conffile..etc.init.d.apport: [modified]
  mtime.conffile..etc.init.d.apport: 2024-07-22T16:59:07

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2097618/+subscriptions
