On Tue, Jul 2, 2024 at 5:17 AM Dmitrii Odintcov
<cyprussocial...@gmail.com> wrote:
>
> Hi,
>
> Just had it happen to me again - on boot this time - with the
> recommended `nvme_core.default_ps_max_latency_us=0 pcie_aspm=off`.
>
> Worth noting that it's only happening with one of two SSDs I have
> installed - the other being Samsung SSD 980 PRO 2TB - and that they've
> been working fine for several years until this started happening
> sometime last month.

One thing I noticed: when I boot without any special kernel parameters,
I see the errors below. I've read that this kind of error can be caused
by a buggy BIOS:

$ lspci | grep -i 1b.4
00:1b.4 PCI bridge: Intel Corporation Alder Lake-S PCH PCI Express Root Port (rev 11)

Jul 2 14:09:00 x kernel: [ 1213.547609] pcieport 0000:00:1b.4: AER: Multiple Corrected error message received from 0000:00:1b.4
Jul 2 14:09:00 x kernel: [ 1213.547843] pcieport 0000:00:1b.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Jul 2 14:09:00 x kernel: [ 1213.548011] pcieport 0000:00:1b.4: device [8086:7ac4] error status/mask=00000040/00002000
Jul 2 14:09:00 x kernel: [ 1213.548182] pcieport 0000:00:1b.4: [ 6] BadTLP
Jul 2 14:09:43 x kernel: [ 1256.906830] pcieport 0000:00:1b.4: AER: Multiple Corrected error message received from 0000:00:1b.4
Jul 2 14:09:43 x kernel: [ 1256.907169] pcieport 0000:00:1b.4: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
Jul 2 14:09:43 x kernel: [ 1256.907395] pcieport 0000:00:1b.4: device [8086:7ac4] error status/mask=00000040/00002000
Jul 2 14:09:43 x kernel: [ 1256.907648] pcieport 0000:00:1b.4: [ 6] BadTLP

I am now testing with pci=nommconf. It has been 20 hours so far with no
crash and none of the Corrected errors shown above, but I need to give
it more time.

I am curious whether you see any improvement on your side when testing
with only pci=nommconf?

Justin
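
P.S. To check whether the Corrected errors come back during a long test
run, a small script could tally them per device. What follows is only a
rough sketch, not a tested tool: the function name `count_aer_errors`
and the regular expression are my own assumptions, written against the
log lines quoted in this message, and other kernels may format these
lines differently.

```python
import re
from collections import Counter

# Matches the AER detail line that names the error bit, e.g.
#   "pcieport 0000:00:1b.4: [ 6] BadTLP"
# The pattern is an assumption based on the log excerpt in this thread.
AER_DETAIL = re.compile(
    r"pcieport (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):"
    r"\s+\[\s*(?P<bit>\d+)\]\s+(?P<name>\w+)"
)


def count_aer_errors(lines):
    """Return a Counter keyed by (device, error name) for AER detail lines."""
    counts = Counter()
    for line in lines:
        match = AER_DETAIL.search(line)
        if match:
            counts[(match.group("dev"), match.group("name"))] += 1
    return counts


# Two lines copied from the kernel log quoted in this message.
sample = [
    "Jul 2 14:09:00 x kernel: [ 1213.548182] pcieport 0000:00:1b.4: [ 6] BadTLP",
    "Jul 2 14:09:43 x kernel: [ 1256.907648] pcieport 0000:00:1b.4: [ 6] BadTLP",
]

for (dev, name), n in sorted(count_aer_errors(sample).items()):
    print(f"{dev} {name}: {n}")
```

To run it on real logs, pass an open file or `sys.stdin` to
`count_aer_errors` instead of `sample`, for example with the output of
`dmesg` or `journalctl -k` piped in.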