On 5/13/25 21:51, Mario Limonciello wrote:
> On 5/13/2025 2:45 PM, Bjorn Helgaas wrote:
>>  From Denis's report at https://bugzilla.kernel.org/show_bug.cgi?id=220110:
>>
>>> I am having problems with my laptop that has a thunderbolt
>>> controller to which I connected an AMD 6750XT.
>>>
>>> The topology of my system is described in this bug:
>>> https://gitlab.freedesktop.org/drm/amd/-/issues/4014 yet I don't
>>> know if this is related or not.
>>>
>>> I experienced PC attempting to enter s2idle while playing a YT
>>> video; PC has become totally unresponsive to input in any
>>> keyboard/mouse and power button after turning off screens attached
>>> to the AMD card (the built-in screen was off already).
>>>
>>>  From a look at the logs it appears one uncorrectable AER pci error
>>> triggered a pci root reset, and that comes with a bug where the
>>> usage counter assumes a wrong value; this in turn seems to cause all
>>> sorts of weird bugs.
>>>
>>> That however is my interpretation of the attached log, that might be
>>> very wrong.
>>>
>>> This is the first time I experience this bug in a year with this
>>> laptop and I don't know how easy it is to reproduce.
>>>
>>> The kernel has been compiled from sources and it has
>>>
>>>    [PATCH v2] PCI: Explicitly put devices into D0 when initializing
>>>    [PATCH v4] PCI/PM: Put devices to low power state on shutdown
>>>
>>> as I am helping to test things. I find it unlikely that any of those might
>>> cause these issues especially "PCI: Explicitly put devices into D0
>>> when initializing" that has been there for a few weeks now.
>>>
>>> Thanks in advance to whoever will help me.
>
> From the logs the system didn't actually enter s2idle, but because of the 
> failure to recover after AER he lost the external GPU.
>
> I don't expect that "PCI/PM: Put devices to low power state on shutdown" has 
> anything to do with this issue.  This should only affect system shutdown.  
> (Tangentially related comment; we have another version of this on the 
> linux-pm list now that is more generic [1]).
>
> How readily can this be reproduced?  Can you try to reproduce once more?
> Can this reproduce on an unpatched kernel?
>
I have tried many different unpatched and patched 6.14.6 kernels for a few
hours and I could not trigger this same bug again.

After failing to reproduce it with the kernel I have been running, I decided
to test the newest "PM: Use hibernate flows for system power off" patch [1].

That patch seems to help my laptop power off quickly when combined with the
other mentioned patch.

> To confirm if "PCI: Explicitly put devices into D0 when initializing" is the 
> cause can you compare the PCI state of all devices from sysfs with and 
> without the patch in place after bootup?  Basically run this in patched 
> kernel and unpatched kernel and let's compare.
>
> $ grep -v foo /sys/bus/pci/devices/*/power_state
>
>
unpatched: https://pastebin.com/Ym31Vjh6
patched with just "PCI: Explicitly put devices into D0 when initializing": 
https://pastebin.com/SSSWLgcs

diff for easy view: https://www.diffchecker.com/y5GVyEG1/


Two devices were in D3hot and two were unknown; with the patch, all four are
now reported as D0.
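
For anyone who wants to repeat this comparison without pastebin and an
online diff, here is a minimal sketch of how it could be scripted. It reads
the same sysfs power_state attribute that the grep above prints. The function
names and the "absent" placeholder are my own choices for illustration and
are not part of any kernel tooling.

```python
from pathlib import Path

# Default sysfs location of PCI devices on Linux.
SYSFS_PCI = Path("/sys/bus/pci/devices")


def read_power_states(root=SYSFS_PCI):
    """Return {device address: power state} for every device under root."""
    states = {}
    for dev in sorted(Path(root).iterdir()):
        try:
            states[dev.name] = (dev / "power_state").read_text().strip()
        except OSError:
            # Attribute missing or unreadable: skip this device.
            continue
    return states


def diff_states(before, after):
    """Return {address: (old, new)} for devices whose state differs.

    A device present in only one snapshot is reported with the
    hypothetical placeholder "absent" on the other side.
    """
    changed = {}
    for addr in sorted(set(before) | set(after)):
        old = before.get(addr, "absent")
        new = after.get(addr, "absent")
        if old != new:
            changed[addr] = (old, new)
    return changed
```

One way to use it: call read_power_states() after booting each kernel, save
the result (for example with json.dump), and pass the two saved dictionaries
to diff_states() to list only the devices whose state changed.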


Having those two patches together does not seem to cause any harm and I could 
not reproduce the issue.

I do not believe any of those patches caused the particular crash I
experienced. However, I do believe something is wrong, because at power on
the amdgpu on the thunderbolt card is sometimes there and sometimes not, and
I have to unplug and replug it for it to work.

The only patch that alleviates this particular problem is [2] "[PATCH v3] PCI: 
Prevent power state transition of erroneous device" but it comes with a 
regression where I can no longer wake up the laptop properly.

I will describe this in detail in a reply to that patch, since it is not part
of the subject here.

[2] 
https://lore.kernel.org/linux-pci/[email protected]/T/#m90fb151a4ab4af5ec8c667a27eb98bf43a9942dc

> [1] 
> https://lore.kernel.org/linux-pm/[email protected]/T/#u
