On 30/12/2024 02:22, Eric Degenetais wrote:
3 - Now that I was reasonably sure the kernel was the culprit, I tried various 
versions using the snapshots repository:
   _ the last kernel to reboot successfully was 6.8.12+1
   _ the first kernel to misbehave was 6.9.7+1

As a result, I think that a regression between 6.8.12 and 6.9.7 causes the SSD 
or its controller to hang during the shutdown phase of a reboot,
so that on warm restart the UEFI firmware cannot detect it. The bad state seems 
purely volatile, however, since a complete shutdown followed by a restart works.
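
For anyone wanting to narrow down a range like this one, here is a minimal sketch of how a bisect between a known-good and a known-bad kernel can be started. The helper name start_kernel_bisect and the tree path are made up for illustration; only the git bisect subcommands themselves are real. Each step still needs a manual build, install and warm reboot, so this cannot be automated with git bisect run on the affected machine.

```shell
# Hypothetical helper: start a bisect in a kernel git tree.
# Usage: start_kernel_bisect <tree> <good-rev> <bad-rev>
start_kernel_bisect() {
    tree=$1
    good=$2
    bad=$3
    # "git bisect start <bad> <good>" marks both ends and checks out
    # the first revision to test.
    git -C "$tree" bisect start "$bad" "$good"
}

# Example (the tree path is a placeholder; the tags assume a linux-stable clone):
#   start_kernel_bisect ~/src/linux v6.8.12 v6.9.7
# After building and warm-rebooting each candidate kernel, report the result:
#   git -C ~/src/linux bisect good    # SSD was detected after the warm reboot
#   git -C ~/src/linux bisect bad     # SSD was missing after the warm reboot
# When git prints "<sha> is the first bad commit", clean up with:
#   git -C ~/src/linux bisect reset
```

With roughly a dozen build-and-reboot rounds this kind of search usually converges on a single commit.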

git bisect points to the commit below. I am no expert, but it does look related: maybe this combination of controller and drive is left in an unusable state because of the modified power management.
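
To check the power-management hypothesis, it may help to compare the SATA link power management policy under a good and a bad kernel. The sketch below reads the link_power_management_policy attribute that libata exposes for each host in sysfs. The helper name show_lpm_policies is made up, and the optional directory argument exists only so the function can be pointed at a copy of the tree.

```shell
# Hypothetical helper: print the link power management policy of every
# SCSI/ATA host. The sysfs root can be overridden, e.g. for testing.
show_lpm_policies() {
    root=${1:-/sys/class/scsi_host}
    for f in "$root"/host*/link_power_management_policy; do
        # Skip hosts without the attribute (and the unexpanded glob).
        [ -r "$f" ] || continue
        printf '%s: %s\n' "$(basename "$(dirname "$f")")" "$(cat "$f")"
    done
}

# Example:
#   show_lpm_policies
# As an experiment (as root, with host0 standing in for the affected port),
# the policy can be relaxed before a reboot:
#   echo max_performance > /sys/class/scsi_host/host0/link_power_management_policy
```

If the drive survives a warm reboot after the policy is set to max_performance, that would support the idea that the changed default policy is involved, though it would not prove it.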

Next step: I'll try to build a 6.12.6 kernel with 7627a0edef548c4c4dea62df51cc26bfe5bbcab8 reverted and see if it works.
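
A rough sketch of that revert-and-build step, assuming a git clone of the stable kernel tree that carries a v6.12.6 tag. The helper name prepare_revert_branch, the branch name revert-test and the tree path are placeholders. The revert may not apply cleanly on top of 6.12.6, in which case the conflicts have to be resolved by hand.

```shell
# Hypothetical helper: create a test branch at <base-rev> with <commit> reverted.
# Usage: prepare_revert_branch <tree> <base-rev> <commit> [branch-name]
prepare_revert_branch() {
    tree=$1
    base=$2
    commit=$3
    branch=${4:-revert-test}
    # Branch off the base revision, then record the revert as a new commit.
    git -C "$tree" checkout -q -b "$branch" "$base" &&
    git -C "$tree" revert --no-edit "$commit"
}

# Example for this case (the tree path is a placeholder):
#   prepare_revert_branch ~/src/linux v6.12.6 7627a0edef548c4c4dea62df51cc26bfe5bbcab8
# Then, inside the tree, with the running kernel's config copied to .config:
#   make olddefconfig
#   make -j"$(nproc)" bindeb-pkg
```

The bindeb-pkg target produces installable .deb packages, which keeps the test kernel easy to remove afterwards.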

---------------------------------------------------------------------------------------------------------

7627a0edef548c4c4dea62df51cc26bfe5bbcab8 is the first bad commit
commit 7627a0edef548c4c4dea62df51cc26bfe5bbcab8
Author: Mario Limonciello <mario.limoncie...@amd.com>
Date:   Tue Feb 6 22:13:46 2024 +0100

    ata: ahci: Drop low power policy board type

    The low power policy board type was introduced to allow systems
    to get into deep states reliably.  Before it was introduced `min_power`
    was causing problems for a number of drives.  New power policies
    `min_power_with_partial` and `med_power_with_dipm` have been introduced
    which provide a more stable baseline for systems.

    Tested-by: Damien Le Moal <dlem...@kernel.org>
    Tested-by: Jian-Hong Pan <j...@endlessos.org>
    Acked-by: Jian-Hong Pan <j...@endlessos.org>
    Acked-by: Christoph Hellwig <h...@lst.de>
    Reviewed-by: Damien Le Moal <dlem...@kernel.org>
    Reviewed-by: Mario Limonciello <mario.limoncie...@amd.com>
    Reviewed-by: Mika Westerberg <mika.westerb...@linux.intel.com>
    Suggested-by: Christoph Hellwig <h...@infradead.org>
    Signed-off-by: Mario Limonciello <mario.limoncie...@amd.com>
    [cassel: rebase patch and fix trivial conflicts]
    Signed-off-by: Niklas Cassel <cas...@kernel.org>

 drivers/ata/Kconfig |   5 +--
 drivers/ata/ahci.c  | 109 +++++++++++++++++++++++-----------------------------
 drivers/ata/ahci.h  |   9 ++---
 3 files changed, 53 insertions(+), 70 deletions(-)
