Control: tag -1 moreinfo

You wrote:
> I've encountered an issue booting up Debian Testing with the latest
> kernel. The system reports something about kernel panic caused by
> aacraid module, then hangs completely during fsck step. Information
> about the installed system is in the attached .txt file. Note that
> hardware is different, because the SSD with Debian Testing has been
> extracted from the server for investigation.
>
> Using the kernel parameters "pci=nocrs single" I was able to boot into
> emergency mode and see the error messages related to kernel panic.
> Photos attached.
>
> Pulling the RAID controller out of the PCI-E slot restores normal boot.
[...]

There are (at least) 2 bugs here:

1. Something is preventing aacraid and ehci-hcd from allocating DMA
   buffers:

   "ehci-pci 0000:00:1a.0: init 0000:00:1a.0 fail, -12"
   "aacraid: unable to create mapping."

2. aacraid then double-frees a chunk of memory while handling the
   failure, causing the panic:

   "kernel BUG at mm/slub.c:448!"

I'm attaching a patch which should fix bug #2, which may help to get
more information about bug #1.

In principle you should be able to test this by following the
instructions at
<https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#id-1.6.6.4>
but currently the test-patches script has not been updated along with
the package and will take a lot more time and space than it should.

So instead I would suggest building a custom kernel based on the Debian
configuration, following the instructions at
<https://kernel-team.pages.debian.net/kernel-handbook/ch-common-tasks.html#s-common-building>
and applying this patch before you run "make clean".

Let us know if you have any difficulty with this. If the system is
able to boot with a patched kernel, please send the full kernel log.

Ben.

-- 
Ben Hutchings
Knowledge is power. France is bacon.

From 6ef2851f75411b379868119e693ce63440dde869 Mon Sep 17 00:00:00 2001
From: Ben Hutchings <b...@debian.org>
Date: Wed, 10 Jul 2024 18:41:07 +0200
Subject: [PATCH] aacraid: Fix double-free on probe failure

aac_probe_one() calls hardware-specific init functions through the
aac_driver_ident::init pointer, all of which eventually call down to
aac_init_adapter().

If aac_init_adapter() fails after allocating memory for
aac_dev::queues, it frees the memory but does not clear that member.

After the hardware-specific init function returns an error,
aac_probe_one() goes down an error path that frees the memory pointed
to by aac_dev::queues, resulting in a double-free.

Reported-by: Michael Gordon <m.gordon.zelenobor...@gmail.com>
References: https://bugs.debian.org/1075855
Fixes: 8e0c5ebde82b ("[SCSI] aacraid: Newer adapter communication iterface support")
Signed-off-by: Ben Hutchings <b...@debian.org>
---
 drivers/scsi/aacraid/comminit.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/aacraid/comminit.c b/drivers/scsi/aacraid/comminit.c
index bd99c5492b7d..0f64b0244303 100644
--- a/drivers/scsi/aacraid/comminit.c
+++ b/drivers/scsi/aacraid/comminit.c
@@ -642,6 +642,7 @@ struct aac_dev *aac_init_adapter(struct aac_dev *dev)

 	if (aac_comm_init(dev)<0){
 		kfree(dev->queues);
+		dev->queues = NULL;
 		return NULL;
 	}
 	/*
@@ -649,6 +650,7 @@ struct aac_dev *aac_init_adapter(struct aac_dev *dev)
 	 */
 	if (aac_fib_setup(dev) < 0) {
 		kfree(dev->queues);
+		dev->queues = NULL;
 		return NULL;
 	}

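
For anyone wanting to see why clearing the pointer is sufficient, here
is a small userspace sketch of the pattern. It is not the driver code:
fake_dev, fake_alloc(), fake_free(), fake_init_adapter() and
fake_probe() are all hypothetical stand-ins, and the "allocator" is just
a static buffer with an in-use flag so that the double-free can be
counted rather than actually performed. The only property borrowed from
the kernel is that kfree(NULL) is a no-op.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct aac_dev; only the member of interest. */
struct fake_dev {
	void *queues;
};

/* Toy allocator: one static buffer plus an in-use flag (assumption,
 * used so a second free can be detected without undefined behaviour). */
static char pool[64];
static int pool_in_use;
static int double_frees;

static void *fake_alloc(void)
{
	pool_in_use = 1;
	return pool;
}

/* Like kfree(): freeing NULL does nothing. A second free of the same
 * buffer is counted instead of corrupting anything. */
static void fake_free(void *p)
{
	if (!p)
		return;
	if (!pool_in_use) {
		double_frees++;
		return;
	}
	pool_in_use = 0;
}

/* Models aac_init_adapter() failing after queues has been allocated.
 * clear_pointer selects the patched (1) or unpatched (0) behaviour. */
static int fake_init_adapter(struct fake_dev *dev, int clear_pointer)
{
	dev->queues = fake_alloc();
	/* pretend a later init step failed here */
	fake_free(dev->queues);
	if (clear_pointer)
		dev->queues = NULL;
	return -1;
}

/* Models the aac_probe_one() error path, which frees queues again.
 * Returns the number of double-frees that were detected. */
static int fake_probe(int clear_pointer)
{
	struct fake_dev dev = { NULL };

	double_frees = 0;
	if (fake_init_adapter(&dev, clear_pointer) < 0)
		fake_free(dev.queues);
	return double_frees;
}
```

Under these assumptions, fake_probe(0) models the unpatched behaviour,
where the stale pointer reaches the second free, and fake_probe(1)
models the patched behaviour, where the second free sees NULL and does
nothing.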