After an attempt to merge the work needed for LP #1816743 into this bug, we decided to split the bugs and handle only the 'reset_devices' addition here.

Cheers,

Guilherme

** Summary changed:

- Make reset_devices parameter default for kdump and decouple kdump systemd service from the KDUMP_CMDLINE_APPEND
+ Make reset_devices parameter default for kdump

** Description changed:

  [Impact]

  * Kdump does not configure by default the crash kernel to perform a
- device reset by default, by passing the "reset_devices" parameter. Also,
- the systemd service "kdump-tools-dump" is tightly-coupled with
- KDUMP_CMDLINE_APPEND and it shouldn't, to prevent user confusion.
+ device reset by default, by passing the "reset_devices" parameter.

  * Kernel has the "reset_devices" parameter that drivers can opt-in, and
  perform special activity in case this parameter is parsed from
  command-line. For example, in kdump kernels it hints the drivers that
  they are booting from a non-healthy condition and needs to issue some
  form of reset to the adapter, like clearing DMA mapping in their
  firmware for
- example. Users currently (kernel v5.2) are: aacraid, hpsa, ipr,
+ example. Users currently (kernel v5.5-rc2) are: aacraid, hpsa, ipr,
  megaraid_sas, mpt3sas, smartpqi, xenbus. This should be enabled by
  default in the kdump config file to be added in the kdump kernel
  command-line for all versions.

- * The systemd service"kdump-tools-dump" is responsible for triggering
- the execution of the makedumpfile tool ultimately. Kdump from Xenial+
- releases rely on systemd as their init system, so this service is the
- way to trigger the kdump mechanism. Currently it is configured as any
- other parameter in KDUMP_CMDLINE_APPEND, meaning if user decides to
- change the line they need to remember adding the systemd service back.
- It's not really a parameter that should be easily manipulated in kdump
- line, since there's no use for it except to instruct systemd to load
- kdump; the only
- reasonable case for removing it is to debug kdump itself.
-
-
  [Test Case]
- 1) Deploy a Disco VM e.g. with uvt-kvm
+ 1) Deploy a Bionic VM e.g. with uvt-kvm
  2) Install the kdump-tools package
  3) Run `kdump-config test`and check for the 'reset_devices' parameter:

  $ kdump-config test
  ...
  kexec command to be used:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" /var/lib/kdump/vmlinuz
-
- Also, by changing the KDUMP_CMDLINE_APPEND we can see "systemd.unit
- =kdump-tools.service" to be removed.

  [Regression Potential]

  The regression potential is low, since it doesn't need any changes in
  makedumpfile code and we're only adding a parameter on the crash kernel
  command-line. The risks are related with bad behavior with the kernel
  when using "reset_devices", like if the driver has bugs in this path.
  It's considered safer to have the option (and this way prevent problems
  for booting a unhealthy kernel with potential stuck DMAs in the
  devices) than not having it.
-
- Regarding the other change, about the systemd service, it'll only affect
- users the are debugging kdump itself and it has no known regression
- potential.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1800566

Title:
  Make reset_devices parameter default for kdump

Status in makedumpfile package in Ubuntu:
  In Progress
Status in makedumpfile source package in Trusty:
  Won't Fix
Status in makedumpfile source package in Xenial:
  In Progress
Status in makedumpfile source package in Bionic:
  In Progress
Status in makedumpfile source package in Cosmic:
  Won't Fix
Status in makedumpfile source package in Disco:
  In Progress
Status in makedumpfile source package in Eoan:
  In Progress
Status in makedumpfile source package in Focal:
  In Progress

Bug description:
  [Impact]

  * By default, kdump does not configure the crash kernel to perform a
  device reset, i.e. it does not pass the "reset_devices" parameter.

  * The kernel has the "reset_devices" parameter, which drivers can opt
  in to in order to perform special activity when this parameter is
  parsed from the command line. For example, in kdump kernels it hints
  to the drivers that they are booting from a non-healthy condition and
  need to issue some form of reset to the adapter, such as clearing DMA
  mappings in their firmware. Current users (as of kernel v5.5-rc2) are:
  aacraid, hpsa, ipr, megaraid_sas, mpt3sas, smartpqi, xenbus. The
  parameter should be enabled by default in the kdump config file, so
  that it is added to the kdump kernel command line for all versions.

  [Test Case]

  1) Deploy a Bionic VM, e.g. with uvt-kvm
  2) Install the kdump-tools package
  3) Run `kdump-config test` and check for the 'reset_devices' parameter:

  $ kdump-config test
  ...
  kexec command to be used:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 systemd.unit=kdump-tools.service irqpoll nousb ata_piix.prefer_ms_hyperv=0" /var/lib/kdump/vmlinuz

  [Regression Potential]

  The regression potential is low, since no changes to the makedumpfile
  code are needed and we're only adding a parameter to the crash kernel
  command line. The risks are related to bad kernel behavior when using
  "reset_devices", for example if a driver has bugs in this path. It's
  considered safer to have the option (and this way prevent problems
  when booting an unhealthy kernel with potential stuck DMAs in the
  devices) than not to have it.
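
The check in step 3 of the test case could be automated along these lines. This is only a minimal sketch: the helper names `kexec_cmdline_params` and `has_reset_devices` are made up for illustration, and it assumes that `kdump-config test` prints the kexec invocation with a double-quoted `--command-line="..."` argument, as in the output quoted in the test case. The sample string is that quoted output, which does not yet carry the parameter.

```python
import re

# Sample copied from the `kdump-config test` output quoted in the test case.
SAMPLE = (
    "kexec command to be used:\n"
    '/sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-4.15.0-45-generic '
    "root=LABEL=cloudimg-rootfs ro console=tty1 console=ttyS0 nr_cpus=1 "
    "systemd.unit=kdump-tools.service irqpoll nousb "
    'ata_piix.prefer_ms_hyperv=0" /var/lib/kdump/vmlinuz\n'
)


def kexec_cmdline_params(kdump_test_output):
    """Return the parameters found inside --command-line="..." as a list.

    Hypothetical helper: returns an empty list when no such argument
    is present in the given text.
    """
    match = re.search(r'--command-line="([^"]*)"', kdump_test_output)
    if match is None:
        return []
    return match.group(1).split()


def has_reset_devices(kdump_test_output):
    """True if 'reset_devices' is a standalone crash kernel parameter."""
    return "reset_devices" in kexec_cmdline_params(kdump_test_output)


if __name__ == "__main__":
    # On a real system the text would come from running `kdump-config test`;
    # here the quoted sample is used so the sketch runs anywhere.
    print("reset_devices present:", has_reset_devices(SAMPLE))
```

Matching on whole whitespace-separated tokens, rather than a plain substring search, avoids a false positive if some other parameter merely contains the text "reset_devices".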