** Also affects: crash (Ubuntu Noble)
Importance: Undecided
Status: New
** Also affects: makedumpfile (Ubuntu Noble)
Importance: Undecided
Status: New
** Also affects: crash (Ubuntu Plucky)
Importance: Undecided
Status: New
** Also affects: makedumpfile (Ubuntu Plucky)
Importance: Undecided
Status: New
** Also affects: crash (Ubuntu Questing)
Importance: Undecided
Status: New
** Also affects: makedumpfile (Ubuntu Questing)
Importance: Undecided
Status: New
** Also affects: crash (Ubuntu Resolute)
Importance: Undecided
Status: Confirmed
** Also affects: makedumpfile (Ubuntu Resolute)
Importance: Undecided
Status: Confirmed
** Changed in: makedumpfile (Ubuntu Plucky)
Status: New => Fix Released
** Changed in: makedumpfile (Ubuntu Questing)
Status: New => Fix Released
** Changed in: makedumpfile (Ubuntu Resolute)
Status: Confirmed => Fix Released
--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to crash in Ubuntu.
https://bugs.launchpad.net/bugs/2125145
Title:
[WIP] [SRU] Makedumpfile: Errors and Page Exclusions When Opening
Kernel Crashdump Files Generated on the Latest HWE Kernel
Status in crash package in Ubuntu:
Confirmed
Status in makedumpfile package in Ubuntu:
Fix Released
Status in crash source package in Noble:
New
Status in makedumpfile source package in Noble:
New
Status in crash source package in Plucky:
New
Status in makedumpfile source package in Plucky:
Fix Released
Status in crash source package in Questing:
New
Status in makedumpfile source package in Questing:
Fix Released
Status in crash source package in Resolute:
Confirmed
Status in makedumpfile source package in Resolute:
Fix Released
Bug description:
Note: Original description is at the bottom of this report
[Impact]
The current versions of Makedumpfile and Crash in the -updates pocket
on Noble do not support the latest hardware enablement kernel for that
platform, which is 6.14. There are several architecture-dependent and
kernel flavor-dependent behaviours that I will outline below, but the
steps to reproduce are the same.
Reproducer steps:
-----------------
Boot into a hardware enablement kernel. For example, on arm64 use the
6.14.0-1008-nvidia-64k kernel:
KERNEL_VERSION=6.14.0-1008-nvidia-64k
DISTRO=noble
sudo apt update
sudo apt install ubuntu-dbgsym-keyring
echo "deb http://ddebs.ubuntu.com ${DISTRO} main restricted universe multiverse
deb http://ddebs.ubuntu.com ${DISTRO}-updates main restricted universe multiverse" | \
sudo tee /etc/apt/sources.list.d/ddebs.list
sudo apt update
sudo apt install linux-image-${KERNEL_VERSION}
sudo apt install linux-image-unsigned-${KERNEL_VERSION}-dbgsym
Modify grub's cmdline to specify a crashkernel:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash crashkernel=512M" # Or similar
sudo update-grub
sudo apt install kexec-tools kdump-tools crash makedumpfile
sudo systemctl enable kdump-tools
sudo systemctl start kdump-tools
sudo reboot
echo c | sudo tee /proc/sysrq-trigger
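Before triggering the panic, it can help to confirm that the crashkernel
reservation took effect and that a crash kernel is loaded, since otherwise
no dump will be written. A minimal sketch, assuming the standard
/proc/cmdline and /sys/kernel/kexec_crash_loaded interfaces; the helper
name has_crashkernel is illustrative and not part of any package:

```shell
# Hypothetical helper: succeeds if the given kernel command line
# contains a crashkernel= parameter.
has_crashkernel() {
    case " $1 " in
        *" crashkernel="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only inspect the live system when the kexec sysfs interface exists.
if [ -r /sys/kernel/kexec_crash_loaded ]; then
    if has_crashkernel "$(cat /proc/cmdline)"; then
        echo "crashkernel= is present on the kernel command line"
    else
        echo "crashkernel= is missing; check the grub configuration"
    fi
    # 1 means a crash kernel is loaded and kdump can capture a dump.
    echo "kexec_crash_loaded: $(cat /sys/kernel/kexec_crash_loaded)"
fi
```

On Ubuntu, "kdump-config show" from kdump-tools reports similar state and
may be a more convenient check.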
Results on Arm64
----------------
After the machine recovers, open the dump with crash:
crash /usr/lib/debug/boot/vmlinux-6.14.0-1008-nvidia-64k
/var/crash/<dump-dir>/<dump-file>
crash 8.0.4
Copyright (C) 2002-2022 Red Hat, Inc.
...
For help, type "help".
Type "apropos word" to search for commands related to "word"...
please wait... (gathering task table data)
crash: page excluded: kernel virtual address: ffff07ffa042d8e0 type:
"xa_node.slots[off]"
Results on amd64
----------------
On an amd64 machine, using a kernel such as
linux-image-6.14.0-29-generic results in crash failing to open. No error
is printed, but we don't obtain the prompt:
crash /usr/lib/debug/boot/vmlinux-6.14.0-29-generic
/var/crash/202509112049/dump.202509112049
crash 8.0.4
...
For help, type "help".
Type "apropos word" to search for commands related to "word"...
# Program exits and no prompt is presented
[Investigation]
We have identified that at least two Makedumpfile commits are needed:
[1]
https://github.com/makedumpfile/makedumpfile/commit/985e575253f1c2de8d6876cfe685c68a24ee06e1
[2]
https://github.com/makedumpfile/makedumpfile/commit/bad2a7c4fa75d37a41578441468584963028bdda
These patches compensate for a change in the kernel's mapping of
memory. Using the patched Makedumpfile helps, but it is not
sufficient. Generating the dump with the patched Makedumpfile (or with
the tip of upstream master) and then opening it with the currently
distributed crash results in the following errors:
eg. Patched Makedumpfile with crash 8.0.4 on Arm64:
---------------------------------------------------
...
WARNING: cannot determine starting stack frame for task ffffd574e21b4800
WARNING: cannot determine starting stack frame for task
ffff07ff83296300
WARNING: cannot determine starting stack frame for task
ffff07ff83293f80
WARNING: cannot determine starting stack frame for task
ffff07ff83a04700
WARNING: cannot determine starting stack frame for task ffff08010507c400
KERNEL: /usr/lib/debug/boot/vmlinux-6.14.0-1008-nvidia-64k
DUMPFILE: /var/crash/patched_mdf/dump.202509191531 [PARTIAL DUMP]
CPUS: 128 [OFFLINE: 127]
DATE: Thu Jan 1 00:00:00 UTC 1970
UPTIME: 00:13:38
LOAD AVERAGE: 0.12, 0.16, 0.10
TASKS: 1573
NODENAME: penguru
RELEASE: 6.14.0-1008-nvidia-64k
VERSION: #8-Ubuntu SMP PREEMPT_DYNAMIC Sat Jul 26 02:43:53 UTC 2025
MACHINE: aarch64 (unknown Mhz)
MEMORY: 63.8 GB
PANIC: "Kernel panic - not syncing: sysrq triggered crash"
PID: 7886
COMMAND: "tee"
TASK: ffff08010507c400 [THREAD_INFO: ffff08010507c400]
CPU: 85
STATE: TASK_RUNNING (PANIC)
On Amd64
--------
Crash still fails to open.
Therefore, in addition to the above Makedumpfile commits, crash
requires some patching. With the above two Makedumpfile commits
applied, I bisected crash on amd64 and arm64.
On amd64, I have identified that cherry-picking [3] in isolation is
sufficient:
[3]
https://github.com/crash-utility/crash/commit/6752571d8d782d07537a258a1ec8919ebd1308ad
I have also found that cherry-picking [4] and [5] resolves the issue on arm64
hardware in testflinger (using the machine agent penguru)
[4]
https://github.com/crash-utility/crash/commit/3879e9104826d5ae14a0824ec47ab60056a249a7
[5]
https://github.com/crash-utility/crash/commit/968debd0d5979dd9ddca3af0766bad714dbd51e3
At this point, crash's commands such as mount, files, vm, etc. were
still broken. To resolve this, [6] and [7] are needed
[6]
https://github.com/crash-utility/crash/commit/3d60d9d40457239683a5f20b01437db94f964fb8
[7]
https://github.com/crash-utility/crash/commit/2795136a515446b798ebbfa257c97f0ca6ecb8ec
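The crash cherry-picks identified above could be applied to a local
checkout of the upstream crash tree roughly as follows. This is a sketch
only: the commit hashes are the ones referenced as [3] to [7], but the
order shown, and whether they apply cleanly on top of 8.0.4, are
assumptions that have not been verified here. The helper name
print_cherry_picks is illustrative.

```shell
# Commit hashes [3]-[7] from the investigation above.
CRASH_COMMITS="
6752571d8d782d07537a258a1ec8919ebd1308ad
3879e9104826d5ae14a0824ec47ab60056a249a7
968debd0d5979dd9ddca3af0766bad714dbd51e3
3d60d9d40457239683a5f20b01437db94f964fb8
2795136a515446b798ebbfa257c97f0ca6ecb8ec
"

# Hypothetical helper: print the git commands instead of running them,
# so the list can be reviewed before touching a working tree.
print_cherry_picks() {
    for commit in $CRASH_COMMITS; do
        echo "git cherry-pick -x $commit"
    done
}

print_cherry_picks
```

Printing rather than executing keeps the step reviewable; the output can
be run by hand inside a crash checkout once the order has been confirmed.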
[Test Plan]
* Ensure that the proposed combination of makedumpfile and crash is
capable of generating and subsequently opening crashdumps on the
latest HWE kernels as well as the GA kernels on arm64 and amd64 (at
the time of writing: 6.14 and 6.18, respectively).
* Ensure all of crash's commands produce the expected output (eg. ps,
mount, files, vm, vtop, etc.)
* If bugs are found in generating and reading crashdumps on the HWE
kernel on other architectures (s390x, etc.), this test plan can be
expanded to include those.
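The command checks in this test plan could be partly automated by
feeding crash a command file. A minimal sketch, assuming crash's -i
(input file) option and the debug vmlinux location used in the
reproducer; the helper name write_crash_cmds, the DUMP variable, and the
command list are illustrative, and the output still needs to be reviewed
by hand. vtop is left out because it needs an address argument.

```shell
# Hypothetical helper: write a crash input file that runs a few
# commands from the test plan and then exits.
write_crash_cmds() {
    printf '%s\n' ps mount files vm exit > "$1"
}

CMDS=$(mktemp)
write_crash_cmds "$CMDS"

# Placeholder paths; set DUMP to the real dump file before running.
VMLINUX=/usr/lib/debug/boot/vmlinux-$(uname -r)
DUMP=${DUMP:-}

if command -v crash >/dev/null 2>&1 && [ -r "$VMLINUX" ] && [ -n "$DUMP" ]; then
    # A non-zero exit status matches the amd64 symptom in this report.
    crash -i "$CMDS" "$VMLINUX" "$DUMP" || echo "crash exited with status $?"
fi
```

Checking the exit status matters here because the amd64 failure prints
no error and only shows up as a missing prompt and exit code 1.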
[Where Problems Could Occur]
* Crash and Makedumpfile are designed to be backwards-compatible, so the risk
of regression is low, though not zero. It will therefore be important to
ensure that the proposed combination of Makedumpfile and crash does not break
existing environments, for example the GA kernel.
* The matrix of hardware and kernel versions (including derivative /
cloud kernels) to test against is extensive. It's possible that the
commits identified to solve the known problems will not be
comprehensive. For example, a different cpu architecture with a
different kernel may require additional commits to be backported.
[Other Info]
* Support/SEG are currently having conversations with the kernel team
about the potential to proactively SRU / MRE the latest upstream crash
version, and potentially Makedumpfile as well, alongside -hwe kernel
releases to avoid this sort of regression in the future. We
understand, though, that this would require an SRUExceptionPolicy to
be approved and published.
Original Description:
=====================
24.04 LTS,
Linux 6.14.0-29-generic #29~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Aug 14
16:52:50 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Problem Description:
The crash utility exits with error code 1 when attempting to analyze
kernel crash dumps.
Set up kdump and generated a kernel panic using "echo 1 >
/proc/sys/kernel/sysrq", but crash cannot access it:
# crash /usr/lib/debug/boot/vmlinux-6.14.0-29-generic
dump.202509161821
crash 8.0.4
Copyright (C) 2002-2022 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2022 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.
GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
# echo $?
1
running as root user and file is readable fine:
$ :/var/crash/202509161821# ls -l
total 299144
-rw------- 1 root whoopsie 119627 Sep 16 18:21 dmesg.202509161821
-rw-r--r-- 1 root whoopsie 306200163 Sep 16 18:21 dump.202509161821
symbol file is there:
# ls -l /usr/lib/debug/boot/vmlinux-6.14.0-29-generic*
-rw-r--r-- 1 root root 450705920 Aug 14 18:02
/usr/lib/debug/boot/vmlinux-6.14.0-29-generic
tail of strace:
14:06:20.661240 rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[],
sa_flags=SA_RESTORER|SA_NODEFER, sa_restorer=0x7b0841845330}, NULL, 8) = 0
<0.000008>
14:06:20.661281 rt_sigaction(SIGINT, {sa_handler=0x5ec383cbceb0, sa_mask=[],
sa_flags=SA_RESTORER|SA_NODEFER, sa_restorer=0x7b0841845330}, NULL, 8) = 0
<0.000008>
14:06:20.661322 rt_sigaction(SIGSEGV, {sa_handler=SIG_DFL, sa_mask=[],
sa_flags=SA_RESTORER|SA_NODEFER, sa_restorer=0x7b0841845330}, NULL, 8) = 0
<0.000008>
14:06:20.661360 write(1, "\n", 1
) = 1 <0.000119>
14:06:20.661579 lseek(3, 10312, SEEK_SET) = 10312 <0.000010>
14:06:20.661617 read(3, "OSRELEASE=6.14.0-29-generic\nBUIL"..., 3276) = 3276
<0.000011>
14:06:20.661748 unlink("/var/tmp/ramdump_elf_XXXXXX") = -1 ENOENT (No such
file or directory) <0.002921>
14:06:20.664817 exit_group(1) = ?
14:06:20.690105 +++ exited with 1 +++
full crash strace https://filebin.net/custom-bin/crash.strace.1
ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: crash 8.0.4-1ubuntu2
ProcVersionSignature: Ubuntu 6.14.0-29.29~24.04.1-generic 6.14.8
Uname: Linux 6.14.0-29-generic x86_64
ApportVersion: 2.28.1-0ubuntu3.8
Architecture: amd64
CasperMD5CheckResult: pass
Date: Thu Sep 18 20:21:26 2025
InstallationDate: Installed on 2025-09-04 (14 days ago)
InstallationMedia: Ubuntu 24.04.2 LTS "Noble Numbat" - Release amd64
(20250215)
ProcEnviron:
LANG=en_US.UTF-8
PATH=(custom, no user)
SHELL=/bin/bash
TERM=xterm-256color
SourcePackage: crash
UpgradeStatus: No upgrade log present (probably fresh install)
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/crash/+bug/2125145/+subscriptions