There are two risks with that plan that we should overcome. One is testing: such updates should not cause regressions. As of right now, the little testing that makedumpfile receives is not sufficient and gives a lot of false negatives. We should be testing that new kernels are still dumpable (and fixing either the kernel or makedumpfile when they are not), and that new makedumpfile versions do not break dumping of all the supported kernel versions (which, in my opinion, is a little harder and puts some burden on makedumpfile updates). Users do run outdated kernels and expect dumps when they crash, so this is a bit of a challenge. We do not need to be perfect and test all kernels in all scenarios, but we definitely need to do better.

The second one is kernel support. It's not unusual for us to release an Ubuntu version with a makedumpfile that cannot dump the GA kernel. So, even without considering HWE kernels, an LTS release may need a newer makedumpfile. One of the reasons is that, because we don't test as we upload new kernels to the development series, we don't realize makedumpfile needs additional support for that new kernel. Sometimes, just having the latest released makedumpfile is sufficient. But too often upstream makedumpfile is only able to catch up with the latest kernel releases after a while.

Cascardo.

--
You received this bug notification because you are a member of Kernel Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1970672

Title:
  makedumpfile falls back to cp with "__vtop4_x86_64: Can't get a valid pmd_pte."

Status in makedumpfile package in Ubuntu:
  New

Bug description:
  [Impact]

   * On Focal with an HWE (>=5.12) kernel, makedumpfile can sometimes fail with "__vtop4_x86_64: Can't get a valid pmd_pte."

   * makedumpfile falls back to cp for the dump, resulting in extremely large vmcores. This can impact both collection and analysis due to lack of space for the resulting vmcore.

   * This is fixed by an upstream commit present in versions 1.7.0 and 1.7.1:

     https://github.com/makedumpfile/makedumpfile/commit/646456862df8926ba10dd7330abf3bf0f887e1b6

     commit 646456862df8926ba10dd7330abf3bf0f887e1b6
     Author: Kazuhito Hagio <k-hagio...@nec.com>
     Date: Wed May 26 14:31:26 2021 +0900

         [PATCH] Increase SECTION_MAP_LAST_BIT to 5

         * Required for kernel 5.12

         Kernel commit 1f90a3477df3 ("mm: teach pfn_to_online_page() about
         ZONE_DEVICE section collisions") added a section flag
         (SECTION_TAINT_ZONE_DEVICE) and causes makedumpfile an error on
         some machines like this:

           __vtop4_x86_64: Can't get a valid pmd_pte.
           readmem: Can't convert a virtual address(ffffe2bdc2000000) to physical address.
           readmem: type_addr: 0, addr:ffffe2bdc2000000, size:32768
           __exclude_unnecessary_pages: Can't read the buffer of struct page.
           create_2nd_bitmap: Can't exclude unnecessary pages.

         Increase SECTION_MAP_LAST_BIT to 5 to fix this. The bit had not
         been used until the change, so we can just increase the value.

         Signed-off-by: Kazuhito Hagio <k-hagio...@nec.com>

  [Test Plan]

   * Confirm that makedumpfile works as expected by triggering a kdump.

   * Confirm that the patched makedumpfile works as expected on a system known to experience the issue.

   * Confirm that the patched makedumpfile is able to work with a cp-generated known affected vmcore to compress it. The unpatched version fails.

  [Where problems could occur]

   * This change could adversely affect the collection/compression of vmcores during a kdump situation resulting in fallback to cp.