Thanks a lot Cascardo and Dann - very good points.

Cascardo: I agree with you. I misunderstood and didn't consider the
minor kernel releases. I think Dann's testing idea makes it much
simpler, though. Maybe the kernel team could create the vmcore images,
as part of the release process, for a few kernels (say, generic plus 3
randomly chosen cloud kernels) and "dump" them onto this server. The
makedumpfile test infrastructure would then consume the vmcores and run
the test, checking for bugs/regressions. The test could be quite simple:
just check the return value of makedumpfile and whether the file created
is in fact a compressed dump (the "file" tool could be used for that).
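
A rough sketch of what such a test could look like, in Python. This is
only an illustration under a few assumptions: the function names are
made up, the "-c -d 31" arguments are just the usual kdump-tools style
defaults, and instead of calling the "file" tool it reads the first
bytes of the output directly, assuming the compressed format starts
with the 8-byte signature "KDUMP   ".

```python
import subprocess
from pathlib import Path

# Assumption: makedumpfile's compressed (kdump-compressed) output starts
# with the 8-byte signature "KDUMP   " (KDUMP followed by three spaces).
KDUMP_SIGNATURE = b"KDUMP   "


def is_kdump_compressed(path):
    """Return True if the file starts with the kdump-compressed signature."""
    with open(path, "rb") as f:
        return f.read(len(KDUMP_SIGNATURE)) == KDUMP_SIGNATURE


def check_vmcore(vmcore, output, cmd=("makedumpfile", "-c", "-d", "31")):
    """Run the dump command on one vmcore and report (passed, message).

    Hypothetical helper: passes only if the command exits with 0 and the
    output file carries the compressed-dump signature.
    """
    result = subprocess.run(
        [*cmd, str(vmcore), str(output)],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return False, f"exit code {result.returncode}: {result.stderr.strip()}"
    if not Path(output).is_file() or not is_kdump_compressed(output):
        return False, "output is not a kdump-compressed dump"
    return True, "ok"
```

The test infrastructure would call check_vmcore() once per vmcore
dropped on the server and flag any (False, message) result as a
possible regression.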

I agree with you as well, Dann: makedumpfile is much more important than
the crash tool and should have priority. Also, testing makedumpfile is
easier than checking the crash tool, I guess heheh

Cheers!

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to makedumpfile in Ubuntu.
https://bugs.launchpad.net/bugs/1970672

Title:
  makedumpfile falls back to cp with "__vtop4_x86_64: Can't get a valid
  pmd_pte."

Status in makedumpfile package in Ubuntu:
  New

Bug description:
  [Impact] 
   * On Focal with an HWE (>=5.12) kernel, makedumpfile can sometimes fail
  with "__vtop4_x86_64: Can't get a valid pmd_pte."

   * makedumpfile falls back to cp for the dump, resulting in extremely
  large vmcores. This can impact both collection and analysis due to
  lack of space for the resulting vmcore.

   * This is fixed by an upstream commit that is present in versions
  1.7.0 and 1.7.1:

  https://github.com/makedumpfile/makedumpfile/commit/646456862df8926ba10dd7330abf3bf0f887e1b6

  commit 646456862df8926ba10dd7330abf3bf0f887e1b6
  Author: Kazuhito Hagio <k-hagio...@nec.com>
  Date:   Wed May 26 14:31:26 2021 +0900

      [PATCH] Increase SECTION_MAP_LAST_BIT to 5
      
      * Required for kernel 5.12
      
      Kernel commit 1f90a3477df3 ("mm: teach pfn_to_online_page() about
      ZONE_DEVICE section collisions") added a section flag
      (SECTION_TAINT_ZONE_DEVICE) and causes makedumpfile an error on
      some machines like this:
      
        __vtop4_x86_64: Can't get a valid pmd_pte.
        readmem: Can't convert a virtual address(ffffe2bdc2000000) to physical address.
        readmem: type_addr: 0, addr:ffffe2bdc2000000, size:32768
        __exclude_unnecessary_pages: Can't read the buffer of struct page.
        create_2nd_bitmap: Can't exclude unnecessary pages.
      
      Increase SECTION_MAP_LAST_BIT to 5 to fix this.  The bit had not
      been used until the change, so we can just increase the value.
      
      Signed-off-by: Kazuhito Hagio <k-hagio...@nec.com>
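
  As a rough illustration of why the extra flag bit matters, here is a
  small Python sketch of the masking idea. It is not makedumpfile's
  code: the bit positions (flag at bit 4, old limit of 4) and the
  mem_map value are assumptions chosen for the example, and the helper
  names are made up.

```python
# Illustrative values only; the real definitions live in the kernel and
# in makedumpfile's headers and are assumed here, not quoted.
SECTION_TAINT_ZONE_DEVICE = 1 << 4  # assumed position of the new flag


def section_map_mask(last_bit):
    """Mask that keeps address bits and drops flag bits below last_bit."""
    return ~((1 << last_bit) - 1)


def decode_mem_map(encoded, last_bit):
    """Strip the low flag bits from an encoded section mem_map value."""
    return encoded & section_map_mask(last_bit)


# Hypothetical mem_map address with the new flag set in its low bits.
mem_map = 0xFFFFEA0000000000
encoded = mem_map | SECTION_TAINT_ZONE_DEVICE

# With the limit at 4, the flag bit leaks into the decoded address.
stale = decode_mem_map(encoded, last_bit=4)
# With the limit raised to 5, the flag is masked off again.
fixed = decode_mem_map(encoded, last_bit=5)
```

  Under these assumptions, the stale value is off by 0x10 from the real
  mem_map, which is the kind of bad address that would make the later
  virtual-to-physical translation fail.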

  [Test Plan]
   * Confirm that makedumpfile works as expected by triggering a kdump.

   * Confirm that the patched makedumpfile works as expected on a system
  known to experience the issue.

   * Confirm that the patched makedumpfile is able to work with a cp-
  generated known affected vmcore to compress it. The unpatched version
  fails.

  [Where problems could occur]

   * This change could adversely affect the collection/compression of
  vmcores during a kdump situation resulting in fallback to cp.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1970672/+subscriptions

