[Touch-packages] [Bug 1817097] Comment bridged from LTC Bugzilla

bugproxy Fri, 22 Feb 2019 02:41:14 -0800

------- Comment From christian.r...@de.ibm.com 2019-02-22 05:23 EDT-------
Further investigations revealed that no dm-crypt mapper device is needed at all 
to reproduce the behaviour but just two block devices with different physical 
block sizes, e.g.


# blockdev --getpbsz /dev/mapper/mpatha-part1
512
# blockdev --getpbsz /dev/dasdc1
4096

When running the 'vgcreate' command, make sure to add the SCSI device (the 
device with the smaller physical block size) to the volume group first.
# blockdev --getpbsz /dev/mapper/TEST_VG-LV1
512

Use one SCSI disk partition (multipath devices are recommended but not 
required) and one DASD partition to recreate the pvmove problem. Run pvs after 
the move has completed, then unmount the fs and mount it again.
Unexpectedly, fsck.ext4 does not detect any problems on the fs.

Relevant syslog entries:
Feb 22 11:09:23 system kernel: print_req_error: I/O error, dev dasdc, sector 280770
Feb 22 11:09:23 system kernel: Buffer I/O error on dev dm-3, logical block 139265, lost sync page write
Feb 22 11:09:23 system kernel: JBD2: Error -5 detected when updating journal superblock for dm-3-8.
Feb 22 11:09:23 system kernel: Aborting journal on device dm-3-8.
Feb 22 11:09:23 system kernel: print_req_error: I/O error, dev dasdc, sector 280770
Feb 22 11:09:23 system kernel: Buffer I/O error on dev dm-3, logical block 139265, lost sync page write
Feb 22 11:09:23 system kernel: JBD2: Error -5 detected when updating journal superblock for dm-3-8.
Feb 22 11:09:23 system kernel: print_req_error: I/O error, dev dasdc, sector 2242
Feb 22 11:09:23 system kernel: Buffer I/O error on dev dm-3, logical block 1, lost sync page write
Feb 22 11:09:23 system kernel: EXT4-fs (dm-3): I/O error while writing superblock
Feb 22 11:09:23 system kernel: EXT4-fs error (device dm-3): ext4_put_super:938: Couldn't clean up the journal
Feb 22 11:09:23 system kernel: EXT4-fs (dm-3): Remounting filesystem read-only
Feb 22 11:09:23 system kernel: print_req_error: I/O error, dev dasdc, sector 2242
Feb 22 11:09:23 system kernel: Buffer I/O error on dev dm-3, logical block 1, lost sync page write
Feb 22 11:09:23 system kernel: EXT4-fs (dm-3): I/O error while writing superblock
Feb 22 11:09:32 system kernel: EXT4-fs (dm-3): bad block size 1024

The last syslog line repeats on every 'mount /dev/mapper/TEST_VG-LV1 /mnt' 
attempt. The 1024 block size matches the output of
# blockdev --getbsz /dev/mapper/TEST_VG-LV1
1024

After the pvmove, the physical block size has also changed to
# blockdev --getpbsz /dev/mapper/TEST_VG-LV1
4096
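
The 'bad block size 1024' message is consistent with a file system whose block
size is smaller than the logical sector size of the device it now sits on. The
following is a minimal sketch of that comparison. It rests on the assumption
that the mount is rejected whenever the file system block size is below the
device's logical sector size. The function names fs_block_size and
mount_would_fail are hypothetical helpers for illustration, not part of
e2fsprogs or lvm2.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only.

# Extract the "Block size:" value from 'tune2fs -l' style output on stdin.
fs_block_size() {
    awk -F: '/^Block size/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# mount_would_fail FS_BLOCK_SIZE DEVICE_SECTOR_SIZE
# Returns 0 (true) when the fs block size is smaller than the device's
# logical sector size, which is the ASSUMED condition for the mount error.
mount_would_fail() {
    [ "$1" -lt "$2" ]
}

# Usage on a real system (needs root, device name taken from this report):
#   fs_bs=$(tune2fs -l /dev/mapper/TEST_VG-LV1 | fs_block_size)
#   dev_ss=$(blockdev --getss /dev/mapper/TEST_VG-LV1)
#   mount_would_fail "$fs_bs" "$dev_ss" && echo "mount is expected to fail"
```

With the values shown above (file system block size 1024, device block size
4096 after the move), mount_would_fail returns true.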

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to lvm2 in Ubuntu.
https://bugs.launchpad.net/bugs/1817097

Title:
  pvmove causes file system corruption without notice upon move from 512
  -> 4096 logical block size devices

Status in Ubuntu on IBM z Systems:
  New
Status in lvm2 package in Ubuntu:
  New

Bug description:
  Problem Description---
  Summary
  =======
  Environment: IBM Z13 LPAR and z/VM Guest
               IBM Type: 2964 Model: 701 NC9
  OS:          Ubuntu 18.10 (GNU/Linux 4.18.0-13-generic s390x)
               Package: lvm2 version 2.02.176-4.1ubuntu3
  LVM: a pvmove operation corrupts the file system when the target uses a
       4096-byte (4k) logical block size and the underlying devices default
       to a 512-byte block size.
  The problem is immediately reproducible.

  We see a real usability issue with data destruction as a consequence, which
  is not acceptable.
  We expect 'pvmove' to fail with an error in such situations to prevent fs
  destruction; a force flag could allow overriding that error.
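
The expected behaviour can be sketched as a wrapper-side pre-check. This is
only an illustration of the request, not how lvm2 works or will work: the
function name pvmove_guard and its --force argument are hypothetical, and
comparing logical sector sizes (blockdev --getss) is an assumption about which
value matters.

```shell
#!/bin/sh
# Hypothetical pre-check; NOT part of lvm2.
# pvmove_guard SRC_BLOCK_SIZE DST_BLOCK_SIZE [--force]
# Returns non-zero when the destination block size is larger than the
# source block size, unless --force is given as third argument.
pvmove_guard() {
    src_bs=$1
    dst_bs=$2
    force=${3:-}
    if [ "$dst_bs" -gt "$src_bs" ] && [ "$force" != "--force" ]; then
        echo "refusing move: destination block size $dst_bs > source block size $src_bs" >&2
        return 1
    fi
    return 0
}

# Usage on a real system (needs root, device names taken from this report):
#   pvmove_guard "$(blockdev --getss /dev/loop0)" \
#                "$(blockdev --getss /dev/mapper/enc-loop)" \
#       && pvmove /dev/loop0 /dev/mapper/enc-loop
```

For the 512 -> 4096 case from this report the guard refuses the move unless
--force is passed.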

  
  Details
  =======
  After a 'pvmove' operation is run to move a physical volume onto an encrypted
  device with 4096 bytes logical block size, we experience file system
  corruption.
  The file system does not need to be mounted for this to happen, but the
  problem surfaces differently if it is.

  Either, the 'pvs' command after the pvmove shows
    /dev/LOOP_VG/LV: read failed after 0 of 1024 at 0: Invalid argument
    /dev/LOOP_VG/LV: read failed after 0 of 1024 at 314507264: Invalid argument
    /dev/LOOP_VG/LV: read failed after 0 of 1024 at 314564608: Invalid argument
    /dev/LOOP_VG/LV: read failed after 0 of 1024 at 4096: Invalid argument

  or

  a subsequent mount shows (after umount, if the fs had previously been mounted
  as in our setup)
  mount: /mnt: wrong fs type, bad option, bad superblock on /dev/mapper/LOOP_VG-LV, missing codepage or helper program, or other error.

  A minimal LVM setup with one volume group and one logical volume, based on
  one physical volume, is sufficient to trigger the problem. One more physical
  volume of the same size is needed as the target of the pvmove operation.

        LV
         |
      VG: LOOP_VG [ ]
         |
      PV: /dev/loop0   -->   /dev/mapper/enc-loop
                          ( backed by /dev/loop1 )

  For this problem report the physical volumes are backed by loopback devices
  (losetup), but we have also seen the error on real SCSI multipath volumes,
  with and without cryptsetup mapper devices in use.

  
  Further discussion
  ==================
  https://www.saout.de/pipermail/dm-crypt/2019-February/006078.html
  The problem does not occur on block devices with a native block size of 4k,
  e.g. DASDs, or on file systems created with the mkfs -b 4096 option.
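
The mkfs -b workaround can be sketched as follows, assuming the logical sector
sizes of all current and future physical volumes are known when the file
system is created. The helper name max_block_size is hypothetical; the floor
of 1024 reflects the smallest block size mkfs.ext4 accepts.

```shell
#!/bin/sh
# Hypothetical helper: print the largest of the given sizes,
# but never less than 1024.
max_block_size() {
    max=1024
    for size in "$@"; do
        if [ "$size" -gt "$max" ]; then
            max=$size
        fi
    done
    echo "$max"
}

# Usage on a real system (needs root, device names taken from this report):
#   bs=$(max_block_size "$(blockdev --getss /dev/loop0)" \
#                       "$(blockdev --getss /dev/mapper/enc-loop)")
#   mkfs.ext4 -b "$bs" /dev/mapper/LOOP_VG-LV
```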

  
  Terminal output
  ===============
  See attached file pvmove-error.txt

  
  Debug data
  ==========
  pvmove was run with -dddddd (maximum debug level)
  See attached journal file.
   
  Contact Information = christian.r...@de.ibm.com 
   
  ---uname output---
  Linux system 4.18.0-13-generic #14-Ubuntu SMP Wed Dec 5 09:00:35 UTC 2018 s390x s390x s390x GNU/Linux
   
  Machine Type = IBM Type: 2964 Model: 701 NC9 
   
  ---Debugger---
  A debugger is not configured
   
  ---Steps to Reproduce---
  1.) Create two image files of 500MB in size
      and set up two loopback devices with 'losetup -fP FILE'
  2.) Create one physical volume and one volume group 'LOOP_VG',
      and one logical volume 'LV'
      Run:
      pvcreate /dev/loop0
      vgcreate LOOP_VG /dev/loop0
      lvcreate -L 300MB LOOP_VG -n LV /dev/loop0
  3.) Create a file system on the logical volume device:
      mkfs.ext4 /dev/mapper/LOOP_VG-LV
  4.) Mount the file system created in the previous step on some empty
      available directory:
      mount /dev/mapper/LOOP_VG-LV /mnt
  5.) Set up a second physical volume, this time encrypted with LUKS2,
      and open the volume to make it available:
      cryptsetup luksFormat --type luks2 --sector-size 4096 /dev/loop1
      cryptsetup luksOpen /dev/loop1 enc-loop
  6.) Create the second physical volume, and add it to the LOOP_VG
      pvcreate /dev/mapper/enc-loop
      vgextend LOOP_VG /dev/mapper/enc-loop
  7.) Ensure the new physical volume is part of the volume group:
      pvs
  8.) Move the /dev/loop0 volume onto the encrypted volume with the maximum
      debug option:
      pvmove -dddddd /dev/loop0 /dev/mapper/enc-loop
  9.) The previous step succeeds, but corrupts the file system on the logical
      volume.
      We expect an error here.
      There might be a command line flag to override the error, because the
      corruption does not cause data loss.
      
   
  Userspace tool common name: pvmove 
   
  The userspace tool has the following bit modes: 64bit 

  Userspace rpm: lvm2 in version 2.02.176-4.1ubuntu3

  Userspace tool obtained from project website:  na 
   
  *Additional Instructions for christian.r...@de.ibm.com:
  -Attach ltrace and strace of userspace application.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-z-systems/+bug/1817097/+subscriptions

-- 
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
