> Ryan,
> We believe this is a bug as we expect curtin to wipe the disks. In this
> case it's failing to wipe the disks and occasionally that causes issues
> with our automation deploying ceph on those disks.
I'm still not clear on what actual error you believe is happening. Note
that an lvremove failure is not a fatal error from curtin's perspective,
because we will be destroying data on the underlying physical disk or
partition anyway. Looking at your debug info:

1) Your curtin-install.log does not show any failures of the lvremove
   command.

2) If the curtin-install-cfg.yaml is correct, then you've marked
   wipe: superblock on all of the devices on top of which you build
   logical volumes. With this setting curtin wipes the logical volume
   *and* the underlying device. Even if the writes to the LV fail, or if
   lvremove fails, as long as the wipes of the underlying disk/partition
   succeed then the LVM metadata and partition table on the disk will be
   cleared, rendering the content unusable.

Look at sda1, which holds the lvroot LV:

  shutdown running on holder type: 'lvm' syspath: '/sys/class/block/dm-24'
  Running command ['dmsetup', 'splitname', 'vgroot-lvroot', '-c', '--noheadings', '--separator', '=', '-o', 'vg_name,lv_name'] with allowed return codes [0] (capture=True)

  # here we start wiping the logical device by writing 1M of zeros at the
  # start of the device and at the end of the device
  Wiping lvm logical volume: /dev/vgroot/lvroot
  wiping 1M on /dev/vgroot/lvroot at offsets [0, -1048576]

  # now we remove the lv device and then the vg if it's empty
  using "lvremove" on vgroot/lvroot
  Running command ['lvremove', '--force', '--force', 'vgroot/lvroot'] with allowed return codes [0] (capture=False)
    Logical volume "lvroot" successfully removed
  Running command ['lvdisplay', '-C', '--separator', '=', '--noheadings', '-o', 'vg_name,lv_name'] with allowed return codes [0] (capture=True)
  Running command ['pvdisplay', '-C', '--separator', '=', '--noheadings', '-o', 'vg_name,pv_name'] with allowed return codes [0] (capture=True)
  Running command ['vgremove', '--force', '--force', 'vgroot'] with allowed return codes [0, 5] (capture=False)
    Volume group "vgroot" successfully removed

  # the vg was created from /dev/sda1, so here curtin wipes that partition
  # with 1M of zeros at its start and end
  Wiping lvm physical volume: /dev/sda1
  wiping 1M on /dev/sda1 at offsets [0, -1048576]

In the scenario where you see the lvremove command fail, what is the
outcome on the system? Does curtin fail the install? Does the install
succeed but something after booting into the new system fails? If the
latter, which commands fail and can you show the output?
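For reference, here is a minimal Python sketch of what the
"wiping 1M ... at offsets [0, -1048576]" step above amounts to. It is
not curtin's actual code; the function name wipe_ends and the example
device path are only placeholders:

  # Illustrative only -- zero the first and last 1MiB of a block device.
  import os

  def wipe_ends(device_path, chunk=1 << 20):
      """Write `chunk` bytes of zeros at the start and end of the device."""
      with open(device_path, 'rb+') as fp:
          fp.seek(0, os.SEEK_END)
          size = fp.tell()
          zeros = b'\x00' * chunk
          for offset in (0, max(size - chunk, 0)):
              fp.seek(offset)
              fp.write(zeros)
          fp.flush()
          os.fsync(fp.fileno())

  # e.g. wipe_ends('/dev/vgroot/lvroot'), then lvremove/vgremove, then
  # wipe the underlying PV the same way so its LVM metadata is gone.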
https://bugs.launchpad.net/bugs/1871874

Title:
  lvremove occasionally fails on nodes with multiple volumes and curtin
  does not catch the failure

Status in curtin package in Ubuntu:
  Incomplete
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  For example:

    Wiping lvm logical volume: /dev/ceph-db-wal-dev-sdc/ceph-db-dev-sdi
    wiping 1M on /dev/ceph-db-wal-dev-sdc/ceph-db-dev-sdi at offsets [0, -1048576]
    using "lvremove" on ceph-db-wal-dev-sdc/ceph-db-dev-sdi
    Running command ['lvremove', '--force', '--force', 'ceph-db-wal-dev-sdc/ceph-db-dev-sdi'] with allowed return codes [0] (capture=False)
    device-mapper: remove ioctl on (253:14) failed: Device or resource busy
      Logical volume "ceph-db-dev-sdi" successfully removed

  On a node with 10 disks configured as follows:

    /dev/sda2  /
    /dev/sda1  /boot
    /dev/sda3  /var/log
    /dev/sda5  /var/crash
    /dev/sda6  /var/lib/openstack-helm
    /dev/sda7  /var
    /dev/sdj1  /srv

    sdb and sdc are used for BlueStore WAL and DB
    sdd, sde, sdf: ceph OSDs, using sdb
    sdg, sdh, sdi: ceph OSDs, using sdc

  Across multiple servers this happens occasionally with various disks.
  It looks like this may be a race condition, possibly in lvm, as curtin
  is wiping multiple volumes before lvm fails.
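To investigate the suspected race, one hedged diagnostic idea (a
hypothetical helper, not anything curtin ships) is to query the
device-mapper open count for the LV right before lvremove runs; a
non-zero count at that moment would mean something still holds the node
open, which is what the "remove ioctl ... Device or resource busy"
message points at. The dm name below is only an example of how LVM
mangles vg/lv names into device-mapper names:

  # Hypothetical diagnostic sketch -- not part of curtin.
  import subprocess

  def dm_open_count(dm_name):
      """Return the open count dmsetup reports for a device-mapper node."""
      out = subprocess.check_output(
          ['dmsetup', 'info', '-c', '--noheadings', '-o', 'open', dm_name],
          text=True)
      return int(out.strip())

  # e.g. dm_open_count('ceph--db--wal--dev--sdc-ceph--db--dev--sdi')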