Launchpad has imported 9 comments from the remote bug at
https://bugzilla.redhat.com/show_bug.cgi?id=1701234.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2019-04-18T12:56:02+00:00 rmetrich wrote:

Description of problem:

The blk-availability.service unit is activated automatically when multipathd is
enabled, even if multipathd ends up not being used.
This causes the blk-availability service to unmount file systems too early,
breaking unit ordering and leading to shutdown issues for custom services
that require certain mount points.


Version-Release number of selected component (if applicable):

device-mapper-1.02.149-10.el7_6.3.x86_64


How reproducible:

Always


Steps to Reproduce:

1. Enable multipathd even though there is no multipath device

  # yum -y install device-mapper-multipath
  # systemctl enable multipathd --now

2. Create a custom mount point "/data"

  # lvcreate -n data -L 1G rhel
  # mkfs.xfs /dev/rhel/data
  # mkdir /data
  # echo "/dev/mapper/rhel-data /data xfs defaults 0 0" >> /etc/fstab
  # mount /data

3. Create a custom service requiring mount point "/data"

  # cat > /etc/systemd/system/my.service << EOF
[Unit]
RequiresMountsFor=/data

[Service]
ExecStart=/bin/bash -c 'echo "STARTING"; mountpoint /data; true'
ExecStop=/bin/bash -c 'echo "STOPPING IN 5 SECONDS"; sleep 5; mountpoint /data; true'
Type=oneshot
RemainAfterExit=true

[Install]
WantedBy=default.target
EOF
  # systemctl daemon-reload
  # systemctl enable my.service --now

4. Set up persistent journal and reboot

  # mkdir -p /var/log/journal
  # systemctl restart systemd-journald
  # reboot

5. Check the previous boot's shutdown

  # journalctl -b -1 -o short-precise -u my.service -u data.mount -u blk-availability.service

Actual results:

-- Logs begin at Thu 2019-04-18 12:48:12 CEST, end at Thu 2019-04-18 13:35:50 CEST. --
Apr 18 13:31:46.933571 vm-blkavail7 systemd[1]: Started Availability of block devices.
Apr 18 13:31:48.452326 vm-blkavail7 systemd[1]: Mounting /data...
Apr 18 13:31:48.509633 vm-blkavail7 systemd[1]: Mounted /data.
Apr 18 13:31:48.856228 vm-blkavail7 systemd[1]: Starting my.service...
Apr 18 13:31:48.894419 vm-blkavail7 bash[2856]: STARTING
Apr 18 13:31:48.930270 vm-blkavail7 bash[2856]: /data is a mountpoint
Apr 18 13:31:48.979457 vm-blkavail7 systemd[1]: Started my.service.
Apr 18 13:35:02.544999 vm-blkavail7 systemd[1]: Stopping my.service...
Apr 18 13:35:02.547811 vm-blkavail7 systemd[1]: Stopping Availability of block devices...
Apr 18 13:35:02.639325 vm-blkavail7 bash[3393]: STOPPING IN 5 SECONDS
Apr 18 13:35:02.760043 vm-blkavail7 blkdeactivate[3395]: Deactivating block devices:
Apr 18 13:35:02.827170 vm-blkavail7 blkdeactivate[3395]: [SKIP]: unmount of rhel-swap (dm-1) mounted on [SWAP]
Apr 18 13:35:02.903924 vm-blkavail7 systemd[1]: Unmounted /data.
Apr 18 13:35:02.988073 vm-blkavail7 blkdeactivate[3395]: [UMOUNT]: unmounting rhel-data (dm-2) mounted on /data... done
Apr 18 13:35:02.988253 vm-blkavail7 blkdeactivate[3395]: [SKIP]: unmount of rhel-root (dm-0) mounted on /
Apr 18 13:35:03.083448 vm-blkavail7 systemd[1]: Stopped Availability of block devices.
Apr 18 13:35:07.693154 vm-blkavail7 bash[3393]: /data is not a mountpoint
Apr 18 13:35:07.696330 vm-blkavail7 systemd[1]: Stopped my.service.

--> We can see the following:
- blkdeactivate runs, unmounting /data, even though my.service is still running (hence the unexpected message "/data is not a mountpoint")


Expected results:

- my.service gets stopped
- then "data.mount" gets stopped
- finally blkdeactivate runs


Additional info:

I understand there is some chicken-and-egg problem here, but it's just
not possible to blindly unmount file systems and ignore expected unit
ordering.
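
For anyone reproducing this, the ordering relations in play can be inspected directly; a minimal check, assuming the unit names from the reproducer above:

  # systemctl list-dependencies --before blk-availability.service
  # systemctl list-dependencies --after data.mount
  # systemctl show -p Before -p After blk-availability.service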

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/0

------------------------------------------------------------------------
On 2019-04-23T13:09:14+00:00 prajnoha wrote:

Normally, I'd add Before=local-fs-pre.target into blk-
availability.service so on shutdown its ExecStop would execute after all
local mount points are unmounted.

The problem might be with all the dependencies like iscsi, fcoe and
rbdmap services where we need to make sure that these are executed
*after* blk-availability. So I need to find a proper target that we can
hook on so that it also fits all the dependencies. It's possible we need
to create a completely new target so we can properly synchronize all the
services on shutdown. I'll see what I can do...
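
A minimal sketch of that idea as a drop-in (the drop-in file name here is arbitrary, and this only covers the ordering against local mount points, not the iscsi/fcoe/rbdmap side mentioned above):

  # mkdir -p /etc/systemd/system/blk-availability.service.d
  # cat > /etc/systemd/system/blk-availability.service.d/ordering.conf << EOF
[Unit]
# Ordered before local-fs-pre.target, therefore stopped after it on shutdown,
# i.e. blkdeactivate would only run once local mount points are unmounted.
Before=local-fs-pre.target
EOF
  # systemctl daemon-reload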

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/1

------------------------------------------------------------------------
On 2019-04-23T13:17:39+00:00 rmetrich wrote:

Indeed, I wasn't able to find a proper target; none exists.
I believe blk-availability itself needs to be modified to only deactivate 
non-local disks (hopefully there is a way to distinguish).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/2

------------------------------------------------------------------------
On 2019-06-19T13:34:15+00:00 rmetrich wrote:

Hi Peter,

Could you explain why blk-availability is needed when using multipath or iscsi?
With systemd ordering dependencies in units, is that really needed?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/15

------------------------------------------------------------------------
On 2019-06-21T08:43:50+00:00 prajnoha wrote:

(In reply to Renaud Métrich from comment #4)
> Hi Peter,
> 
> Could you explain why blk-availability is needed when using multipath or
> iscsi?
> With systemd ordering dependencies in units, is that really needed?

It is still needed because otherwise there wouldn't be anything else to
properly deactivate the stack. Even though the blk-availability.service
with its blkdeactivate call is still not perfect, it's better than
nothing and better than letting systemd shoot down the devices on its
own within its "last-resort" device deactivation loop that happens in
the shutdown initramfs (at that point, the iscsi/fcoe and all the other
devices are already disconnected anyway, so anything else on top can't
be properly deactivated).
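
(For reference, the deactivation that the service's ExecStop performs can also be run by hand; roughly something like the following, per blkdeactivate(8) - the exact flags the shipped unit uses may differ:)

  # blkdeactivate -u -l wholevg -m disablequeueing
    (-u: unmount devices first, -l wholevg: deactivate whole volume groups,
     -m disablequeueing: disable queueing on multipath devices before deactivation)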

We've just received a related report on GitHub too
(https://github.com/lvmteam/lvm2/issues/18).

I'm revisiting this problem now. The correct solution requires more
patching - this part is very fragile at the moment (it's easy to break
other functionality).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/17

------------------------------------------------------------------------
On 2019-06-21T08:47:41+00:00 prajnoha wrote:

(In reply to Renaud Métrich from comment #3)
> I believe blk-availability itself needs to be modified to only deactivate
> non-local disks (hopefully there is a way to distinguish).

It's possible that we need to split blk-availability (and
blkdeactivate) in two because of this... There is a way to distinguish,
I hope (definitely for iscsi/fcoe), but there currently isn't a central
authority to decide on this, so it must be done manually (by checking
certain properties in sysfs).
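
For illustration only (this is not something blkdeactivate does today), the transport of the underlying devices can be checked manually, e.g. via lsblk's TRAN column or the iscsi session entries in sysfs:

  # lsblk -o NAME,TYPE,TRAN,MOUNTPOINT
  # ls /sys/class/iscsi_session/ 2>/dev/null

Devices whose bottom layer reports a network transport (iscsi, fc) would then fall into the "remote" half of such a split, everything else into the "local" half.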

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/18

------------------------------------------------------------------------
On 2019-06-21T08:51:28+00:00 rmetrich wrote:

I must be missing something. This service is used to deactivate "remote" block 
devices requiring the network, such as iscsi or fcoe.
Why aren't these services deactivating the block devices by themselves?
That way systemd won't kill everything abruptly.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/20

------------------------------------------------------------------------
On 2019-06-21T09:08:22+00:00 prajnoha wrote:

(In reply to Renaud Métrich from comment #7)
> I must be missing something. This service is used to deactivate "remote"
> block devices requiring the network, such as iscsi or fcoe.

Nope, ALL storage, remote as well as local, if possible. We need to look
at the complete stack (e.g. device-mapper devices, which are layered on
top of other layers, are set up locally).

> Why aren't these services deactivating the block devices by
> themselves?

Well, honestly, because nobody has ever solved that :)

At the beginning it probably wasn't that necessary, and if you just shut
your system down and left the devices as they were (unattached, not
deactivated), it wasn't such a problem. But now, with various caching
layers, thin pools... it's getting quite important to deactivate the
stack properly so that any metadata or data are also properly flushed.

Of course, we still need to account for the situation where there's a
power outage and the machine is not backed by any other power source, so
your machine gets shut down immediately (for that there are various
checking and fixing mechanisms). But it's certainly better to avoid this
situation, as you could still lose some data.

Systemd's loop in the shutdown initramfs is really the last-resort thing
to execute, but we can't rely on that (it's just a loop over the device
list with a limited iteration count; it doesn't look at the real nature
of each layer in the stack).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/21

------------------------------------------------------------------------
On 2019-06-21T09:39:30+00:00 rmetrich wrote:

OK, then we need a "blk-availability-local" service and a
"blk-availability-remote" service, and maybe associated targets, similar
to "local-fs.target" and "remote-fs.target".
Probably this should be handled by the systemd package itself, typically
by analyzing the device properties when a device shows up in udev.
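
Purely as a sketch of that idea (unit names and ordering are invented here, and the hard part, teaching blkdeactivate to act on only one class of devices, is not shown), the "remote" half might look roughly like this:

  # cat > /etc/systemd/system/blk-availability-remote.service << EOF
[Unit]
Description=Availability of network-backed block devices (sketch)
DefaultDependencies=no
# Stopped after remote mount points are unmounted (mirroring the
# local-fs-pre.target idea above), but before iscsi/fcoe shut down.
Before=remote-fs-pre.target
After=iscsi.service fcoe.service
Conflicts=shutdown.target

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/bin/true
# Hypothetical: would need a blkdeactivate mode restricted to network-backed devices.
ExecStop=/usr/sbin/blkdeactivate -u
EOF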

Reply at:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1832859/comments/22


** Changed in: lvm2 (Fedora)
       Status: Unknown => Confirmed

** Changed in: lvm2 (Fedora)
   Importance: Unknown => High

https://bugs.launchpad.net/bugs/1832859

Title:
  during shutdown libvirt-guests gets stopped after file system unmount

Status in lvm2:
  New
Status in libvirt package in Ubuntu:
  Incomplete
Status in lvm2 package in Ubuntu:
  New
Status in lvm2 package in Fedora:
  Confirmed

Bug description:
  When using automatic suspend at reboot/shutdown, it makes sense to
  store the suspend data on a separate partition to ensure there is
  always enough available space. However, this does not work, as the
  partition gets unmounted before or during libvirt suspend.

  Steps to reproduce:

  1. Use Ubuntu 18.04.02 LTS
  2. Install libvirt + qemu-kvm
  3. Start a guest
  4. Set libvirt-guests to suspend at shutdown/reboot by editing /etc/default/libvirt-guests
  5. Create an fstab entry to mount a separate partition at mount point /var/lib/libvirt/qemu/save, then run sudo mount /var/lib/libvirt/qemu/save to mount the partition.
  6. Reboot

  Expected result:
  The guest suspend data would be written to /var/lib/libvirt/qemu/save,
  resulting in the data being stored on the partition specified in fstab.
  At boot, this partition would be mounted as specified in fstab and
  libvirt-guests would be able to read the data and restore the guests.

  Actual result:
  The partition gets unmounted before libvirt-guests suspends the guests,
  resulting in the data being stored on the partition containing the root
  file system. During boot, the empty partition gets mounted over the
  non-empty /var/lib/libvirt/qemu/save directory, resulting in
  libvirt-guests being unable to read the saved data.

  As a side effect, the saved data is using up space on the root
  partition even if the directory appears empty.
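
  One way to confirm that, and to find the space to reclaim, is to
  bind-mount the root file system elsewhere and look underneath the mount
  point; paths as used in this report:

    # mount --bind / /mnt
    # du -sh /mnt/var/lib/libvirt/qemu/save
    # umount /mnt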

  Here are some of the relevant lines from the journal:

  Jun 14 00:00:04 libvirt-host blkdeactivate[4343]: Deactivating block devices:
  Jun 14 00:00:04 libvirt-host systemd[1]: Unmounted /var/lib/libvirt/qemu/save.
  Jun 14 00:00:04 libvirt-host blkdeactivate[4343]:   [UMOUNT]: unmounting libvirt_lvm-suspenddata (dm-3) mounted on /var/lib/libvirt/qemu/save... done

  Jun 14 00:00:04 libvirt-host libvirt-guests.sh[4349]: Running guests on default URI: vps1, vps2, vps3
  Jun 14 00:00:04 libvirt-host blkdeactivate[4343]:   [MD]: deactivating raid1 device md1... done
  Jun 14 00:00:05 libvirt-host libvirt-guests.sh[4349]: Suspending guests on default URI...
  Jun 14 00:00:05 libvirt-host libvirt-guests.sh[4349]: Suspending vps1: ...
  Jun 14 00:00:05 libvirt-host blkdeactivate[4343]:   [LVM]: deactivating Volume Group libvirt_lvm... skipping

  Jun 14 00:00:10 libvirt-host libvirt-guests.sh[4349]: Suspending vps1: 5.989 GiB
  Jun 14 00:00:15 libvirt-host libvirt-guests.sh[4349]: Suspending vps1: ...
  Jun 14 00:00:20 libvirt-host libvirt-guests.sh[4349]: Suspending vps1: ...

