On Tue, Apr 23, 2019 at 11:09:03AM -0700, Elliott Mitchell wrote:
> 
> One of the things encountered in a fortune file: "Every program has at
> least two purposes: one for which it was designed, and one for which
> it wasn't."  (inexact quote)
> 
> The use case I've got for MMP is quite similar to the one you were
> originally thinking of.  In my case many of the parameters are
> different, but the overall purpose is the same.

Sure, but that means that anything which tries to open or mount the
block device --- whether it be dumpe2fs, e2fsck, dump (of
dump/restore), debugfs, resize2fs, or the kernel mounting the file
system --- is going to have to pay the MMP sleep penalty.  There
really is no way around it, because the assumption here is that the
only way we can communicate with some other system, or some other VM,
which might have read/write access to the device, is via block reads
and writes to the device itself, using the MMP block.  And this is
fundamentally a polling interface, because of the limitations of the
block device; we don't want to be constantly reading and writing the
block device, both because of the performance penalties and because
of the write endurance problems it causes for SSDs.
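
To make the cost concrete, here's a rough sketch in C of the kind of
check every MMP-aware tool has to make before it can safely touch the
file system.  This is not the actual ext4 code --- the structure is
deliberately cut down (the real MMP block also carries a magic number,
node name, device name, and a checksum), and mmp_looks_idle() is a
made-up name --- but the shape of it is the same:

    #include <stdint.h>
    #include <unistd.h>

    /* Simplified stand-in for ext4's MMP block; the real on-disk
     * structure also has a magic number, node name, bdev name,
     * and a checksum. */
    struct mmp_block {
        uint32_t seq;            /* bumped by the live user */
        uint32_t check_interval; /* seconds between updates */
    };

    /* Return 0 if the device looks idle, -1 on error or if some
     * other node is actively bumping the MMP sequence number.
     * 'fd' is an open fd on the block device, 'offset' the byte
     * offset of the MMP block. */
    int mmp_looks_idle(int fd, off_t offset)
    {
        struct mmp_block a, b;

        if (pread(fd, &a, sizeof(a), offset) != sizeof(a))
            return -1;
        /* This sleep is the unavoidable penalty: we have to wait
         * out at least one update interval to see whether anyone
         * else is updating the block. */
        sleep(a.check_interval);
        if (pread(fd, &b, sizeof(b), offset) != sizeof(b))
            return -1;
        return (a.seq == b.seq) ? 0 : -1;
    }

With typical update intervals that sleep alone is several seconds, and
it is paid on every open, by every one of the tools listed above.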

If this is not satisfactory for you.... then you shouldn't be using
MMP, and need to be doing something else, like solving this problem at
the hypervisor layer.  I'd love to be able to offer something better,
but I'd also like to have faster-than-light travel and anti-gravity
technology.  The laws of physics are very hard to work around, alas....

> > The problem though is while you don't *expect* to do a failover, two
> > systems can simultaneously access the device, and so you need to have
> > most of this logic.  Worse, if your use case includes the hypervisor
> > suspending a VM for an hour or two, before resuming it later, MMP
> > won't even protect you against the fumble finger case.
> 
> This depends on the hypervisor, but I'm getting the impression Xen (the
> one I'm using) gives some strong hints the VM had been paused (otherwise
> I wouldn't see kernel messages after suspend/resume).

It doesn't matter.  If you are depending on MMP to protect you against
simultaneous access to an ext4 file system, a suspended VM will cause
the MMP protections not to work.  While the VM is suspended it stops
updating the sequence number in the MMP block, so after the MMP
timeout another node can quite legitimately conclude the file system
is no longer in use and take it over; when the VM resumes, it can
issue writes to a file system it no longer owns before the MMP daemon
notices that the sequence number has changed.

If the hypervisor can solve the problem, then great!   Don't use MMP....

> I've observed Xen's utilities have /some/ protection against the case I'm
> concerned with.  They will prevent you from attaching a block device
> which is mounted in "Domain-0" (not actually the hypervisor, but a
> privileged VM) to a VM.  They do not (or did not) protect you from
> attempting to mount a block device attached to a VM in "Domain-0".  I
> also don't know whether Xen's utilities will detect or prevent you from
> attaching a block device to multiple VMs.

So if these are real block devices in Domain 0 (as opposed to
something made up by the hypervisor and presented to its guest VM's),
then the O_EXCL/EBUSY mount protections should protect you.  I suspect
this might mean changing the Xen hypervisor so that knowledge of
whether some VM has a block device opened with O_EXCL is passed down
to the Xen layer --- and the Xen layer informing the guest VM if some
other VM has the block device opened with O_EXCL.  That's what I mean
by making it the hypervisor's problem.
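
For concreteness, this is what the O_EXCL/EBUSY protection looks like
from user space.  It's a minimal test program, nothing Xen-specific
--- it just exercises the standard Linux semantics of O_EXCL on a
block device:

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc != 2) {
            fprintf(stderr, "usage: %s <block-device>\n", argv[0]);
            return 2;
        }
        /* O_EXCL on a block device asks this kernel for an
         * exclusive claim; the open fails with EBUSY if the device
         * is mounted or already claimed by another opener. */
        int fd = open(argv[1], O_RDONLY | O_EXCL);
        if (fd < 0) {
            fprintf(stderr, "%s: %s\n", argv[1], strerror(errno));
            return 1;
        }
        printf("%s is not in use by this kernel\n", argv[1]);
        close(fd);
        return 0;
    }

The catch, and the reason the hypervisor has to get involved, is that
this check only sees opens within a single kernel.  A guest running on
a different kernel but sharing the same backing device is invisible to
it, so something has to carry the "opened O_EXCL" information across
the VM boundary --- which is exactly the plumbing described above.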

You'll be able to solve the problem much more efficiently (with less
overhead) and much more robustly if you do it at the hypervisor
control plane level.

Cheers,

                                                - Ted
