On Tue, Apr 23, 2019 at 11:09:03AM -0700, Elliott Mitchell wrote:
> One of the things encountered in a fortune file: "Every program has at
> least two purposes, one for which it was designed for, and for which it
> wasn't designed for."  (inexact quote)
> 
> The use case I've got for MMP is quite similar to the one you were
> originally thinking of.  In my case many parameters are different, but
> the overall purpose is the same.
Sure, but that means that anything which tries to open or mount the
block device --- whether it be dumpe2fs, e2fsck, dump (of
dump/restore), debugfs, resize2fs, or the kernel mounting the file
system --- is going to have to pay the MMP sleep penalty.  There
really is no way around it, because the assumption here is that the
only way we can communicate with some other system, or some other VM,
which might have read/write access to the device, is via block reads
and writes to the device itself, using the MMP block.  And this is
fundamentally a polling interface, because of the limitations of the
block device, and we don't want to be constantly reading and writing
the device, both because of the performance penalty and because of
the write endurance problems it causes for SSDs.

If this is not satisfactory for you.... then you shouldn't be using
MMP, and need to be doing something else, like solving this problem
at the hypervisor layer.  I'd love to be able to offer something
better, but I'd also like to have faster-than-light travel and
anti-gravity technology.  The laws of physics are very hard to work
around, alas....

> > The problem though is while you don't *expect* to do a failover,
> > two systems can simultaneously access the device, and so you need
> > to have most of this logic.  Worse, if your use case includes the
> > hypervisor suspending a VM for an hour or two, before resuming it
> > later, MMP won't even protect you against the fumble finger case.
> 
> This depends on the hypervisor, but I'm getting the impression Xen
> (the one I'm using) gives some strong hints the VM had been paused
> (otherwise I wouldn't see kernel messages after suspend/resume).

It doesn't matter.  If you are depending on MMP to protect you
against simultaneous access to an ext4 file system, a suspended VM
will cause the MMP protections not to work.  If the hypervisor can
solve the problem, then great!  Don't use MMP....

> I've observed Xen's utilities have /some/ protection against the
> case I'm concerned with.  They will prevent you from attaching a
> block device which is mounted in "Domain-0" (not actually the
> hypervisor, but a privileged VM) to a VM.  They do not (or did not)
> protect you from attempting to mount a block device attached to a
> VM in "Domain-0".  I also don't know whether Xen's utilities will
> detect or prevent you from attaching a block device to multiple
> VMs.

So if these are real block devices in Domain 0 (as opposed to
something made up by the hypervisor and presented to its guest VMs),
then the O_EXCL/EBUSY mount protections should protect you (see the
sketch in the P.S. below).  I suspect this might mean changing the
Xen hypervisor so that knowledge of whether some VM has a block
device opened with O_EXCL is passed down to the Xen layer --- and the
Xen layer informing the guest VM if some other VM has the block
device opened with O_EXCL.

That's what I mean by making it be the hypervisor's problem.  You'll
be able to solve the problem much more efficiently (with less
overhead) and much more robustly, if you solve it at the hypervisor
control plane level.

Cheers,

						- Ted
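
P.S.  For anyone curious what the O_EXCL/EBUSY check looks like from
userspace, here is a rough, untested sketch (the device path is just
a made-up example).  On Linux, opening a block device with O_EXCL
fails with EBUSY if the kernel considers the device claimed --- for
example, if a file system on it is mounted:

	#include <errno.h>
	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		/* Hypothetical device path; substitute your own. */
		const char *dev = "/dev/xvdb";

		/*
		 * O_EXCL on a block device requests an exclusive
		 * open.  It fails with EBUSY if the device is in
		 * use by the system (e.g. mounted) --- the same
		 * exclusion the kernel relies on when mounting.
		 */
		int fd = open(dev, O_RDONLY | O_EXCL);
		if (fd < 0) {
			if (errno == EBUSY)
				printf("%s is claimed (e.g. mounted)\n",
				       dev);
			else
				printf("open(%s): %s\n", dev,
				       strerror(errno));
			return 1;
		}
		printf("%s is not claimed\n", dev);
		close(fd);
		return 0;
	}

This is only per-host protection, of course; the point above is that
Xen would need to propagate the equivalent "claimed" state across
VMs for it to help in the multi-VM case.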