On Wed, 2026-02-04 at 20:03 -0500, Benjamin Marzinski wrote:
> On Thu, Feb 05, 2026 at 12:57:38AM +0100, Hannes Reinecke wrote:
> > On 2/4/26 19:32, Stefan Hajnoczi wrote:
> > > On Wed, Feb 04, 2026 at 02:19:48PM +0100, Martin Wilck wrote:
> > > > Hi Stefan,
> > > >
> > > > On Tue, 2026-02-03 at 13:04 -0500, Stefan Hajnoczi wrote:
> > [ .. ]
> > > > > It can be generic. The messages will contain the block device
> > > > > major:minor as well as information to describe <linux/pr.h>
> > > > > requests.
> > > >
> > > > So the ioctls will pass through qemu into the kernel, to be
> > > > intercepted by the dm-mpath driver, which will use an upcall to
> > > > have them handled by mpathpersistd (for the actual command) and
> > > > multipathd (for the path registrations).
> > > >
> > > > I don't fully understand the advantage, security and
> > > > complexity-wise, of this concept, compared to intercepting them
> > > > in qemu and using a socket to talk to mpathpersistd directly. If
> > > > we did this, we could even support both generic and SCSI PR
> > > > commands.
> > >
> > > Hi Martin,
> > > The simplification and security benefits are on the application
> > > side, not on the DM-Multipath side, so I can see what you're
> > > getting at. From the DM-Multipath perspective things get a little
> > > more complex.
> > >
> > > From an application perspective, a single API that works across
> > > block device types (SCSI, NVMe, DM-Multipath) and requires no
> > > privileges or sockets (they are a pain in container environments)
> > > is the most convenient. The <linux/pr.h> ioctl API offers exactly
> > > this.
> > >
> > > Unfortunately, DM-Multipath currently does not fully support
> > > <linux/pr.h>. It sends PR operations down each path, but that is
> > > only a subset of libmpathpersist's logic and multipathd is not
> > > kept in sync.
> > >
> > > My impression is that libmpathpersist and multipathd logic cannot
> > > be easily moved into the kernel. This is where the upcall idea
> > > comes from. Let's notify multipath-tools from DM-Multipath so it
> > > can do its work in userspace.
> > >
> > It _might_ be possible by extending the current path-switching
> > code in the kernel to keep track of PRs. Then we could move the
> > registration upon path switching, and (ideally) could do away
> > with upcalls.
> > Not sure, though, how targets react when having to deal with a
> > flood of PR commands ...
> > But maybe worth a try.
>
> Making a multipath device pretend to be a single Persistently
> Reservable device involves a lot of ugly workarounds that I'm not
> really excited to see in the kernel.
>
> For instance, every time a new path appears or a path that was down
> when the device was registered comes up, multipath needs to register
> that path. But a preempt could come in while it is doing this (or
> indeed any time after multipath registered the other paths). So it has
> to check that the registrations are still there on the other paths
> before registering the new path, and then check again afterwards to
> make sure that there wasn't a preempt during the registration.
>
> Worse, you can't release a reservation from a path that is down. If
> multipath needs to release its reservation, and the path that is
> holding it is down, the only solution I could come up with is to
> suspend the device so no IO happens. Preempt the reservation to move
> it to an active path, which wipes the registrations off all the other
> paths. Then reregister all the active paths again, and unsuspend the
> device. The failed paths will get reregistered as they come back up.
>
> And there are more cases like these. They are, of course, just as
> doable in the kernel as in userspace, but it's a lot of persistent
> reservation code to put into the multipath target.
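
To make the first corner case quoted above concrete (registering a path
that shows up later), here is a minimal sketch in Python, chosen only
for brevity. FakeTarget, register_new_path and race_hook are made-up
names for illustration; they are not multipath-tools or kernel APIs.
Rolling back the new registration when the second check fails is an
assumption: the thread does not say what should happen in that case.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Optional


@dataclass
class FakeTarget:
    """Toy stand-in for a storage target: path name -> registered key."""
    registrations: Dict[str, int] = field(default_factory=dict)

    def register(self, path: str, key: int) -> None:
        self.registrations[path] = key

    def unregister(self, path: str) -> None:
        self.registrations.pop(path, None)

    def preempt(self, key: int) -> None:
        # Models another host's preempt: every registration made with
        # `key` disappears, on all paths.
        self.registrations = {
            p: k for p, k in self.registrations.items() if k != key
        }

    def still_registered(self, paths: Iterable[str], key: int) -> bool:
        return all(self.registrations.get(p) == key for p in paths)


def register_new_path(
    target: FakeTarget,
    existing_paths: Iterable[str],
    new_path: str,
    key: int,
    race_hook: Optional[Callable[[], None]] = None,
) -> bool:
    """Check, register, check again. True if new_path stays registered."""
    existing = list(existing_paths)

    # Check 1: if we were already preempted, do not sneak back in.
    if not target.still_registered(existing, key):
        return False

    # Hypothetical test hook: lets a caller simulate a preempt that
    # lands between the first check and the registration.
    if race_hook is not None:
        race_hook()

    target.register(new_path, key)

    # Check 2: a preempt may have raced with the registration.
    if not target.still_registered(existing, key):
        # Assumption (not stated in the thread): undo the registration.
        target.unregister(new_path)
        return False
    return True
```

Without a racing preempt the new path simply ends up registered. If
race_hook preempts the key after the first check, the second check
notices that the old paths lost their registrations and the new one is
removed again.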

Having reviewed (or tried to do so) Ben's code for handling the various
corner cases, I agree that we don't want to start all over with this.

Martin
