On Fri, Jul 12, 2013 at 5:00 PM, Daniel P. Berrange <[email protected]> wrote:
> On Fri, Jul 12, 2013 at 02:51:10PM +0100, Daniel P. Berrange wrote:
>> We're hitting a problem in libvirt where 'udevadm settle' will get stuck
>> in a loop until it eventually times out. Eventually we realized this
>> happens when we have any LXC containers active with veth devices in a
>> separate network namespace.
>
> Incidentally, I recall reading something by (iirc) Lennart saying that
> apps really should use 'udevadm settle' at all.

You mean *not*, I guess.

There are still valid uses of settle for command line tools, and that
will likely remain valid in the future too. There is no simple
replacement for this barrier that simple command line tools could
implement. Making them subscribe to hotplug events would be asking too
much in quite a few cases.

No advanced subsystem or service, though, should rely on or be modelled
around settle and make assumptions about "everything is there now";
such tools should subscribe to udev events and after that enumerate the
current devices. Things that pull in settle at bootup are kind of
broken; that is the aspect of settle you heard Lennart rightfully
complaining about, I guess.

> Libvirt uses it in a
> couple of places, all related to code which obtains lists of storage
> devices

Which makes sense according to the current state of affairs. Storage
tools are only slowly catching up with the reality of devices coming and
going all the time on today's systems. They get fixed, and things look
at least better today than they did, but settle is still needed for
some operations.

>  - After adding a disk partition in parted, we use it to wait for
>    the /dev/sdXXNNN device nodes to all show up

Primary device node creation (not symlinks) has been synchronous for a
couple of years. devtmpfs does that for us. The ioctls to add a
partition table entry or re-read the partition table will not return
until devtmpfs has created the device nodes. The udev symlinks, though,
might only be available after a settle call.

>  - After logging into an iscsi target with iscsiadm, we use it to
>    wait for all the /dev/sdXXX devices nodes associated with the
>    iSCSI target to appear.
>
>  - After triggering a SCSI HBA rescan via sysfs, we use it to wait
>    for all the /dev/sdXXX devices nodes associated with the SCI HBA
>    to appear
>
>  - After creating an NPIV virtual HBA via sysfs, we use it to wait
>    for all the /dev/sdXXX devices nodes associated with the vHBA
>    to appear

As said, this should all be covered on more recent systems.

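
To make the primary-node versus symlink distinction above concrete, here is a minimal sketch of how a command line wrapper might use settle as a bounded barrier. It is not libvirt's code. The helper names `settle_command` and `wait_for_path` and the example path are made up for illustration; the only external pieces assumed are the `--timeout` and `--exit-if-exists` options of `udevadm settle`.

```python
import os
import subprocess
from typing import List, Optional


def settle_command(timeout_s: int = 30,
                   exit_if_exists: Optional[str] = None) -> List[str]:
    """Build the argv for a bounded 'udevadm settle' call (hypothetical helper)."""
    cmd = ["udevadm", "settle", f"--timeout={timeout_s}"]
    if exit_if_exists is not None:
        # Lets settle return early as soon as the given path shows up.
        cmd.append(f"--exit-if-exists={exit_if_exists}")
    return cmd


def wait_for_path(path: str, timeout_s: int = 30) -> bool:
    """Return True if `path` exists, calling settle only when it does not yet."""
    # Primary nodes created by devtmpfs are normally present already,
    # so the common case needs no barrier at all.
    if os.path.exists(path):
        return True
    try:
        # Fallback barrier, mostly relevant for udev-managed symlinks.
        subprocess.run(settle_command(timeout_s, exit_if_exists=path),
                       check=False)
    except FileNotFoundError:
        # udevadm is not installed; fall through to the final check.
        pass
    return os.path.exists(path)


if __name__ == "__main__":
    # Hypothetical symlink name, used only for illustration.
    print(wait_for_path("/dev/disk/by-label/example", timeout_s=10))
```

Checking the path first keeps settle out of the common case, and the explicit timeout bounds the wait if the udev queue never drains, as in the stuck-loop situation described at the top of the thread.
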
>  - After activating an LVM volume group, we use it to wait for all
>    the /dev/VGNAME/XXXX device nodes to appear
>
>  - After deleting an LVM volume we use it to wait for the device
>    node to be removed
>
>  - After adding an LVM volume we use it to wait for the device
>    node to be added

LVM is a story of its own; it's pretty complex, and it is slowly getting
fixed over time. With the very recent changes it might integrate more
nicely now. I guess there are still situations, though, where settle is
needed and is the simplest solution.

Again, all of that applies only to the command line tools, not to
bootup-related services or full-blown storage management services. It
is not OK for them to rely on settle.

> You can see a pattern there - after doing some action related to
> storage, we need to synchronize wrt the creation/deletion of device
> nodes in /dev, otherwise we miss out LUNs when we scan for the list
> of device nodes associated with a HBA/VolGroup/etc. Any suggestions
> for alternative techniques / approaches here ?

I think it's fine, and needed, for libvirt to use settle, at least as
long as it calls the command line tools. There is no generally available
storage interface on Linux which would solve all these problems for
libvirt, and I don't think you should declare these problems as libvirt
problems. Using settle to get a barrier for the tools you need to use,
which themselves cannot handle async setup and hotplug, sounds fine to
me.

Many of the issues, though, might already be history with devtmpfs, at
least when the primary nodes (and not the symlinks) are used.

Kay
_______________________________________________
systemd-devel mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/systemd-devel
