Cool, I'll try to clean things up and submit a PR (probably one for pre-Infernalis releases and one for Infernalis, assuming it's still broken there).
I'm 99% certain I ruled out any udev silliness by throwing a bunch of settle calls into every place I thought might make some theoretical sense. No number of those calls ever resulted in a properly functioning partprobe call.

On Tue, Oct 13, 2015 at 3:31 PM, Loic Dachary <[email protected]> wrote:
> Hi,
>
> On 14/10/2015 00:02, Jeremy Hanmer wrote:
>> I think I've found a bug in ceph-disk when running on Ubuntu 14.04
>> (and I believe 12.04 as well, but haven't confirmed) and using
>> --dmcrypt.
>>
>> The problem is that when update_partition() is called, partprobe is
>> used to re-read the partition table (as opposed to partx on all other
>> distros) and it appears that it isn't smart/thorough enough to update
>> all of the device's metadata. Specifically, ID_PART_ENTRY_TYPE isn't
>> updated:
>>
>> root@ceph-osd03:~# udevadm info --query=env --name=/dev/vdd1 | grep ID_PART_ENTRY_TYPE
>> ID_PART_ENTRY_TYPE=89c57f98-2fe5-4dc0-89c1-5ec00ceff2be
>>
>> running `partx -u` rather than `partprobe` does the appropriate thing:
>>
>> root@ceph-osd03:~# partx -u /dev/vdd1
>> root@ceph-osd03:~# udevadm info --query=env --name=/dev/vdd1 | grep ID_PART_ENTRY_TYPE
>> ID_PART_ENTRY_TYPE=4fbd7e29-9d25-41b8-afd0-5ec00ceff05d
>>
>>
>> I have an experimental patch here that Works For Me, but Sage wanted
>> me to ping the list for input:
>>
>> https://github.com/fzylogic/ceph/commit/8c83f75392d68fbec7def8aa61f20b2c9c237571
>>
>>
>> I also want to test the new Infernalis code for this same bug (after a
>> cursory check, I strongly suspect it's there as well), but it'll take
>> a little bit to get another test cluster up to confirm.
>
> There have been many changes in infernalis, most of them to make it more
> robust. It would be great if you could try to reproduce the problem you had
> with infernalis.
>
> Your patch looks good and you could also remove
> https://github.com/fzylogic/ceph/blob/8c83f75392d68fbec7def8aa61f20b2c9c237571/src/ceph-disk#L1505
> which will happen immediately after the function returns.
>
> An alternate fix would be to udevadm settle before
> https://github.com/fzylogic/ceph/blob/8c83f75392d68fbec7def8aa61f20b2c9c237571/src/ceph-disk#L985
> and after it to avoid races. I think the reason why partprobe does not
> appear to work is because it triggers udev events that race with udev events
> triggered by sgdisk while creating the partition.
>
> Cheers
>
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
> --
> Loïc Dachary, Artisan Logiciel Libre
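
For anyone following along, here is a rough sketch of the two ideas discussed in this thread combined: re-read the partition table with `partx -u` instead of `partprobe`, and wrap the call in `udevadm settle` on both sides so pending udev events (for example those triggered by sgdisk) are drained first. This is not the code from the linked commit or from ceph-disk itself; the function names `build_reread_commands` and `reread_partition_table` and the `use_partx` parameter are made up for illustration, and only the three command lines come from the discussion above.

```python
import subprocess


def build_reread_commands(dev, use_partx=True):
    """Return the command sequence: settle, re-read partition table, settle.

    Hypothetical helper for illustration only, not part of ceph-disk.
    """
    # partx -u updated ID_PART_ENTRY_TYPE in the test above; partprobe did not.
    reread = ['partx', '-u', dev] if use_partx else ['partprobe', dev]
    return [
        ['udevadm', 'settle'],  # drain udev events queued before the re-read
        reread,
        ['udevadm', 'settle'],  # wait for events caused by the re-read itself
    ]


def reread_partition_table(dev, use_partx=True, run=subprocess.check_call):
    """Run each command in order; `run` is injectable so this can be dry-run."""
    for cmd in build_reread_commands(dev, use_partx):
        run(cmd)
```

Passing `run=print` gives a dry run that only prints the commands, which may be handy for checking the ordering before touching a real device. Whether the extra settle calls are still needed once partx is used is an open question; in my testing they were not enough to make partprobe behave.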
