Hi Joshua, I'll check this and give you feedback. Thanks!
On Wed, 22 Mar 2017 at 09:31, Hua Zhang <joshua.zh...@canonical.com> wrote:
> @Gustavo,
>
> Liang's comment in patch set 3 [1] explains your problem. He said:
>
>   The dev can disappear momentarily right after 'multipath -r "dev"'.
>   So it doesn't happen for every single path. If it did, it would cause
>   a lot more issues. The multipath dev removal path reloads the dev
>   near the beginning of the operation (_rescan_multipath). Thus the
>   "stat" here can fail if it is executed before the dev node has been
>   re-created.
>
> 1. 'multipath -r' in _rescan_multipath() [10] can make the multipath
>    dev disappear momentarily, due to the bug [9].
>
> 2. _get_multipath_device_name() [2] uses the 'multipath -ll' command to
>    find the multipath device name [3], so we saw:
>
>    Jan 25 09:24:40 Lock "connect_volume" acquired by
>    "os_brick.initiator.connector.disconnect_volume" :: waited 0.000s
>    ...
>    Jan 25 09:24:40 multipath ['-ll', u'/dev/sdr']:
>    stdout=360080e5000297ea40000050658885f45 dm-6 NETAPP,INF-01-00#012
>
> 3. _linuxscsi.remove_multipath_device() is then invoked [4], so we saw:
>
>    Jan 25 09:24:40 remove multipath device /dev/sdr
>
> 4. find_multipath_device() is then invoked [5], which runs
>    'multipath -l' [6].
>
> 5. The "stat" right after 'multipath -r' [7] can fail if it is executed
>    before the dev node has been re-created, so we saw:
>
>    Jan 25 09:24:40 Couldn't find multipath device
>    /dev/mapper/360080e5000297ea40000050658885f45
>
> So the fix [1] was trying to fix this problem, but it was abandoned
> later because we already have the fix [8]; that is also why I am trying
> to backport it.
>
> FYI, the root cause of your problem is a bug in multipath-tools [9], so
> you can also fix the problem by upgrading multipath-tools.
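As an aside: the "stat" race described above could in principle be tolerated with a short retry loop around os.stat(). A minimal, hypothetical sketch (wait_for_dev_node is my own name, not os-brick code; the real fix [8] removes the 'multipath -r' call instead):

```python
import errno
import os
import time


def wait_for_dev_node(path, retries=10, interval=0.5):
    """Stat 'path', tolerating the brief window in which 'multipath -r'
    has removed /dev/mapper/<wwid> but udev has not yet re-created it.

    Returns the os.stat() result, or re-raises OSError if the node has
    not reappeared after 'retries' attempts.
    """
    for attempt in range(retries):
        try:
            return os.stat(path)
        except OSError as exc:
            # Only swallow "No such file or directory", and only while
            # attempts remain; any other error propagates immediately.
            if exc.errno != errno.ENOENT or attempt == retries - 1:
                raise
            time.sleep(interval)  # give udev time to re-create the node
```

This only papers over the window in which the dev node is missing; upgrading multipath-tools [9] or taking the backport remains the proper fix.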
> [1] https://review.openstack.org/#/c/366065
> [2] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L925
> [3] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L1200
> [4] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L935
> [5] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L124
> [6] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L263
> [7] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/linuxscsi.py#L288
> [8] https://review.openstack.org/#/c/374421/
> [9] https://bugs.launchpad.net/ubuntu/+source/multipath-tools/+bug/1621340
> [10] https://github.com/openstack/os-brick/blob/stable/mitaka/os_brick/initiator/connector.py#L918
>
> --
> You received this bug notification because you are subscribed to the
> bug report.
> https://bugs.launchpad.net/bugs/1623700
>
> Title:
>   [SRU] multipath iscsi does not logout of sessions on xenial
>
> Status in Ubuntu Cloud Archive:
>   Fix Released
> Status in Ubuntu Cloud Archive mitaka series:
>   Triaged
> Status in Ubuntu Cloud Archive newton series:
>   Triaged
> Status in os-brick:
>   Fix Released
> Status in python-os-brick package in Ubuntu:
>   Fix Released
> Status in python-os-brick source package in Xenial:
>   In Progress
> Status in python-os-brick source package in Yakkety:
>   Triaged
>
> Bug description:
>   [Impact]
>
>   * The reload (multipath -r) in _rescan_multipath can cause
>     /dev/mapper/<wwid> to be deleted and re-created (bug #1621340
>     tracks this problem), which causes many more downstream OpenStack
>     issues. For example, os.stat(mdev) called by
>     _discover_mpath_device() right in that window will fail to find the
>     file, and when detaching a volume the iscsi sessions are not logged
>     out.
>     This leaves behind a mpath device and the iscsi /dev/disk/by-path
>     devices as broken luns. So we should stop calling multipath -r when
>     attaching/detaching iSCSI volumes; multipath will load devices on
>     its own.
>
>   [Test Case]
>
>   * Enable the iSCSI driver and cinder/nova multipath
>   * Detach an iSCSI volume
>   * Check that the devices/symlinks do not get messed up as shown below
>
>   [Regression Potential]
>
>   * None
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ nova volume-attach 6e1017a7-6dea-418f-ad9b-879da085bd13 d1d68e04-a217-44ea-bb74-65e0de73e5f8
>   +----------+--------------------------------------+
>   | Property | Value                                |
>   +----------+--------------------------------------+
>   | device   | /dev/vdb                             |
>   | id       | d1d68e04-a217-44ea-bb74-65e0de73e5f8 |
>   | serverId | 6e1017a7-6dea-418f-ad9b-879da085bd13 |
>   | volumeId | d1d68e04-a217-44ea-bb74-65e0de73e5f8 |
>   +----------+--------------------------------------+
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ cinder list
>   +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
>   | ID                                   | Status | Name | Size | Volume Type | Bootable | Attached to                          |
>   +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
>   | d1d68e04-a217-44ea-bb74-65e0de73e5f8 | in-use | -    | 1    | pure-iscsi  | false    | 6e1017a7-6dea-418f-ad9b-879da085bd13 |
>   +--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ nova list
>   +--------------------------------------+------+--------+------------+-------------+---------------------------------+
>   | ID                                   | Name | Status | Task State | Power State | Networks                        |
>   +--------------------------------------+------+--------+------------+-------------+---------------------------------+
>   | 6e1017a7-6dea-418f-ad9b-879da085bd13 | test | ACTIVE | -          | Running     | public=172.24.4.12, 2001:db8::b |
>   +--------------------------------------+------+--------+------------+-------------+---------------------------------+
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m session
>   tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [6] 10.0.5.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [7] 10.0.1.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [8] 10.0.5.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   stack@xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m node
>   10.0.1.11:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>   10.0.5.11:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>   10.0.5.10:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>   10.0.1.10:3260,-1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ sudo tail -f /var/log/syslog
>   Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get udev uid: Invalid argument
>   Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sysfs uid: Invalid argument
>   Sep 14 22:33:14 xenial-qemu-tester multipath: dm-0: failed to get sgio uid: No such file or directory
>   Sep 14 22:33:14 xenial-qemu-tester systemd[1347]: dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
>   Sep 14 22:33:14 xenial-qemu-tester systemd[1347]: dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
>   Sep 14 22:33:14 xenial-qemu-tester systemd[1]: dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-scsi\x2d3624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
>   Sep 14 22:33:14 xenial-qemu-tester systemd[1]: dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device: Dev dev-disk-by\x2did-wwn\x2d0x624a93709a738ed78583fd12003fb774.device appeared twice with different sysfs paths /sys/devices/platform/host6/session5/target6:0:0/6:0:0:1/block/sda and /sys/devices/virtual/block/dm-0
>   Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.163521] audit: type=1400 audit(1473892394.556:21): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13" pid=32665 comm="apparmor_parser"
>   Sep 14 22:33:14 xenial-qemu-tester kernel: [22362.173614] audit: type=1400 audit(1473892394.568:22): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-6e1017a7-6dea-418f-ad9b-879da085bd13//qemu_bridge_helper" pid=32665 comm="apparmor_parser"
>   Sep 14 22:33:14 xenial-qemu-tester iscsid: Connection8:0 to [target: iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873, portal: 10.0.5.11,3260] through [iface: default] is operational now
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ nova volume-detach 6e1017a7-6dea-418f-ad9b-879da085bd13 d1d68e04-a217-44ea-bb74-65e0de73e5f8
>   stack@xenial-devstack-master-master-20160914-092014:~$ sudo iscsiadm -m session
>   tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [6] 10.0.5.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [7] 10.0.1.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [8] 10.0.5.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ cinder list
>   +--------------------------------------+-----------+------+------+-------------+----------+-------------+
>   | ID                                   | Status    | Name | Size | Volume Type | Bootable | Attached to |
>   +--------------------------------------+-----------+------+------+-------------+----------+-------------+
>   | d1d68e04-a217-44ea-bb74-65e0de73e5f8 | available | -    | 1    | pure-iscsi  | false    |             |
>   +--------------------------------------+-----------+------+------+-------------+----------+-------------+
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ iscsiadm -m session
>   tcp: [5] 10.0.1.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [6] 10.0.5.10:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [7] 10.0.1.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>   tcp: [8] 10.0.5.11:3260,1 iqn.2010-06.com.purestorage:flasharray.3adbe40b49bac873 (non-flash)
>
>   stack@xenial-devstack-master-master-20160914-092014:~$ sudo tail -f /var/log/syslog
>   Sep 14 22:48:10 xenial-qemu-tester kernel: [23257.736455] connection6:0: detected conn error (1020)
>   Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742036] connection5:0: detected conn error (1020)
>   Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742066] connection7:0: detected conn error (1020)
>   Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742139] connection8:0: detected conn error (1020)
>   Sep 14 22:48:13 xenial-qemu-tester kernel: [23260.742156] connection6:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747638] connection5:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747666] connection7:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747710] connection8:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester kernel: [23263.747737] connection6:0: detected conn error (1020)
>   Sep 14 22:48:16 xenial-qemu-tester iscsid: message repeated 67 times: [ conn 0 login rejected: initiator failed authorization with target]
>   Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.753999] connection6:0: detected conn error (1020)
>   Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754019] connection8:0: detected conn error (1020)
>   Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754105] connection5:0: detected conn error (1020)
>   Sep 14 22:48:19 xenial-qemu-tester kernel: [23266.754146] connection7:0: detected conn error (1020)
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/cloud-archive/+bug/1623700/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
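P.S. The "check that devices/symlinks do not get messed up" step in the [Test Case] can be partially automated. A minimal, hypothetical Python helper (broken_symlinks is my own name, not part of os-brick or the SRU) that reports dangling entries under a directory such as /dev/disk/by-path after a detach:

```python
import os


def broken_symlinks(directory):
    """Return paths of broken symlinks under 'directory'.

    After a clean volume detach, /dev/disk/by-path should contain no
    broken links; leftovers indicate the stale LUN entries this bug
    leaves behind.
    """
    broken = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # islink() True but exists() False => the link's target is
        # gone, i.e. a dangling by-path entry.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append(path)
    return broken


# Example usage on a real node (path assumed from the transcript above):
# print(broken_symlinks("/dev/disk/by-path"))
```

On a healthy node this returns an empty list for /dev/disk/by-path; with this bug, the detached LUN's by-path links remain and show up as broken.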