Servers with two NVMe controllers. The second one disappears (with a kernel trace).
> cat /proc/version
Linux version 4.4.0-47-generic (buildd@lcy01-03) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2) ) #68-Ubuntu SMP Wed Oct 26 19:39:52 UTC 2016

After upgrading the kernel, my ZFS pool became DEGRADED:

> zpool status
  pool: zp0
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://zfsonlinux.org/msg/ZFS-8000-4J
  scan: none requested
config:

        NAME                     STATE     READ WRITE CKSUM
        zp0                      DEGRADED     0     0     0
          mirror-0               DEGRADED     0     0     0
            nvme0n1              ONLINE       0     0     0
            9486952355712335023  UNAVAIL      0     0     0  was /dev/nvme1n1

Only ONE controller is listed:

> nvme list
Node             SN                   Model                                    Version  Namespace Usage                      Format           FW Rev
---------------- -------------------- ---------------------------------------- -------- --------- -------------------------- ---------------- --------
/dev/nvme0n1     CVMD4391006B800GGN   INTEL SSDPE2ME800G4                      1.0      1         800,17 GB / 800,17 GB      512 B + 0 B      8DV10102

The bug isn't fixed for me.

[ 68.950042] nvme 0000:82:00.0: I/O 0 QID 0 timeout, disable controller
[ 69.054149] nvme 0000:82:00.0: Cancelling I/O 0 QID 0
[ 69.054182] nvme 0000:82:00.0: Identify Controller failed (-4)
[ 69.060132] nvme 0000:82:00.0: Removing after probe failure
[ 69.060284] iounmap: bad address ffffc9000cf34000
[ 69.065020] CPU: 14 PID: 247 Comm: kworker/14:1 Tainted: P OE 4.4.0-47-generic #68-Ubuntu
[ 69.065034] Hardware name: Supermicro SYS-F618R2-RC1+/X10DRFR-N, BIOS 2.0 01/27/2016
[ 69.065040] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
[ 69.065050] 0000000000000286 00000000e10d6171 ffff8820340efce0 ffffffff813f5aa3
[ 69.065052] ffff88203454b4f0 ffffc9000cf34000 ffff8820340efd00 ffffffff8106bdff
[ 69.065054] ffff88203454b4f0 ffff88203454b658 ffff8820340efd10 ffffffff8106be3c
[ 69.065056] Call Trace:
[ 69.065068] [<ffffffff813f5aa3>] dump_stack+0x63/0x90
[ 69.065089] [<ffffffff8106bdff>] iounmap.part.1+0x7f/0x90
[ 69.065093] [<ffffffff8106be3c>] iounmap+0x2c/0x30
[ 69.065097] [<ffffffffc01c364a>] nvme_dev_unmap.isra.35+0x1a/0x30 [nvme]
[ 69.065099] [<ffffffffc01c475e>] nvme_remove+0xce/0xe0 [nvme]
[ 69.065108] [<ffffffff81447009>] pci_device_remove+0x39/0xc0
[ 69.065117] [<ffffffff815585e1>] __device_release_driver+0xa1/0x150
[ 69.065119] [<ffffffff815586b3>] device_release_driver+0x23/0x30
[ 69.065123] [<ffffffff8143fa7a>] pci_stop_bus_device+0x8a/0xa0
[ 69.065125] [<ffffffff8143fbca>] pci_stop_and_remove_bus_device_locked+0x1a/0x30
[ 69.065129] [<ffffffffc01c309c>] nvme_remove_dead_ctrl_work+0x3c/0x50 [nvme]
[ 69.065136] [<ffffffff8109a4a5>] process_one_work+0x165/0x480
[ 69.065138] [<ffffffff8109a80b>] worker_thread+0x4b/0x4c0
[ 69.065141] [<ffffffff8109a7c0>] ? process_one_work+0x480/0x480
[ 69.065143] [<ffffffff8109a7c0>] ? process_one_work+0x480/0x480
[ 69.065147] [<ffffffff810a09e8>] kthread+0xd8/0xf0
[ 69.065150] [<ffffffff810a0910>] ? kthread_create_on_node+0x1e0/0x1e0
[ 69.065157] [<ffffffff8183538f>] ret_from_fork+0x3f/0x70
[ 69.065158] [<ffffffff810a0910>] ? kthread_create_on_node+0x1e0/0x1e0
[ 69.065161] Trying to free nonexistent resource <00000000fbd10000-00000000fbd13fff>

--
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1626894

Title:
  nvme drive probe failure

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Xenial:
  Fix Released
Status in linux source package in Yakkety:
  Fix Released

Bug description:
  After upgrading from linux-image-4.4.0-38-generic to proposed update
  linux-image-4.4.0-39-generic, NVMe drives are no longer working.
  dmesg shows a probe failure. On the previous kernel version
  everything is working as expected.

  ----------------->%-----------------
  [ 1.005243] Hardware name: FUJITSU D3417-B1/D3417-B1, BIOS V5.0.0.11 R1.12.0.SR.2 for D3417-B1x 04/01/2016
  [ 1.005349] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
  [ 1.005484] 0000000000000286 00000000b6c91251 ffff880fe6e8bce0 ffffffff813f1f83
  [ 1.005800] ffff880fe02150f0 ffffc90006a7c000 ffff880fe6e8bd00 ffffffff8106bdff
  [ 1.006117] ffff880fe02150f0 ffff880fe0215258 ffff880fe6e8bd10 ffffffff8106be3c
  [ 1.006433] Call Trace:
  [ 1.006509] [<ffffffff813f1f83>] dump_stack+0x63/0x90
  [ 1.006589] [<ffffffff8106bdff>] iounmap.part.1+0x7f/0x90
  [ 1.006668] [<ffffffff8106be3c>] iounmap+0x2c/0x30
  [ 1.006770] [<ffffffffc007a64a>] nvme_dev_unmap.isra.35+0x1a/0x30 [nvme]
  [ 1.007048] [<ffffffffc007b75e>] nvme_remove+0xce/0xe0 [nvme]
  [ 1.007140] [<ffffffff81443409>] pci_device_remove+0x39/0xc0
  [ 1.007220] [<ffffffff815549f1>] __device_release_driver+0xa1/0x150
  [ 1.007301] [<ffffffff81554ac3>] device_release_driver+0x23/0x30
  [ 1.007382] [<ffffffff8143be7a>] pci_stop_bus_device+0x8a/0xa0
  [ 1.007462] [<ffffffff8143bfca>] pci_stop_and_remove_bus_device_locked+0x1a/0x30
  [ 1.007559] [<ffffffffc007a09c>] nvme_remove_dead_ctrl_work+0x3c/0x50 [nvme]
  [ 1.007642] [<ffffffff8109a3e5>] process_one_work+0x165/0x480
  [ 1.007722] [<ffffffff8109a74b>] worker_thread+0x4b/0x4c0
  [ 1.007801] [<ffffffff8109a700>] ? process_one_work+0x480/0x480
  [ 1.007881] [<ffffffff810a0928>] kthread+0xd8/0xf0
  [ 1.007959] [<ffffffff810a0850>] ? kthread_create_on_node+0x1e0/0x1e0
  [ 1.008041] [<ffffffff81831a8f>] ret_from_fork+0x3f/0x70
  [ 1.008120] [<ffffffff810a0850>] ? kthread_create_on_node+0x1e0/0x1e0
  [ 1.008222] Trying to free nonexistent resource <00000000f7100000-00000000f7103fff>
  [ 1.008276] genirq: Flags mismatch irq 0. 00000080 (nvme1q0) vs. 00015a00 (timer)
  [ 1.008281] Trying to free nonexistent resource <000000000000d000-000000000000d0ff>
  [ 1.008282] nvme 0000:02:00.0: Removing after probe failure
  [ 1.008645] Trying to free nonexistent resource <000000000000e000-000000000000e0ff>
  [ 1.027213] iounmap: bad address ffffc90006ae0000
  [ 1.027456] CPU: 2 PID: 86 Comm: kworker/2:1 Not tainted 4.4.0-39-generic #59-Ubuntu
  -----------------%<-----------------
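
For anyone checking whether a machine is hit by the same failure, the sketch below scans saved kernel log text for the "Removing after probe failure" message that appears in both traces above and returns the PCI addresses of the affected controllers. The function name `failed_nvme_devices` and the regular expression are illustrative assumptions, not part of any existing tool, and the match relies only on the message wording seen in these 4.4.0 logs.

```python
import re

# Matches kernel log lines such as:
#   [ 69.060132] nvme 0000:82:00.0: Removing after probe failure
# The wording is taken from the logs in this report; other kernel
# versions may phrase the message differently (assumption).
PROBE_FAILURE = re.compile(
    r"nvme (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"Removing after probe failure"
)


def failed_nvme_devices(dmesg_text):
    """Return PCI addresses of NVMe controllers removed after a probe failure.

    Addresses are returned in order of first appearance, without duplicates.
    """
    seen = []
    for match in PROBE_FAILURE.finditer(dmesg_text):
        pci = match.group("pci")
        if pci not in seen:
            seen.append(pci)
    return seen
```

One way to use it is to save the output of `dmesg` to a file and pass the file contents to `failed_nvme_devices`; an empty list means the message was not found in that log.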