[Bug 1730746] Re: Processes hang on attempted access of WDC WD30-EZRX 3TB HDD

Bug Watch Updater Tue, 08 Oct 2019 08:52:19 -0700

Launchpad has imported 25 comments from the remote bug at
https://bugzilla.kernel.org/show_bug.cgi?id=197875.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2017-11-14T22:06:16+00:00 chuck.burt+kernel.org wrote:

When booted under kernel 4.13.x, processes that touch the disks (such as
parted, boot-info, gparted, etc.) always hang when a Western Digital
Green WD30-EZRX 3TB HDD is attached. The drive works as expected under
kernel 4.10.x, with everything else about the system unchanged and only
the booted kernel different.

Full specs of the machine, if useful:
https://www.support.hp.com/id-en/document/c03277050

More details on Ubuntu's LaunchPad (where they asked me to come here to
file an upstream bug):
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/6

------------------------------------------------------------------------
On 2017-11-14T22:26:27+00:00 bvanassche wrote:

Please provide the output of the following command after having
reproduced the hang:

    dmesg -c >/dev/null; echo w > /proc/sysrq-trigger; dmesg
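
The `w` SysRq key dumps tasks in uninterruptible (blocked) state to the
kernel log. As a rough user-space complement, which was not requested in
this thread, the same tasks can be listed with ps. This is only a sketch:
`blocked_tasks` is a hypothetical helper name, and it assumes a ps that
accepts `-eo pid=,stat=,comm=`.

```shell
# Filter "PID STAT COMMAND" lines, as printed by `ps -eo pid=,stat=,comm=`,
# down to tasks whose state starts with "D" (uninterruptible sleep).
# blocked_tasks is a hypothetical name used only for this illustration.
blocked_tasks() {
    awk '$2 ~ /^D/ { print }'
}
```

Possible usage: `ps -eo pid=,stat=,comm= | blocked_tasks`. Unlike the
SysRq output, this does not show kernel stack traces.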

Additionally, if you know how to build the kernel yourself, it would be
helpful if you could bisect this issue. Documentation is available e.g.
at https://git-scm.com/docs/git-bisect.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/8

------------------------------------------------------------------------
On 2017-11-29T14:07:59+00:00 chuck.burt+kernel.org wrote:

Created attachment 260925
Output of requested command

I reproduced the hang and ran the command as requested.  See attached
file output-20171129.txt

Building the kernel is something I could attempt tackling, but as a
newbie I'm highly likely to mess something up.  Either way, it will be a
few weeks before I can get to it (best case).  So I _really_ hope this
provides the clue needed!

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/9

------------------------------------------------------------------------
On 2017-11-29T14:14:44+00:00 chuck.burt+kernel.org wrote:

Created attachment 260927
Output of requested command as su

After reading the first few lines of the last attachment, it occurred to
me that running this command as su might be useful.  See attached.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/10

------------------------------------------------------------------------
On 2017-11-29T14:34:15+00:00 chuck.burt+kernel.org wrote:

Created attachment 260929
Output of requested command as su

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/11

------------------------------------------------------------------------
On 2017-11-29T16:36:47+00:00 bvanassche wrote:

So command processing got stuck. Since there are two code paths in
recent kernels we need to know whether or not scsi-mq was used. Hence
please provide the output of the following command:

    for d in /sys/block/*; do sfx=""; [ -e "$d/mq" ] && sfx=" [mq]"; echo "$d$sfx"; done

If the above command reports that scsi-mq is being used for the WDC
disk, please check whether the following command resolves the lockup:

    for d in /sys/kernel/debug/block/*/state; do echo kick >"$d"; done
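
The first check above can also be wrapped in a small function. This is a
minimal sketch of the same test (does the device directory contain an
`mq` entry?). The function name and the optional directory argument are
assumptions added here so that it can be tried against a scratch
directory instead of the real /sys/block.

```shell
# Print every entry under a sysfs-style block directory, appending " [mq]"
# to those that contain an "mq" entry (i.e. that use the blk-mq path).
# list_mq_devices and its optional argument are hypothetical; with no
# argument it inspects /sys/block, like the one-liner in this thread.
list_mq_devices() {
    root="${1:-/sys/block}"
    for d in "$root"/*; do
        [ -e "$d" ] || continue   # skip the literal glob if nothing matched
        sfx=""
        [ -e "$d/mq" ] && sfx=" [mq]"
        echo "$d$sfx"
    done
}
```

Calling `list_mq_devices` with no argument should print the same kind of
list as the one-liner; a disk without the ` [mq]` suffix is on the legacy
block layer code path.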

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/12

------------------------------------------------------------------------
On 2017-11-29T20:01:10+00:00 chuck.burt+kernel.org wrote:

> # for d in /sys/block/*; do sfx=""; [ -e "$d/mq" ] && sfx=" [mq]"; echo "$d$sfx"; done
> /sys/block/loop0 [mq]
> /sys/block/loop1 [mq]
> /sys/block/loop2 [mq]
> /sys/block/loop3 [mq]
> /sys/block/loop4 [mq]
> /sys/block/loop5 [mq]
> /sys/block/loop6 [mq]
> /sys/block/loop7 [mq]
> /sys/block/sda
> /sys/block/sdb
> /sys/block/sdc
> /sys/block/sr0

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/13

------------------------------------------------------------------------
On 2017-11-29T20:16:11+00:00 bvanassche wrote:

That's weird: there are no known queue lockup bugs in the legacy
block/SCSI core layers. Is the WDC hard disk perhaps controlled by an
HBA? Can you provide the output of lspci (run as root)?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/14

------------------------------------------------------------------------
On 2017-11-30T16:04:40+00:00 chuck.burt+kernel.org wrote:

Created attachment 260953
Output of lspci on 4.10.x kernel

I won't be able to boot into the newer kernel for about a week. However,
since `lspci` output is hardware-oriented, I'm sharing the output from
the older kernel in case it's helpful.  Please let me know if you want me
to run it on the new one instead and I'll get it when I can.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/15

------------------------------------------------------------------------
On 2017-12-07T21:40:24+00:00 bvanassche wrote:

My hope was that the list of PCI devices would show a PCI HBA whose
driver has been modified recently. Since that's not the case, I'm out of
ideas about what the root cause of this bug could be. Unless someone
else has an idea about how to find the root cause, I think your only
option is to bisect the Linux kernel.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/16

------------------------------------------------------------------------
On 2018-03-23T14:00:20+00:00 chuck.burt+kernel.org wrote:

Created attachment 274895
Git Bisect Log 1 - 20180323

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/17

------------------------------------------------------------------------
On 2018-03-23T14:00:44+00:00 chuck.burt+kernel.org wrote:

Created attachment 274897
Git Bisect Log 2 - 20180323

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/18

------------------------------------------------------------------------
On 2018-03-23T14:04:37+00:00 chuck.burt+kernel.org wrote:

I finally got around to bisecting.

I had to do it twice as I identified two issues here.

Git Bisect Log 1 - https://bugzilla.kernel.org/attachment.cgi?id=274895
This identifies the commit after which processes hang completely whenever
the drive is connected.

Git Bisect Log 2 - https://bugzilla.kernel.org/attachment.cgi?id=274897
This identifies a separate issue (should I file a separate bug for this?) where 
mounting/unmounting caused error:

> Device /dev/sdb3 is already mounted at `/media/temp/[identifier]`. 
> (udisks-error-quark, 6)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/19

------------------------------------------------------------------------
On 2018-03-23T16:06:33+00:00 bvanassche wrote:

Thanks for having run a bisect, that really helps.

Recently the following commit went upstream:

commit c9f926000fe3b84135a81602a9f7e63a6a7898e2 (mkp-scsi/4.15/scsi-fixes)
Author: Hannes Reinecke <h...@suse.de>
Date:   Wed Jan 10 09:34:02 2018 +0100

    scsi: libsas: Disable asynchronous aborts for SATA devices

    Handling CD-ROM devices from libsas is decidedly odd, as libata relies
    on SCSI EH to be started to figure out that no medium is present.  So we
    cannot do asynchronous aborts for SATA devices.

    Fixes: 909657615d9 ("scsi: libsas: allow async aborts")
    Cc: <sta...@vger.kernel.org> # 4.12+
    Signed-off-by: Hannes Reinecke <h...@suse.com>
    Reviewed-by: Christoph Hellwig <h...@lst.de>
    Tested-by: Yves-Alexis Perez <cor...@debian.org>
    Signed-off-by: Martin K. Petersen <martin.peter...@oracle.com>

So you may want to try one of the kernel versions that includes that
fix, e.g. v4.14.15 or v4.15.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/20

------------------------------------------------------------------------
On 2018-03-23T19:48:29+00:00 chuck.burt+kernel.org wrote:

I tested with v4.15.0-041500.  Great news, the hang is resolved!

The second issue I found still exists, but that is not nearly as severe
(it doesn't block my usage).  It also occurs on more drives.  Should I
break that into a separate issue?

Thank you very very much for your help, Bart.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/21

------------------------------------------------------------------------
On 2018-03-23T19:56:53+00:00 bvanassche wrote:

Sorry but I lost track. What was the second issue?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/22

------------------------------------------------------------------------
On 2018-03-23T20:35:23+00:00 chuck.burt+kernel.org wrote:

Git Bisect Log 2 - https://bugzilla.kernel.org/attachment.cgi?id=274897
This identifies a separate issue (should I file a separate bug for this?) where 
mounting/unmounting caused error:

> Device /dev/sdb3 is already mounted at `/media/temp/[identifier]`. 
> (udisks-error-quark, 6)

Unmounting gives a similar error about being unable to unmount (I can
provide the exact error in a bit if you need it).

This mounting/unmounting error still exists in the v4.15 kernel and was 
introduced in the commit isolated in the above bisect (Git Bisect Log 2).

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/23

------------------------------------------------------------------------
On 2018-03-23T20:48:26+00:00 bvanassche wrote:

At the end of bisect log 2 I found the following:

first bad commit: [8d65b08debc7e62b2c6032d7fe7389d895b92cbc] Merge
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next

It seems unlikely to me that any of the commits in the networking tree
would cause mounting of a local filesystem to fail.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/24

------------------------------------------------------------------------
On 2018-03-23T20:56:02+00:00 chuck.burt+kernel.org wrote:

Yet it appears there were numerous revisions in the `drivers/scsi`
area...?

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git/commit/?id=8d65b08debc7e62b2c6032d7fe7389d895b92cbc

I'm a newbie, so... I could obviously be reading this completely
wrong...

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/25

------------------------------------------------------------------------
On 2018-03-23T20:57:16+00:00 chuck.burt+kernel.org wrote:

(Also, to clarify, the mounting does not actually fail... it produces
that error as a dialog in the GUI, but mounting does actually succeed.)

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/26

------------------------------------------------------------------------
On 2018-03-23T21:03:18+00:00 bvanassche wrote:

As far as I can see merging Dave's tree pulled in only the following three SCSI 
changes:
* qed*: Utilize Firmware 8.15.3.0
* qedf: fix wrong le16 conversion
* netlink: extended ACK reporting

Unless you are using the qedi or qedf driver, I think it's unlikely that
these changes are related to the issue you reported.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/27

------------------------------------------------------------------------
On 2018-03-23T21:32:05+00:00 chuck.burt+kernel.org wrote:

Thank you again.  Should we close this issue as duplicate / resolved
elsewhere?

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/28

------------------------------------------------------------------------
On 2018-03-23T21:35:04+00:00 bvanassche wrote:

This ticket has category IO/Storage; SCSI. That category does not cover
mounting filesystems. I'm fine with closing this ticket and creating a
new ticket for the mount issue if necessary.

Reply at:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1730746/comments/29

** Changed in: linux
       Status: Unknown => Confirmed

** Changed in: linux
   Importance: Unknown => Medium

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1730746

Title:
  Processes hang on attempted access of WDC WD30-EZRX 3TB HDD

To manage notifications about this bug go to:
https://bugs.launchpad.net/linux/+bug/1730746/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs