On Wed, 8 Jan 2025 22:15:48 +0100
Uwe Kleine-König <u.kleine-koe...@baylibre.com> wrote:

> Hello Neal,
> 
> On Wed, Jan 01, 2025 at 11:18:37PM -0500, Neal Murphy wrote:
> > Package: src:linux
> > Version: 6.1.119-1
> > Severity: critical
> > Justification: breaks the whole system
> > 
> > Dear Maintainer,
> > 
> > I plugged in my SSK NVME-to-USB3 adapter. I mounted it, checked it (without
> > writing anything), and unmounted it. The system displayed the '... has data
> > to be written ...' msg for quite a while. Around then, the system displayed
> > the watchdog error on CPU 8. Shortly after, it displayed a watchdog error on
> > CPU 0 and the system became unresponsive requiring a hard reset.
> > 
> > When I got the SSK, it worked well on the desktop. Months later, I had
> > problems with it, but didn't get any kernel oopses. The drive works OK on my
> > Asus laptop, so I'm beginning to suspect my desktop's hardware.
> > 
> > I'm reporting this because flaky hardware usually shouldn't cause a system
> > lockup.  
> 
> This is only half of the truth. In an ideal world it would be true,
> but in reality this often doesn't work.
> 
> There is another bug report that looks quite similar to yours:
> https://lore.kernel.org/all/bug-219532-208...@https.bugzilla.kernel.org%2F/.
> The most recent message in that thread (from Dec 1, 22:07) has a
> patch. It would be great if you could test that and report upstream.
> 
> Best regards
> Uwe

Hmmm. It's definitely a hardware (mainboard) issue of some kind.

Running Linux 6.11.5 from backports.
------------------------------
The device works fine plugged into a USB3.2 port in the back of the computer. 
It mounts and unmounts quickly, many times in a row. I can read many GiB of 
data from it. I can write 10 GiB of data to it. I can let it sit idle for some 
minutes. No errors appear in syslog.

Plugged into one of the front USB3 ports, it works fine. For about a minute. 
Then the system produces variations of the following:
----
2025-01-09T03:20:26.514887-05:00 playground kernel: [596625.269156] sd 9:0:0:0: [sdd] tag#18 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
2025-01-09T03:20:26.514900-05:00 playground kernel: [596625.269479] sd 9:0:0:0: [sdd] tag#18 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
----
followed by more errors, until the system finally unmounts and disconnects the 
drive. The errors occur whether or not I do anything with the drive (read, 
mount, read-write files, unmount, etc.)
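
For anyone reading along, the aborted command in the log above can be decoded 
by hand. Here is a minimal sketch in Python, assuming the standard SCSI 
Read(10) layout (opcode 0x28 in byte 0, big-endian LBA in bytes 2-5, transfer 
length in bytes 7-8). The function name decode_read10 is made up for this 
example and is not part of any tool.

```python
import re

# Matches the "CDB: Read(10) <10 hex bytes>" part of a kernel SCSI log line.
CDB_RE = re.compile(r"CDB: Read\(10\) ((?:[0-9a-f]{2} ?){10})")


def decode_read10(line):
    """Return (lba, blocks) for a Read(10) CDB log line, or None if absent.

    Hypothetical helper for illustration only.
    """
    m = CDB_RE.search(line)
    if m is None:
        return None
    cdb = bytes.fromhex(m.group(1))  # fromhex ignores the spaces between bytes
    if cdb[0] != 0x28:               # 0x28 is the Read(10) opcode
        return None
    lba = int.from_bytes(cdb[2:6], "big")     # logical block address
    blocks = int.from_bytes(cdb[7:9], "big")  # transfer length in blocks
    return lba, blocks


line = ("[596625.269479] sd 9:0:0:0: [sdd] tag#18 "
        "CDB: Read(10) 28 00 00 00 00 00 00 00 01 00")
print(decode_read10(line))  # (0, 1)
```

So the command that timed out is a read of a single block at LBA 0.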

If I plug the drive into a front port and do nothing with it, the errors occur 
after about 30 seconds.

Importantly, the system does *not* hang/crash when running 6.11.5; the errors 
are handled well.

Linux 6.1.0
-----------
As for Bookworm's 6.1 kernel: I might have better luck patching and building 
the 6.1.0-28 kernel (trying to build 6.11 from backports was a Borg-ish 
experience), but I would gladly run an xhci module patched and built by someone 
familiar with the Debian build methodology; it is alien territory for me. 
(That is, provided the patch noted above applies easily to 6.1.) If it has 
lots of debugging built in, even better.

Thanks,
Neal
