On Wed, 8 Jan 2025 22:15:48 +0100 Uwe Kleine-König <u.kleine-koe...@baylibre.com> wrote:
> Hello Neal,
>
> On Wed, Jan 01, 2025 at 11:18:37PM -0500, Neal Murphy wrote:
> > Package: src:linux
> > Version: 6.1.119-1
> > Severity: critical
> > Justification: breaks the whole system
> >
> > Dear Maintainer,
> >
> > I plugged in my SSK NVME-to-USB3 adapter. I mounted it, checked it
> > (without writing anything), and unmounted it. The system displayed
> > the '... has data to be written ...' msg for quite a while. Around
> > then, the system displayed the watchdog error on CPU 8. Shortly
> > after, it displayed a watchdog error on CPU 0 and the system became
> > unresponsive requiring a hard reset.
> >
> > When I got the SSK, it worked well on the desktop. Months later, I
> > had problems with it, but didn't get any kernel oopses. The drive
> > works OK on my Asus laptop, so I'm beginning to suspect my desktop's
> > hardware.
> >
> > I'm reporting this because flaky hardware usually shouldn't cause a
> > system lockup.
>
> This isn't only half of the truth. In an ideal world it would be true,
> but in reality this often doesn't work.
>
> There is another bugreport that looks quite similar to yours:
> https://lore.kernel.org/all/bug-219532-208...@https.bugzilla.kernel.org%2F/.
> The currently last message in that thread (from Dec 1, 22:07) has a
> patch. It would be great if you could test that and report upstream.
>
> Best regards
> Uwe

Hmmm. It's definitely a hardware (mainboard) issue of some kind.

Running Linux 6.11.5 from backports.
------------------------------
The device works fine plugged into a USB3.2 port in the back of the
computer. It will mount and umount rapidly many times. I can read many
GiB of data from it. I can write 10 GiB of data to it. I can let it sit
idle for some minutes. No errors appear in syslog.

Plugged into one of the front USB3 ports, it works fine. For about a
minute. Then the system produces variations of the following:
----
2025-01-09T03:20:26.514887-05:00 playground kernel: [596625.269156] sd 9:0:0:0: [sdd] tag#18 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
2025-01-09T03:20:26.514900-05:00 playground kernel: [596625.269479] sd 9:0:0:0: [sdd] tag#18 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
----
and more errors, finally unmounting and disconnecting the drive. The
errors occur whether or not I do anything with the drive (read, mount,
read-write files, unmount, etc.) If I plug the drive into a front port
and do nothing with it, the errors occur after about 30 seconds.

Importantly, the system does *not* hang/crash when running 6.11.5; the
errors are handled well.

Linux 6.1.0
-----------
As for Bookworm's 6.1 kernel, while I might have better luck
patching/building the 6.1.0-28 kernel (trying to build 6.11 from
backports was a Borg-ish experience), I would gladly run an xhci module
patched/built by someone familiar with the Debian build methodology; it
is alien territory for me. (Well, provided that the patch noted above is
easily applied to 6.1.) If it has lots of debugging built in, even
better.

Thanks,
Neal