Hey Neal,

On Thu, Jan 09, 2025 at 04:27:10AM -0500, Neal P. Murphy wrote:
> On Wed, 8 Jan 2025 22:15:48 +0100
> Uwe Kleine-König <u.kleine-koe...@baylibre.com> wrote:
>
> > Hello Neal,
> >
> > On Wed, Jan 01, 2025 at 11:18:37PM -0500, Neal Murphy wrote:
> > > Package: src:linux
> > > Version: 6.1.119-1
> > > Severity: critical
> > > Justification: breaks the whole system
> > >
> > > Dear Maintainer,
> > >
> > > I plugged in my SSK NVME-to-USB3 adapter. I mounted it, checked it (without
> > > writing anything), and unmounted it. The system displayed the '... has data to
> > > be written ...' msg for quite a while. Around then, the system displayed the
> > > watchdog error on CPU 8. Shortly after, it displayed a watchdog error on CPU 0
> > > and the system became unresponsive requiring a hard reset.
> > >
> > > When I got the SSK, it worked well on the desktop. Months later, I had problems
> > > with it, but didn't get any kernel oopses. The drive works OK on my Asus
> > > laptop, so I'm beginning to suspect my desktop's hardware.
> > >
> > > I'm reporting this because flaky hardware usually shouldn't cause a system
> > > lockup.
> >
> > This isn't only half of the truth. In an ideal world it would be true,
> > but in reality this often doesn't work.
> >
> > There is another bugreport that looks quite similar to yours:
> > https://lore.kernel.org/all/bug-219532-208...@https.bugzilla.kernel.org%2F/.
> > The currently last message in that thread (from Dec 1, 22:07) has a
> > patch. It would be great if you could test that and report upstream.
> >
> > Best regards
> > Uwe
>
> Hmmm. It's definitely a hardware (mainboard) issue of some kind.
>
> Running Linux 6.11.5 from backports.
> ------------------------------
> The device works fine plugged into a USB3.2 port in the back of the computer.
> It will mount and umount rapidly many times. I can read many GiB of data from
> it. I can write 10 GiB of data to it. I can let it sit idle for some minutes.
> No errors appear in syslog.
>
> Plugged into one of the front USB3 ports, it works fine. For about a minute.
> Then the system produces variations of the following:
> ----
> 2025-01-09T03:20:26.514887-05:00 playground kernel: [596625.269156] sd 9:0:0:0: [sdd] tag#18 uas_eh_abort_handler 0 uas-tag 1 inflight: CMD IN
> 2025-01-09T03:20:26.514900-05:00 playground kernel: [596625.269479] sd 9:0:0:0: [sdd] tag#18 CDB: Read(10) 28 00 00 00 00 00 00 00 01 00
> ----
> and more errors, finally unmounting and disconnecting the drive. The errors
> occur whether or not I do anything with the drive (read, mount, read-write
> files, unmount, etc.)
>
> If I plug the drive into a front port and do nothing with it, the errors
> occur after about 30 seconds.
>
> Importantly, the system does *not* hang/crash when running 6.11.5; the errors
> are handled well.

That's good news, thanks for your test.

> Linux 6.1.0
> -----------
> As for Bookworm's 6.1 kernel, while I might have better luck
> patching/building the 6.1.0-28 kernel (trying to build 6.11 from
> backports was a Borg-ish experience), I would gladly run an xhci
> module patched/built by someone familiar with the Debian build
> methodology; it is alien territory for me. (Well, provided that the
> patch noted above is easily applied to 6.1.) If it has lots of
> debugging built in, even better.

I'm inclined not to work on fixing 6.1 here. Someone could try to
identify the relevant changes between 6.1 and 6.11, but I guess that's
tedious work, and that in the end it's not a single commit that needs
backporting but a whole bunch of commits. (That someone would probably
have to be you, as you have access to that hardware.)

So I suggest you just stick to the backport kernel until Debian 13.

Best regards
Uwe