On 2025-05-06 08:00:34, Salvatore Bonaccorso wrote:
> Hi Yu,
>
> Thanks for your followups.
>
> On Tue, May 06, 2025 at 09:25:50AM +0800, Yu Kuai wrote:
>> Hi,
>>
>> On 2025/05/06 4:59, Antoine Beaupré wrote:
>> > On 2025-05-05 22:36:07, Salvatore Bonaccorso wrote:
>> > > Hi Antoine,
>> > >
>> > > On Mon, May 05, 2025 at 02:50:32PM -0400, Antoine Beaupré wrote:
>> > > > On 2025-05-05 18:02:37, Salvatore Bonaccorso wrote:
>> > > > > On Mon, May 05, 2025 at 04:00:31PM +0200, Salvatore Bonaccorso wrote:
>> > > > > > Hi Moritz,
>> > > > > >
>> > > > > > On Mon, May 05, 2025 at 01:47:15PM +0200, Moritz Mühlenhoff wrote:
>> > > > > > > On Wed, Apr 30, 2025 at 05:55:20PM +0200, Salvatore Bonaccorso wrote:
>> > > > > > > > Hi
>> > > > > > > >
>> > > > > > > > We got a regression report in Debian after the update from
>> > > > > > > > 6.1.133 to 6.1.135. Melvin is reporting that discard/trim
>> > > > > > > > through a RAID10 array stalls indefinitely. The full report
>> > > > > > > > is inlined below and originates from
>> > > > > > > > https://bugs.debian.org/1104460 .
>> > > > > > >
>> > > > > > > JFTR, we ran into the same problem with a few Wikimedia servers
>> > > > > > > running 6.1.135 and RAID 10: The servers started to lock up once
>> > > > > > > fstrim.service got started. Full oops messages are available at
>> > > > > > > https://phabricator.wikimedia.org/P75746
>> > > > > >
>> > > > > > Thanks for these additional datapoints. Assuming you won't be able
>> > > > > > to test the other stable series where the commit d05af90d6218
>> > > > > > ("md/raid10: fix missing discard IO accounting") went in, might
>> > > > > > you at least be able to test the 6.1.y branch with the commit
>> > > > > > reverted again and manually trigger the issue?
>> > > > > >
>> > > > > > If needed I can provide a test Debian package of 6.1.135 (or
>> > > > > > 6.1.137) with the patch reverted.
>> > > > >
>> > > > > So one additional data point as several Debian users were reporting
>> > > > > back being affected: One user did upgrade to 6.12.25 (where the
>> > > > > commit was backported as well) and is not able to reproduce the issue
>> > > > > there.
>> > > >
>> > > > That would be me.
>> > > >
>> > > > I can reproduce the issue as outlined by Moritz above fairly reliably
>> > > > in 6.1.135 (Debian package 6.1.0-34-amd64). The reproducer is simple,
>> > > > on a RAID-10 host:
>> > > >
>> > > > 1. reboot
>> > > > 2. systemctl start fstrim.service
>> > > >
>> > > > We're tracking the issue internally in:
>> > > >
>> > > > https://gitlab.torproject.org/tpo/tpa/team/-/issues/42146
>> > > >
>> > > > I've managed to work around the issue by upgrading to the Debian package
>> > > > from testing/unstable (6.12.25), as Salvatore indicated above. There,
>> > > > fstrim doesn't cause any crash and completes successfully. In stable,
>> > > > it just hangs there forever. The kernel doesn't completely panic and
>> > > > the machine is otherwise somewhat still functional: my existing SSH
>> > > > connection keeps working, for example, but new ones fail. And an `apt
>> > > > install` of another kernel hangs forever.
>> > >
>> > > So likely at least in 6.1.y there are missing pre-requisites causing
>> > > the behaviour.
>> > >
>> > > If you can test 6.1.135-1 with the commit
>> > > 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 reverted then you can fetch
>> > > built packages at:
>> > >
>> > > https://people.debian.org/~carnil/tmp/linux/1104460/
>>
>> Can you also test with 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 not
>> reverted, and also cherry-pick c567c86b90d4715081adfe5eb812141a5b6b4883?
>
> Thank you.
>
> Antoine, Moritz,
> https://people.debian.org/~carnil/tmp/linux/1104460-2/ contains a
> build with 4a05f7ae33716d996c5ce56478a36a3ede1d76f2 *not* reverted and
> with c567c86b90d4715081adfe5eb812141a5b6b4883 cherry-picked, can you
> test this one as well?

I tested this one, and could successfully run fstrim.service without
problems.

A.

-- 
L'ennui avec la grande famille humaine, c'est que tout le monde veut
en être le père.
- Mafalda