Hello,

Mike Kelly, on Wed. 26 Nov. 2025 13:11:00 +0000, wrote:
> I commonly see console reports of the following kind whilst performing 
> stress-ng tests that exercise swapping heavily:
> 
> [ 4302.5600050] wd0d: device timeout reading fsbn 216912 of 216912-216919 (wd0 bn 216912; cn 215 tn 3 sn 3), xfer f28, retry 2
> [ 4302.5600050] wd0d: device timeout reading fsbn 216792 of 216792-216799 (wd0 bn 216792; cn 215 tn 1 sn 9), xfer b18, retry 2
> [ 4302.5600050] wd0d: device timeout reading fsbn 7883720 of 7883720-7883727 (wd0 bn 7883720; cn 7821 tn 2 sn 26), xfer d88, retry 2
> [ 4302.5600050] wd0d: device timeout reading fsbn 216576 of 216576-216583 (wd0 bn 216576; cn 214 tn 13 sn 45), xfer df0, retry 2
> 
> I have finally noticed the rather obvious coincident state:
> 
> db>
>     TASK        THREADS
>   0 gnumach (f597aee0): 8 threads:
>               0 pageout (f5978e60) .WS.N.P 0xc1125d04
>               1 idle/0 (f5978cf0) R......
>               2 (f5978b80) .W.ON..(reaper_thread_continue) 0xc1131050
>               3 (f5978a10) .W.ON.P(swapin_thread_continue) 0xc1131180
>               4 (f59788a0) .W.ON..(sched_thread_continue) 0
>               5 (f5978730) .W..N.. 0xc11258c4
>               6 (f59785c0) .W.ON.P(io_done_thread_continue) 0xc1131740
>               7 (f5978450) .W.ON..(net_thread_continue) 0xc1127914
> db> trace /tu $task0.5
> switch_context(f5978730,0,f5b43e68,c1039f86,c112ea08)+0xa6
> thread_invoke(f5978730,0,f5b43e68,c103aa65)+0xcf
> thread_block(0,0,3,c1039d5c)+0x40
> vm_page_wait(0,f5978730,7,f596b948)+0x5a
> kmem_cache_alloc(c1124dc0,0,46,f5978730)+0x10a
> kalloc.part.0(0,f596dec0,f5970fc8,c10471bc,30)+0xbe
> kalloc(30,0,0,c104722e)+0x15
> intr_thread(0,c1039ec0,0,c104bd99,f59788a0)+0xdc
> thread_continue(f59788a0,f5977a00,f597ac40,f5970e00,f5970e38)+0x2a
> Thread_continue()
> 
> The interrupt delivery thread is blocked in vm_page_wait(), which prevents
> the device interrupt from being delivered to rumpdisk. No disc reads or
> writes occur thereafter, so pageout cannot reclaim free pages and the
> system locks up. The associated patch is a very low-risk fix that grants
> vm_privilege to this thread.
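
For readers following the thread, the deadlock described above can be
modelled in a few lines of C. This is only a toy sketch and not gnumach
code: toy_thread, toy_page_grab, toy_free_pages and TOY_RESERVED_PAGES are
hypothetical names, and the reserve size is an arbitrary assumption. It
shows the general idea of a page reserve that only privileged threads may
use, so that a thread needed to make paging progress does not wait for
pages that only its own progress can free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name below is hypothetical, not a gnumach symbol. */
enum { TOY_RESERVED_PAGES = 2 };   /* arbitrary reserve size for the demo */

struct toy_thread {
    const char *name;
    bool vm_privilege;             /* may allocate from the reserve */
};

/* Under heavy swapping only the reserved pages are left. */
static int toy_free_pages = TOY_RESERVED_PAGES;

/*
 * Try to take one page.  Returns true on success and false when the
 * caller would have to block until pageout frees memory.
 */
static bool toy_page_grab(const struct toy_thread *t)
{
    assert(t != NULL);

    if (toy_free_pages <= 0)
        return false;              /* nothing left at all */

    /* Unprivileged threads must leave the reserve untouched. */
    if (toy_free_pages <= TOY_RESERVED_PAGES && !t->vm_privilege)
        return false;

    toy_free_pages--;
    return true;
}
```

In this model an unprivileged interrupt thread gets false (it would block,
and the disk interrupt that pageout depends on is never delivered), while
the same thread with vm_privilege set obtains a page from the reserve and
keeps running.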

Applied, thanks!

Samuel
