Hello,

(cc'ing Steven, Sergey and Petr who are working on printk)

On Tue, Jan 23, 2018 at 02:03:57PM -0500, Rik van Riel wrote:
> On Mon, 2018-01-22 at 14:00 -0800, Tejun Heo wrote:
> > debug_show_all_locks() iterates all tasks and prints held locks while
> > holding tasklist_lock.  This can take a while on a slow console device
> > and may end up triggering NMI hardlockup detector if someone else ends
> > up waiting for tasklist_lock.
> >
> > Touch the NMI watchdog while printing the held locks to avoid
> > spuriously triggering the hardlockup detector.
> >
> > Signed-off-by: Tejun Heo <[email protected]>
>
> On this patch:
>
> Acked-by: Rik van Riel <[email protected]>
>
> However, it seems like we run into things like
> this on a fairly regular (though not very frequent)
> basis. Would it make sense to go through the code
> and sprinkle around a few more touch_nmi_watchdog()
> calls?
>
> After all, there are maybe a few dozen places where
> we print out a lot of debugging information.

Yeah, it's ridiculous how often printk ends up escalating otherwise
recoverable situations into system crashes.  I don't know what the
right answer is.  For spurious NMI hardlockups, maybe auditing debug
paths and adding touch_nmi_watchdog() would be enough, but that also
is a pretty leaky approach.

Thanks.

--
tejun

