On Sunday 28 September 2014 at 15:01:21 +0100, Ben Hutchings wrote:
> On Sat, 2014-09-27 at 19:41 +0100, Mike Crowe wrote:
> > I compiled my own version of the Debian 3.2.60-1+deb7u3 kernel with
> > CONFIG_LOCKDEP and panic on hung task enabled.
> >
> > From the crash dump:
> >
> > [25202.156175] INFO: task nfsd:3247 blocked for more than 900 seconds.
> > [25202.162565] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> > this message.
> > [25202.170432] nfsd D ffff88080aa0eca8 0 3247 2
> > 0x00000000
> > [25202.170444] ffff88080a8e19f0 0000000000000046 0000000000000006
> > ffff880800000000
> > [25202.170458] ffff88080aa0e9c0 ffff88080a8e1fd8 ffff88080a8e1fd8
> > 00000000001d4040
> > [25202.170472] ffff88040e9926c0 ffff88080aa0e9c0 ffffffff8138d6da
> > 00000001a04c47dd
> > [25202.170488] Call Trace:
> > [25202.170504] [<ffffffff8138d6da>] ? __mutex_lock_common+0x236/0x379
> > [25202.170531] [<ffffffffa04c47dd>] ? fh_lock_nested+0x4d/0x61 [nfsd]
> > [25202.170542] [<ffffffff8138cda2>] schedule+0x55/0x57
> > [25202.170552] [<ffffffff8138d6e7>] __mutex_lock_common+0x243/0x379
> > [25202.170569] [<ffffffffa04c47dd>] ? fh_lock_nested+0x4d/0x61 [nfsd]
> > [25202.170581] [<ffffffff8138d8dc>] mutex_lock_nested+0x2a/0x31
> > [25202.170598] [<ffffffffa04c47dd>] fh_lock_nested+0x4d/0x61 [nfsd]
> > [25202.170610] [<ffffffff810140f5>] ? sched_clock+0x9/0xd
> > [25202.170626] [<ffffffffa04c50fe>] nfsd_lookup_dentry+0x196/0x227 [nfsd]
> > [25202.170646] [<ffffffffa04cef7f>] nfsd4_secinfo.part.15+0x26/0x9e [nfsd]
> > [25202.170666] [<ffffffffa04cf044>] nfsd4_secinfo+0x4d/0x5b [nfsd]
> > [25202.170688] [<ffffffffa04ce105>] nfsd4_proc_compound+0x265/0x43e [nfsd]
> > [25202.170703] [<ffffffffa04c181d>] nfsd_dispatch+0xe2/0x1c8 [nfsd]
> > [25202.170734] [<ffffffffa03759c1>] svc_process_common+0x2cf/0x4d0 [sunrpc]
> > [25202.170759] [<ffffffffa0375de0>] svc_process+0x118/0x136 [sunrpc]
> > [25202.170773] [<ffffffffa04c10eb>] nfsd+0xeb/0x131 [nfsd]
> > [25202.170796] [<ffffffffa04c1000>] ? 0xffffffffa04c0fff
> > [25202.170806] [<ffffffff81065c75>] kthread+0xa3/0xab
> > [25202.170815] [<ffffffff81396584>] kernel_thread_helper+0x4/0x10
> > [25202.170823] [<ffffffff8138f074>] ? retint_restore_args+0x13/0x13
> > [25202.170830] [<ffffffff81065bd2>] ? __init_kthread_worker+0x53/0x53
> > [25202.170837] [<ffffffff81396580>] ? gs_change+0x13/0x13
> > [25202.170842] 1 lock held by nfsd/3247:
> > [25202.170845] #0: (&sb->s_type->i_mutex_key#13){+.+.+.}, at:
> > [<ffffffffa04c47dd>] fh_lock_nested+0x4d/0x61 [nfsd]
> > [25202.170870] Kernel panic - not syncing: hung_task: blocked tasks
[snip]

> nfsd is trying to lock two objects in the same class: specifically, it
> locks a file handle and then the file handle for the file's parent.
> It's generally safe to do this so long as they're always taken in that
> order. lockdep should complain (much more verbosely) if this is not
> done consistently.

That makes sense. So is there any clue as to why it's blocking inside the
second mutex_lock_nested?

> I'm afraid this doesn't explain what's going wrong. But if there are
> any more messages from lockdep further up the log (like, 15 minutes
> earlier), they might do.

Unfortunately not, the previous line in the log is the last message from
boot time:

[ 38.624072] vnet0: no IPv6 routers present

Is there a way I can persuade crash(8) to tell me which process currently
has the lock in question?

Do you have any advice as to any more debug stuff I should try turning on
when compiling the kernel?

Thanks for your help.

Mike.