On Thu, 2015-06-11 at 23:03 +0100, David Woodhouse wrote:
> On Thu, 2015-06-11 at 01:31 +0100, David Woodhouse wrote:
> > On Tue, 2015-06-09 at 17:49 -0700, Eric Dumazet wrote:
> > > > I've added some debugging, and it seems that when it deadlocks, glibc
> > > > doesn't get *any* response to its RTM_GETADDR request. I know we'd get
> > > > ENOBUFS is a *response* was dropped... but what about when the request
> > > > itself is dropped? ...
> > >
> > > Please check that this patch fixes your issue :
> > >
> > > http://patchwork.ozlabs.org/patch/473041/
> >
> > Looks likely; thanks. I'm running with that patch now. I haven't been
> > able to quickly reproduce the problem on demand, but it usually happens
> > within a day or two. So it'll be a few days at least before I call it a
> > success.
>
> I just saw the same deadlock happen again; glibc's __check_pf() stuck
> in recvmsg() waiting for a response that never comes.
>
> This is the Fedora 22 4.0.5 kernel with the above patch applied.

It did at least manage to survive a single night (which it often
doesn't) if I also apply a version of this patch:
https://patchwork.ozlabs.org/patch/473049/

Even on the known problematic kernels, I have been unable to reproduce
this on demand using either my own threaded getaddrinfo() test program,
or the one you posted here.

-- 
David Woodhouse                            Open Source Technology Centre
david.woodho...@intel.com                              Intel Corporation
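
For anyone wanting to attempt a reproduction, here is a minimal sketch of
what a threaded getaddrinfo() stress test could look like. It is not either
of the test programs mentioned in this thread; the names worker_arg, worker
and run_lookups are hypothetical. It assumes that passing AI_ADDRCONFIG is
enough to make glibc query the configured addresses over netlink on each
call. If a lookup were to hang in the way described above, run_lookups()
would simply never return.

```c
/* Hypothetical stress test; NOT the program referred to in this thread. */
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

struct worker_arg {
    const char *host;   /* name or numeric address to resolve */
    int iterations;     /* number of lookups this thread performs */
    long completed;     /* getaddrinfo() calls that returned (ok or error) */
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    struct addrinfo hints, *res;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    /* Assumption: this flag makes glibc inspect the configured addresses
     * on every call, which is the code path of interest here. */
    hints.ai_flags = AI_ADDRCONFIG;

    for (int i = 0; i < arg->iterations; i++) {
        res = NULL;
        if (getaddrinfo(arg->host, NULL, &hints, &res) == 0)
            freeaddrinfo(res);
        /* Count every call that returns; a hung call never gets here. */
        arg->completed++;
    }
    return NULL;
}

/* Run `iterations` lookups of `host` in each of `nthreads` threads.
 * Returns the number of calls that returned, or -1 on allocation failure. */
long run_lookups(int nthreads, int iterations, const char *host)
{
    assert(nthreads > 0 && iterations >= 0);

    pthread_t *tids = calloc((size_t)nthreads, sizeof(*tids));
    struct worker_arg *args = calloc((size_t)nthreads, sizeof(*args));
    long total = 0;
    int started = 0;

    if (tids == NULL || args == NULL) {
        free(tids);
        free(args);
        return -1;
    }

    for (int i = 0; i < nthreads; i++) {
        args[i].host = host;
        args[i].iterations = iterations;
        if (pthread_create(&tids[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }

    /* Each thread owns its own counter, so summing after join is race-free. */
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += args[i].completed;
    }

    free(tids);
    free(args);
    return total;
}
```

Calling run_lookups() from main() with a handful of threads and a large
iteration count, and leaving it running, is the intended use. Whether this
is sufficient to trigger the deadlock is unknown; as noted above, it has not
been reproducible on demand.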