On Thu, 2015-06-11 at 23:03 +0100, David Woodhouse wrote:
> On Thu, 2015-06-11 at 01:31 +0100, David Woodhouse wrote:
> > On Tue, 2015-06-09 at 17:49 -0700, Eric Dumazet wrote:
> > > > I've added some debugging, and it seems that when it deadlocks, glibc
> > > > doesn't get *any* response to its RTM_GETADDR request. I know we'd get
> > > > ENOBUFS if a *response* was dropped... but what about when the request
> > > > itself is dropped? ...
> > > 
> > > Please check that this patch fixes your issue :
> > > 
> > > http://patchwork.ozlabs.org/patch/473041/
> > 
> > Looks likely; thanks. I'm running with that patch now. I haven't been
> > able to quickly reproduce the problem on demand, but it usually happens
> > within a day or two. So it'll be a few days at least before I call it a
> > success.
> 
> I just saw the same deadlock happen again; glibc's __check_pf() stuck
> in recvmsg() waiting for a response that never comes.
> 
> This is the Fedora 22 4.0.5 kernel with the above patch applied.

It did at least survive a single night (which it often doesn't) once I
also applied a version of this patch:
https://patchwork.ozlabs.org/patch/473049/

Even on the known problematic kernels, I have been unable to reproduce
this on demand using either my own threaded getaddrinfo() test program,
or the one you posted here.
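
For reference, a threaded getaddrinfo() stress test of the kind mentioned
above could look roughly like the sketch below. It is not either of the
programs referred to. The function name hammer_getaddrinfo, the thread and
iteration counts, the timeout, and the choice of Python rather than C are
all assumptions made for illustration. It passes AI_ADDRCONFIG on the
assumption that this is what sends glibc down the __check_pf() netlink
path, and AI_NUMERICHOST so that no DNS traffic is involved.

```python
import socket
import threading
import time


def hammer_getaddrinfo(n_threads=8, iterations=1000,
                       host="127.0.0.1", timeout=60.0):
    """Call getaddrinfo() concurrently; return (ok, failed, stuck).

    'stuck' counts worker threads still running when the timeout
    expires, which is how a hang in recvmsg() would show up here.
    """
    counts = {"ok": 0, "failed": 0}
    lock = threading.Lock()

    def worker():
        for _ in range(iterations):
            try:
                # AI_ADDRCONFIG: assumed to trigger the interface
                # address lookup; AI_NUMERICHOST: skip DNS entirely.
                socket.getaddrinfo(
                    host, 80, socket.AF_UNSPEC, socket.SOCK_STREAM, 0,
                    socket.AI_ADDRCONFIG | socket.AI_NUMERICHOST)
                key = "ok"
            except OSError:
                # Resolver errors are counted, not treated as fatal.
                key = "failed"
            with lock:
                counts[key] += 1

    # Daemon threads, so a hung worker cannot block interpreter exit.
    threads = [threading.Thread(target=worker, daemon=True)
               for _ in range(n_threads)]
    for t in threads:
        t.start()

    # Share one overall deadline across all joins.
    deadline = time.monotonic() + timeout
    for t in threads:
        t.join(max(0.0, deadline - time.monotonic()))

    stuck = sum(t.is_alive() for t in threads)
    with lock:
        return counts["ok"], counts["failed"], stuck
```

Calling hammer_getaddrinfo() in a loop and watching for a non-zero third
value is one way to notice a hang without attaching a debugger, though as
noted above a test like this may well not trigger the problem at all.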

-- 
David Woodhouse                            Open Source Technology Centre
david.woodho...@intel.com                              Intel Corporation
