Jason Baron <jba...@akamai.com> writes:
> On 09/30/2015 01:54 AM, Mathias Krause wrote:
>> On 29 September 2015 at 21:09, Jason Baron <jba...@akamai.com> wrote:
>>> However, if we call connect on socket 's', to connect to a new socket 'o2', we
>>> drop the reference on the original socket 'o'. Thus, we can now close socket
>>> 'o' without unregistering from epoll. Then, when we either close the ep
>>> or unregister 'o', we end up with this list corruption. Thus, this is not a
>>> race per se, but can be triggered sequentially.
>>
>> Sounds profound, but the reproducers calls connect only once per
>> socket. So there is no "connect to a new socket", no?
>> But w/e, see below.
>
> Yes, but it can be reproduced this way too. It can also happen with a
> close() on the remote peer 'o', and a send to 'o' from 's', which the
> reproducer can do as pointed out Michal. The patch I sent deals with
> both cases.

As Michal also pointed out, unix_dgram_disconnected is called in both
cases. Insofar as "deregistering" anything beyond what
unix_dgram_disconnected (and, as far as I can tell, unix_release_sock)
already does is actually required, this would be the obvious place to
add it.

A good step on the way to that would be to write (and post) some test
code which actually reproduces the problem in a predictable way.
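
For illustration only, here is a minimal sketch of what such test code might look like for the sequential case Jason describes (connect 's' to 'o', register 's' with epoll, reconnect 's' to 'o2', close 'o', close the epoll fd). It has not been run against an affected kernel, so whether it actually triggers the list corruption is an assumption. The helper names (bound_dgram, run_sequence), the abstract socket names and the choice of EPOLLOUT are all hypothetical and not taken from any posted reproducer. Run it only in a disposable VM.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create an AF_UNIX datagram socket bound to an
 * abstract-namespace address (Linux-specific, leaves no file behind). */
static int bound_dgram(const char *name, struct sockaddr_un *addr,
                       socklen_t *len)
{
    size_t n = strlen(name);
    int fd;

    if (n + 1 > sizeof(addr->sun_path))
        return -1;
    fd = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    /* sun_path[0] stays '\0', which selects the abstract namespace. */
    memcpy(addr->sun_path + 1, name, n);
    *len = (socklen_t)(offsetof(struct sockaddr_un, sun_path) + 1 + n);
    if (bind(fd, (struct sockaddr *)addr, *len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical sketch of the sequence; returns 0 if every call succeeded. */
int run_sequence(void)
{
    struct sockaddr_un addr_o, addr_o2;
    socklen_t len_o = 0, len_o2 = 0;
    struct epoll_event ev = { .events = EPOLLOUT };
    char name[64];
    int o, o2, s, ep, rc = -1;

    snprintf(name, sizeof(name), "peer-wait-sketch-%d-o", (int)getpid());
    o = bound_dgram(name, &addr_o, &len_o);
    snprintf(name, sizeof(name), "peer-wait-sketch-%d-o2", (int)getpid());
    o2 = bound_dgram(name, &addr_o2, &len_o2);
    s = socket(AF_UNIX, SOCK_DGRAM, 0);
    ep = epoll_create1(0);
    if (o < 0 || o2 < 0 || s < 0 || ep < 0)
        goto out;

    /* 's' is connected to 'o'; 'o' is not connected back to 's'. */
    if (connect(s, (struct sockaddr *)&addr_o, len_o) < 0)
        goto out;
    /* Register 's' for writability with epoll. */
    ev.data.fd = s;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, s, &ev) < 0)
        goto out;
    /* Reconnect 's' to 'o2'. */
    if (connect(s, (struct sockaddr *)&addr_o2, len_o2) < 0)
        goto out;
    /* Close 'o' while 's' is still registered, then close the epoll fd. */
    close(o);
    o = -1;
    close(ep);
    ep = -1;
    rc = 0;
out:
    if (o >= 0)
        close(o);
    if (o2 >= 0)
        close(o2);
    if (s >= 0)
        close(s);
    if (ep >= 0)
        close(ep);
    return rc;
}
```

On a kernel without the problem, calling run_sequence() from main() should simply return 0. A return value of 0 only means every system call succeeded; it says nothing about whether the kernel's wait queue state is intact.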