On 20.03.2018 22:25, Kirill Tkhai wrote:
> Hi, David,
>
> thanks for the review!
>
> On 20.03.2018 19:23, David Miller wrote:
>> From: Kirill Tkhai <ktk...@virtuozzo.com>
>> Date: Mon, 19 Mar 2018 12:14:54 +0300
>>
>>> This reverts commit 1215e51edad1.
>>> Since raw_close() is used on every RAW socket destruction,
>>> the changes made by 1215e51edad1 scale sadly. This clearly
>>> seen on endless unshare(CLONE_NEWNET) test, and cleanup_net()
>>> kwork spends a lot of time waiting for rtnl_lock() introduced
>>> by this commit.
>>>
>>> Next patches in series will rework this in another way,
>>> so now we revert 1215e51edad1. Also, it doesn't seen
>>> mrtsock_destruct() takes sk_lock, and the comment to the commit
>>> does not show the actual stack dump. So, there is a question
>>> did we really need in it.
>>>
>>> Signed-off-by: Kirill Tkhai <ktk...@virtuozzo.com>
>>
>> Kirill, I think the commit you are reverting is legitimate.
>>
>> The IP_RAW_CONTROL path has an ABBA deadlock with other paths once
>> you revert this, so you are reintroducing a bug.
>
> The talk is about IP_ROUTER_ALERT, I assume there is just an erratum.
>
>> All code paths that must take both RTNL and the socket lock must
>> do them in the same order. And that order is RTNL then socket
>> lock.
>
> The place I change in this patch is IP_ROUTER_ALERT. There is only
> a call of ip_ra_control(), while this function does not need socket
> lock. Please, see next patch. It moves this ip_ra_control() out
> of socket lock. And it fixes the problem pointed in reverted patch
> in another way. So, if there is ABBA, after next patch it becomes
> solved.

Does this mean I have to merge [2/5] and [3/5] together?

We could also just change the order of the patches and make [3/5] go
before [2/5]. Then the kernel remains bisectable at every step.

What do you think about this?

Thanks,
Kirill

>> But you are breaking that here by getting us back into a state
>> where IP_RAW_CONTROL setsockopt will take the socket lock and
>> then RTNL.
>>
>> Again, we can't take, or retake, RTNL if we have the socket lock
>> currently.
>>
>> The only valid locking order is socket lock then RTNL.
>
> Thanks,
> Kirill
>
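
To make the ordering argument concrete, here is a small single-threaded
model in C. It is only a sketch and not kernel code: the flags stand in
for RTNL and the socket lock, every model_* and set_router_alert_* name
is hypothetical, and it assumes the ip_ra_control() path ends up needing
RTNL internally. The checker refuses any attempt to take the modeled
RTNL while the modeled socket lock is held, which is the inverted order
David describes, and shows that calling the helper before the socket
lock is taken avoids that order.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: these flags stand in for RTNL and the socket lock. */
static bool rtnl_held;
static bool sock_held;
static int router_alert;  /* state updated by the modeled ip_ra_control() */

/* Take the modeled RTNL. Fails if the socket lock is already held,
 * because that would be the "socket lock then RTNL" order. */
static int model_rtnl_lock(void)
{
    if (sock_held || rtnl_held)
        return -1;
    rtnl_held = true;
    return 0;
}

static void model_rtnl_unlock(void)
{
    assert(rtnl_held);
    rtnl_held = false;
}

/* Take the modeled socket lock. Holding RTNL first is allowed. */
static int model_lock_sock(void)
{
    if (sock_held)
        return -1;
    sock_held = true;
    return 0;
}

static void model_release_sock(void)
{
    assert(sock_held);
    sock_held = false;
}

/* Stand-in for ip_ra_control(): assumed to need RTNL, not the socket lock. */
static int model_ra_control(int on)
{
    if (model_rtnl_lock() != 0)
        return -1;
    router_alert = on;
    model_rtnl_unlock();
    return 0;
}

/* Inverted order: the helper runs with the socket lock already held. */
static int set_router_alert_under_sock_lock(int on)
{
    int ret;

    if (model_lock_sock() != 0)
        return -1;
    ret = model_ra_control(on);  /* rejected: socket lock then RTNL */
    model_release_sock();
    return ret;
}

/* Reordered: the helper runs before the socket lock is taken. */
static int set_router_alert_before_sock_lock(int on)
{
    int ret = model_ra_control(on);

    if (ret != 0)
        return ret;
    if (model_lock_sock() != 0)
        return -1;
    /* handling of the remaining options would go here */
    model_release_sock();
    return 0;
}
```

In this model set_router_alert_under_sock_lock() is rejected by the
checker and leaves router_alert unchanged, while
set_router_alert_before_sock_lock() succeeds. A real kernel would
deadlock intermittently rather than return an error, so the check here
plays a role loosely similar to a lock-order validator.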