Public bug reported:

SRU Justification

Impact: WARN_ON messages caused by a race condition between the close of
a TCP socket and another process inspecting the same socket.

The code of interest is the following, in the tcp_close() function:
...
release_sock(sk);
...
WARN_ON(sock_owned_by_user(sk));
...

Within release_sock(sk), sock_release_ownership() is called, which sets
sk->sk_lock.owned = 0.
The later WARN_ON(sock_owned_by_user(sk)) therefore expects to find that the
socket is not owned by anyone.
According to upstream commit 8873c064d1de579ea2341, while a socket is being
closed it is possible for other threads to find it in an rtnetlink dump.
tcp_get_info() acquires the socket lock (and sets sk_lock.owned = 1) for a
short amount of time, which is nevertheless long enough to trigger this warning.
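
The interleaving described above can be replayed in a small userspace model.
This is only a sketch: struct toy_sock, toy_lock_sock(), toy_release_sock()
and replay_racy_close() are hypothetical names made up for illustration, the
two processes are replayed as a fixed sequence in one thread, and no real
kernel locking is involved.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for struct sock; only the ownership flag is modelled. */
struct toy_sock {
    bool owned; /* plays the role of sk->sk_lock.owned */
};

/* Hypothetical helpers, loosely mirroring lock_sock()/release_sock(). */
static void toy_lock_sock(struct toy_sock *sk)
{
    sk->owned = true;
}

static void toy_release_sock(struct toy_sock *sk)
{
    sk->owned = false;
}

/*
 * Replays the problematic ordering step by step.
 * Returns how many times the modelled WARN_ON would fire.
 */
static int replay_racy_close(void)
{
    struct toy_sock sk = { .owned = false };
    int warnings = 0;

    toy_lock_sock(&sk);    /* closer: takes the lock on entry to close */
    toy_release_sock(&sk); /* closer: release_sock(sk), owned becomes 0 */

    toy_lock_sock(&sk);    /* inspector: tcp_get_info() takes the lock */

    if (sk.owned) {        /* closer: WARN_ON(sock_owned_by_user(sk)) */
        warnings++;
        printf("WARN_ON would fire: socket is owned during close\n");
    }

    toy_release_sock(&sk); /* inspector: drops the lock again */
    return warnings;
}
```

Calling replay_racy_close() from a main() shows the modelled warning firing
once, because the inspector slips in between the release and the check.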


Fix: 
Fixed by upstream commit in v4.20:
Commit: 8873c064d1de579ea23412a6d3eee972593f142b
"tcp: do not release socket ownership in tcp_close()"

This commit fixes the bug by deferring the release of ownership
(the release_sock(sk) call) until just before tcp_close() returns.
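
The effect of deferring the release can be sketched with the same kind of
userspace model. All names here (struct toy_sock, toy_try_lock_sock(),
toy_release_sock(), replay_fixed_close()) are hypothetical, and the try-lock
is an assumption of the model: a real lock_sock() caller would sleep until
ownership is released rather than fail.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for struct sock; only the ownership flag is modelled. */
struct toy_sock {
    bool owned; /* plays the role of sk->sk_lock.owned */
};

/*
 * Hypothetical try-lock: returns true if ownership was acquired.
 * The kernel would block here instead of returning false.
 */
static bool toy_try_lock_sock(struct toy_sock *sk)
{
    if (sk->owned)
        return false;
    sk->owned = true;
    return true;
}

static void toy_release_sock(struct toy_sock *sk)
{
    sk->owned = false;
}

/*
 * Replays the close path with the release moved to the very end.
 * Returns how many times the inspector got in before close finished.
 */
static int replay_fixed_close(void)
{
    struct toy_sock sk = { .owned = false };
    int early_entries = 0;

    (void)toy_try_lock_sock(&sk); /* closer: owns the socket on entry */

    /* closer: close-time work happens here, ownership still held */
    if (toy_try_lock_sock(&sk))   /* inspector: tries to get in early */
        early_entries++;

    toy_release_sock(&sk);        /* closer: release just before returning */

    if (toy_try_lock_sock(&sk)) { /* inspector: succeeds only now */
        printf("inspector acquired the socket only after close finished\n");
        toy_release_sock(&sk);
    }
    return early_entries;
}
```

In this model the inspector never acquires the socket while the close is in
progress, so there is no window in which an ownership check could misfire.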
 

Testcase:
The reporter has tested and verified a 4.15 test kernel for Bionic.
This bug is difficult to reproduce locally because the race condition cannot
be triggered deterministically.
To hit this bug, both of the following are needed:
a) a process closing a socket, with execution currently between
release_sock(sk) and WARN_ON(sock_owned_by_user(sk))
b) another process inspecting the same socket that gets into tcp_get_info(),
acquires ownership of the socket and does not release it until the first
process reaches the WARN_ON(sock_owned_by_user(sk)).

This scenario is difficult to achieve in a testing environment.


Regression Potential:
For Bionic (4.15 kernel), the reporter of the bug has tested and verified a
test kernel with the fix.
For Cosmic (4.18 kernel), the fix has not been tested.
However, given that
a) this fix essentially removes the WARN_ON(sock_owned_by_user(sk))
and defers the release of ownership to later in the tcp_close() function, and
b) the relevant code paths in 4.15 and 4.18 are largely the same,
the regression potential should be minimal.

** Affects: linux (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: linux (Ubuntu Bionic)
     Importance: Undecided
         Status: New

** Affects: linux (Ubuntu Cosmic)
     Importance: Undecided
         Status: New

** Also affects: linux (Ubuntu Cosmic)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu Bionic)
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1830813

Title:
  TCP : race condition on socket ownership in tcp_close()

Status in linux package in Ubuntu:
  New
Status in linux source package in Bionic:
  New
Status in linux source package in Cosmic:
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1830813/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
