On 04/30/2018 08:38 AM, David Miller wrote:
> From: Soheil Hassas Yeganeh <soheil.k...@gmail.com>
> Date: Fri, 27 Apr 2018 14:57:32 -0400
>
>> Since the socket lock is not held when calculating the size of
>> receive queue, TCP_INQ is a hint. For example, it can overestimate
>> the queue size by one byte, if FIN is received.
>
> I think it is even worse than that.
>
> If another application comes in and does a recvmsg() in parallel with
> these calculations, you could even report a negative value.
>
> These READ_ONCE() make it look like some of these issues are being
> addressed but they are not.
>
> You could freeze the values just by taking sk->sk_lock.slock, but I
> don't know if that cost is considered acceptable or not.
>
> Another idea is to sample both values in a loop, similar to a sequence
> lock sequence:
>
> again:
> 	tmp1 = A;
> 	tmp2 = B;
> 	barrier();
> 	tmp3 = A;
> 	if (tmp1 != tmp3)
> 		goto again;
>
> But the current state of affairs is not going to work well.
>

We want a hint, and max_t(int, 0, ....) cannot return a negative
value, can it?

If the hint is wrong in 0.1% of the cases, we really do not care. It is
not meant to replace the existing precise (well, sort of) mechanism.

I say "sort of" because by the time we have any number, TCP might have
received more packets anyway.
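
To make the clamping argument concrete, here is a minimal user-space sketch. It is not the patch itself: struct seq_snapshot and inq_hint() are hypothetical names, and I am assuming the hint is the difference between two 32-bit sequence counters that were sampled without the socket lock. A stale sample can make the difference negative, and the clamp turns that into 0.

```c
#include <stdint.h>

/* Hypothetical stand-in for two sequence counters read without a lock. */
struct seq_snapshot {
	uint32_t rcv_nxt;    /* next sequence number expected from the peer */
	uint32_t copied_seq; /* first sequence number not yet read by the app */
};

/*
 * Return a non-negative estimate of the queued bytes.
 * The subtraction is done modulo 2^32 and then viewed as signed, so
 * sequence wraparound is handled, and an inconsistent pair of samples
 * (copied_seq ahead of rcv_nxt) is clamped to 0 instead of being
 * reported as a negative value. The cast assumes the usual
 * two's-complement conversion done by common compilers.
 */
int inq_hint(const struct seq_snapshot *s)
{
	int32_t inq = (int32_t)(s->rcv_nxt - s->copied_seq);

	return inq > 0 ? inq : 0;
}
```

A racy pair such as rcv_nxt = 1000, copied_seq = 1010 yields 0 here rather than -10, which is all the hint needs.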