From: Soheil Hassas Yeganeh <[email protected]>
Date: Fri, 27 Apr 2018 14:57:32 -0400
> Since the socket lock is not held when calculating the size of
> receive queue, TCP_INQ is a hint. For example, it can overestimate
> the queue size by one byte, if FIN is received.
I think it is even worse than that.
If another application comes in and does a recvmsg() in parallel with
these calculations, you could even report a negative value.
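To make the negative case concrete, here is a minimal single-threaded sketch that replays one unlucky interleaving by hand. The struct, its field names and racy_inq_hint() are hypothetical stand-ins for illustration, not the kernel's code, and the "parallel" reader is simulated inline between the two samples:

```c
#include <stdint.h>

/* Hypothetical stand-in for the two per-socket counters; the names are
 * illustrative only. */
struct rxq {
	uint32_t received;	/* bytes queued so far by the receive path */
	uint32_t consumed;	/* bytes copied out so far by recvmsg() */
};

/* Replays one bad interleaving: the hint code samples `received`, then
 * `arrives` more bytes show up and a parallel reader drains the queue,
 * and only then is `consumed` sampled. */
static int32_t racy_inq_hint(struct rxq *q, uint32_t arrives)
{
	uint32_t received = q->received;	/* first sample */

	q->received += arrives;			/* new data arrives ... */
	q->consumed = q->received;		/* ... and the reader drains it all */

	uint32_t consumed = q->consumed;	/* second sample, now ahead */

	/* The unsigned difference wraps; read as signed it is negative. */
	return (int32_t)(received - consumed);
}
```

Starting from received = 1000 and consumed = 900, a call with arrives = 50 samples received as 1000 and consumed as 1050, so the reported hint is -50 even though the queue is actually empty.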
These READ_ONCE() calls make it look like some of these issues are being
addressed, but they are not.
You could freeze the values just by taking sk->sk_lock.slock, but I
don't know if that cost is considered acceptable or not.
Another idea is to sample both values in a loop, similar to the read
side of a sequence lock:
again:
	tmp1 = A;
	tmp2 = B;
	barrier();
	tmp3 = A;
	if (tmp1 != tmp3)
		goto again;
But the current state of affairs is not going to work well.
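For reference, the retry loop could be written out roughly as follows in portable C11. This is only a sketch under assumptions: struct counters, struct snapshot and sample_pair() are hypothetical names, and the C11 atomic loads stand in for whatever READ_ONCE()/barrier() combination the kernel code would actually use:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical pair of counters updated by other threads. */
struct counters {
	_Atomic uint32_t a;
	_Atomic uint32_t b;
};

struct snapshot {
	uint32_t a;
	uint32_t b;
};

/* Sample A, then B, then A again, and retry if A moved in between.
 * atomic_load() defaults to sequentially consistent ordering, so the
 * three loads are not reordered with respect to each other. */
static struct snapshot sample_pair(struct counters *c)
{
	struct snapshot s;
	uint32_t recheck;

	do {
		s.a = atomic_load(&c->a);
		s.b = atomic_load(&c->b);
		recheck = atomic_load(&c->a);
	} while (s.a != recheck);

	return s;
}
```

If A only ever moves forward and does not wrap all the way around between the two reads, an unchanged A means B was sampled while A still held the value in s.a, so the pair is a consistent snapshot. The loop does not bound the number of retries, which may or may not matter under a very busy reader.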