Hello folks,
In looking at a few benchmarks (especially netperf) run locally, it seems
that TCP is unable to make full use of available CPU cycles, as the sender
is throttled waiting for ACKs to arrive. The problem is exacerbated when
the sender is using a small send buffer: running netperf -C -c -- -s 1024
shows a miserable 430Kbit/s or so at essentially 0% CPU usage. Tests over
gigabit ethernet are similarly constrained to a mere 96Mbit/s.
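As a rough sanity check on the local number, here is a back-of-the-envelope sketch, not a measurement. It assumes the sender can only have one send buffer's worth of data outstanding and that each buffer is released only when a delayed ACK fires. The 40 ms delayed-ACK timeout is an assumption on my part, and stalled_mbps is a made-up helper name.

```c
#include <stdio.h>

/*
 * Upper bound on throughput, in 10^6 bits/s, when the sender may have at
 * most `bytes_per_round` bytes outstanding and has to wait `round_ms`
 * milliseconds for an ACK before it can queue more data.
 * (Hypothetical helper, for illustration only.)
 */
static double stalled_mbps(double bytes_per_round, double round_ms)
{
	double bits_per_round = bytes_per_round * 8.0;
	double rounds_per_sec = 1000.0 / round_ms;

	return bits_per_round * rounds_per_sec / 1e6;
}
```

With the 2048 byte send socket size from the netperf output and the assumed 40 ms timeout, stalled_mbps(2048.0, 40.0) comes to about 0.41, which is in the same range as the 0.43 x 10^6 bits/s netperf reports for the local -s 1024 runs.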
Since there is no way for the receiver to know if the sender is being
blocked on transmit space, would it not make sense for the receiver to
send out any delayed ACKs when it is clear that the receiving process is
waiting for more data? The patch below attempts this (I make no guarantees
of its correctness with respect to the rest of the delayed ACK code). One
point I'm still contemplating is what to do if the receiver is waiting in
poll/select/epoll rather than blocking in a receive call, since the patch
only covers the blocking case.
[All tests run with maxcpus=1 on a 2.67GHz Woodcrest system.]
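A note on reading the results: the -s 1024 runs report a send socket size of 2048, not 1024. As far as I know, netperf's -s option requests the local socket buffer size via SO_SNDBUF, and Linux doubles the requested value (with a lower limit) before reporting it back. The sketch below shows one way to observe that from userspace. effective_sndbuf is a hypothetical helper, and the exact value returned depends on the kernel version.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Request a send buffer of `requested` bytes on a fresh TCP socket and
 * return the size the kernel reports back, or -1 on any error.
 * (Hypothetical helper, for illustration only.)
 */
static int effective_sndbuf(int requested)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);
	int actual = -1;
	socklen_t len = sizeof(actual);

	if (fd < 0)
		return -1;

	/* Ask for the buffer size, then read back what the kernel chose. */
	if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF,
		       &requested, sizeof(requested)) < 0 ||
	    getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) < 0)
		actual = -1;

	close(fd);
	return actual;
}
```

Calling effective_sndbuf(1024) on the test kernel should correspond to the 2048 shown in the Send Socket Size column; newer kernels may enforce a larger minimum.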
Recv   Send   Send                          Utilization     Service Demand
Socket Socket Message Elapsed               Send    Recv    Send    Recv
Size   Size   Size    Time     Throughput   local   remote  local   remote
bytes  bytes  bytes   secs.    10^6bits/s   % S     % S     us/KB   us/KB
Base (2.6.17-rc4):
default send buffer size
netperf -C -c
87380 16384 16384 10.02 14127.79 99.90 99.90 0.579 0.579
87380 16384 16384 10.02 13875.28 99.90 99.90 0.590 0.590
87380 16384 16384 10.01 13777.25 99.90 99.90 0.594 0.594
87380 16384 16384 10.02 13796.31 99.90 99.90 0.593 0.593
87380 16384 16384 10.01 13801.97 99.90 99.90 0.593 0.593
netperf -C -c -- -s 1024
87380 2048 2048 10.02 0.43 -0.04 -0.04 -7.105 -7.377
87380 2048 2048 10.02 0.43 -0.01 -0.01 -2.337 -2.620
87380 2048 2048 10.02 0.43 -0.03 -0.03 -5.683 -5.940
87380 2048 2048 10.02 0.43 -0.05 -0.05 -9.373 -9.625
87380 2048 2048 10.02 0.43 -0.05 -0.05 -9.373 -9.625
from a remote system over gigabit ethernet
netperf -H woody -C -c
87380 16384 16384 10.03 936.23 19.32 20.47 3.382 1.791
87380 16384 16384 10.03 936.27 17.67 20.95 3.091 1.833
87380 16384 16384 10.03 936.17 19.18 20.77 3.356 1.817
87380 16384 16384 10.03 936.26 18.22 20.26 3.188 1.773
87380 16384 16384 10.03 936.26 17.35 20.54 3.036 1.797
netperf -H woody -C -c -- -s 1024
87380 2048 2048 10.00 95.72 10.04 6.64 17.188 5.683
87380 2048 2048 10.00 95.94 9.47 6.42 16.170 5.478
87380 2048 2048 10.00 96.83 9.62 5.72 16.283 4.840
87380 2048 2048 10.00 95.91 9.58 6.13 16.368 5.236
87380 2048 2048 10.00 95.91 9.58 6.13 16.368 5.236
Patched:
default send buffer size
netperf -C -c
87380 16384 16384 10.01 13923.16 99.90 99.90 0.588 0.588
87380 16384 16384 10.01 13854.59 99.90 99.90 0.591 0.591
87380 16384 16384 10.02 13840.42 99.90 99.90 0.591 0.591
87380 16384 16384 10.01 13810.96 99.90 99.90 0.593 0.593
87380 16384 16384 10.01 13771.27 99.90 99.90 0.594 0.594
netperf -C -c -- -s 1024
87380 2048 2048 10.02 2473.48 99.90 99.90 3.309 3.309
87380 2048 2048 10.02 2421.46 99.90 99.90 3.380 3.380
87380 2048 2048 10.02 2288.07 99.90 99.90 3.577 3.577
87380 2048 2048 10.02 2405.41 99.90 99.90 3.402 3.402
87380 2048 2048 10.02 2284.41 99.90 99.90 3.582 3.582
netperf -H woody -C -c
87380 16384 16384 10.04 936.10 23.04 21.60 4.033 1.890
87380 16384 16384 10.03 936.20 18.52 21.06 3.242 1.843
87380 16384 16384 10.03 936.52 17.61 21.05 3.082 1.841
87380 16384 16384 10.03 936.18 18.24 20.73 3.191 1.814
87380 16384 16384 10.03 936.28 18.30 21.04 3.202 1.841
netperf -H woody -C -c -- -s 1024
87380 2048 2048 10.00 142.46 10.19 7.53 11.714 4.332
87380 2048 2048 10.00 147.28 9.73 7.93 10.829 4.412
87380 2048 2048 10.00 143.37 10.64 6.54 12.161 3.738
87380 2048 2048 10.00 146.41 9.18 7.43 10.277 4.158
87380 2048 2048 10.01 145.58 9.80 7.25 11.032 4.081
Comments/thoughts?
-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <[EMAIL PROTECTED]>.
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 934396b..e554ceb 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1277,8 +1277,11 @@ #endif
 			/* Do not sleep, just process backlog. */
 			release_sock(sk);
 			lock_sock(sk);
-		} else
+		} else {
+			if (inet_csk_ack_scheduled(sk))
+				tcp_send_ack(sk);
 			sk_wait_data(sk, &timeo);
+		}
 
 #ifdef CONFIG_NET_DMA
 		tp->ucopy.wakeup = 0;