A while ago I sent in an incomplete patch to the TCP memory allocation code
to help it behave better when under memory stress.  I sort of never had
enough time to follow up and finish it, and it has grown very stale by this
point.  I was working on it in order to make a strong case for the default
max buffer sizes to be increased.  However, having thought about it a bit
more, I'm not sure that this is necessary.  (It would make the behavior under 
memory stress better, but this may not matter that much.)

With buffers being automatically tuned, the sum of the buffer memory
will be bounded by (a small multiple of) the aggregate bandwidth-delay
product across all connections, that is, interface bandwidth times the
weighted average of the RTTs.  The means of attack is to artificially
inflate the RTT so that the connection consumes too much memory.
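As a back-of-the-envelope sketch of that bound (the 1 Gbps link and 100 ms average RTT below are illustrative numbers I picked, not measurements):

```python
def aggregate_buffer_bound(link_bps, avg_rtt_s, multiple=2):
    """Upper bound, in bytes, on total autotuned buffer memory:
    a small multiple of the aggregate bandwidth-delay product."""
    bdp_bytes = link_bps * avg_rtt_s / 8   # bits on the wire -> bytes
    return multiple * bdp_bytes

# A 1 Gbps interface with a 100 ms weighted-average RTT:
bound = aggregate_buffer_bound(1_000_000_000, 0.100)
print(f"{bound / 2**20:.1f} MiB")  # ~23.8 MiB
```

The point is that the bound scales with the network, not with how many sockets a process opens.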

For a receiver, defense is relatively easy because you can throw away
unacknowledged data when you get into memory pressure.  Linux already
does this.

For a sender, defense is more difficult because you can't throw away
unacknowledged data.  An attacker can consume 2*MSS of kernel memory per ACK
it sends, and hold on to it indefinitely.  When the system gets into memory
pressure, the kernel can stop allocating new memory to that socket, but all
other sockets will then be starved for memory, since it can't free memory
from the attacked socket the way a receiver can.  This is kind of an ugly
problem.  The only way to deal with this situation is to kill the connection.
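To put rough numbers on the per-ACK cost (the 1460-byte MSS and the ACK count here are my own illustrative assumptions):

```python
MSS = 1460  # typical Ethernet MSS, bytes (assumed for illustration)

def pinned_bytes(acks):
    """Kernel memory an attacker can pin by sending `acks` ACKs:
    each ACK lets the sender queue up to 2*MSS more data, and that
    sent-but-unacknowledged data can never be thrown away."""
    return acks * 2 * MSS

# A thousand ACKs pins roughly 2.9 MB of unswappable kernel memory:
print(pinned_bytes(1000))  # 2920000
```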

This really bothered me until I had a conversation with Stanislav Shalunov,
who described his netkill script to me
(http://www.internet2.edu/~shalunov/netkill/).  If, as an attacker, you're
trying to exhaust a server's TCP sendbuf memory, you can usually consume
more server memory per packet sent (probably ~10k) by just starting the
connection, then moving on to a new one, as the netkill script does.
Inflating cwnd is likely harder work for the attacker, consuming at most
two packets' worth of buffer (typically ~3k) for every ACK generated.  It may
be necessary to have a good defense against such an attack, but I don't
think a larger tcp_wmem[2] will hurt.
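Comparing the two attacks with the rough per-packet figures above (~10k per connection for a netkill-style attack, at most 2*MSS per ACK for cwnd inflation); the exact numbers are guesses, but the ratio is the point:

```python
NETKILL_BYTES_PER_PACKET = 10 * 1024  # ~10k server memory per connection opened
CWND_BYTES_PER_ACK       = 2 * 1460   # at most two segments pinned per ACK (~3k)

# Memory consumed on the server per packet the attacker must send:
ratio = NETKILL_BYTES_PER_PACKET / CWND_BYTES_PER_ACK
print(f"netkill consumes ~{ratio:.1f}x more server memory per attacker packet")
```

So an attacker with a fixed packet budget does better just opening connections, which is why the cwnd-inflation attack doesn't make a larger tcp_wmem[2] the weakest link.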

There is a policy question of how much non-pageable memory you want to allow
any single connection to consume.  Usually the answer is "not too much."
This is why net.core.wmem_max is a fairly small value.  However, I think
raising tcp_wmem[2] and tcp_rmem[2] doesn't hurt much, since a process
can't simply request that memory.  The kernel has to allocate it in response
to actual network events, and as I noted above, that allocation is bounded
by the network's characteristics (bandwidth-delay product).

Given the relatively widespread availability of 100 Mbps or greater
connectivity on college campuses and larger companies, and the increasing
availability of fiber to the home (especially in places like S. Korea and
Japan), I'd really like to see the default buffer sizes increased
significantly.  I'm thinking something like 4 MB would be a good number.
(This number should probably be automatically chosen at boot time, so that
it's not more than ~1-2% of total system memory.)  This would allow full
100 Mbps throughput on a transcontinental link, yet isn't an unreasonable
amount of memory for most systems.
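Here's where the 4 MB figure comes from (the 300 ms transcontinental RTT, the 2% cap, and the 256 MiB example machine are my own assumed numbers):

```python
def needed_buffer(link_bps, rtt_s):
    """Bytes of send/receive buffer needed to keep the pipe full:
    the bandwidth-delay product."""
    return link_bps * rtt_s / 8

def capped_default(link_bps, rtt_s, total_mem_bytes, frac=0.02):
    """Suggested tcp_?mem[2]: the BDP, capped at a fraction of system memory."""
    return min(needed_buffer(link_bps, rtt_s), frac * total_mem_bytes)

# 100 Mbps at a ~300 ms transcontinental RTT:
bdp = needed_buffer(100_000_000, 0.300)
print(f"{bdp / 2**20:.2f} MiB")  # ~3.58 MiB, i.e. roughly 4 MB

# Even on a modest 256 MiB machine, a 2% cap (~5.1 MiB) doesn't bite:
print(f"{capped_default(100_000_000, 0.300, 256 * 2**20) / 2**20:.2f} MiB")
```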

Are there any major dangers in raising tcp_?mem[2] I haven't considered?

Thanks,
  -John
-
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html