A while ago I sent in an incomplete patch to the TCP memory allocation code to help it behave better under memory stress. I never found the time to follow up and finish it, and it has grown very stale by this point. I was working on it to make a strong case for increasing the default max buffer sizes, but having thought about it a bit more, I'm not sure that's necessary. (It would improve the behavior under memory stress, but that may not matter much.)
With buffers being automatically tuned, total buffer memory is bounded by (a small multiple of) the aggregate bandwidth-delay product over all connections, i.e. interface bandwidth times the weighted average of the RTTs. The means of attack is to inflate the RTT so that a connection consumes too much memory.

For a receiver, defense is relatively easy: when you get into memory pressure you can throw away data you haven't yet acknowledged (the peer will retransmit it), and Linux already does this. For a sender, defense is harder, because you can't throw away unacknowledged data. An attacker can consume 2*mss of kernel memory per ack it sends and hold on to it indefinitely. When the system gets into memory pressure it can stop allocating new memory to that socket, but all other sockets will then be starved for memory, since the kernel can't free memory from the attacked socket the way a receiver can. This is an ugly problem; the only way to deal with it is to kill the connection.

This really bothered me until I had a conversation with Stanislav Shalunov, who described his netkill script to me (http://www.internet2.edu/~shalunov/netkill/). If, as an attacker, you're trying to exhaust a server's TCP send-buffer memory, you can usually consume more server memory per packet sent (probably ~10k) by just starting a connection and then moving on to a new one, as netkill does. Inflating cwnd is likely harder work for the attacker, tying up at most two packets' worth of buffer (typically ~3k) for every ack generated. A good defense against such an attack may still be necessary, but I don't think a larger tcp_wmem[2] makes things any worse.

There is a policy question of how much non-pageable memory you want to allow any single connection to consume. Usually the answer is "not too much," which is why net.core.wmem_max is a fairly small value. However, I think raising tcp_wmem[2] and tcp_rmem[2] doesn't hurt as much, since a process can't simply request that memory: the kernel has to allocate it in response to actual network events, and as I noted above, that is bounded by the network's characteristics (the bandwidth-delay product).

Given the relatively widespread availability of 100 Mbps or greater connectivity on college campuses and at larger companies, and the increasing availability of fiber to the home (especially in places like South Korea and Japan), I'd really like to see the default buffer sizes increased significantly. I'm thinking something like 4 MB would be a good number. (This number should probably be chosen automatically at boot time, so that it's no more than ~1-2% of total system memory.) That would sustain 100 Mbps on a transcontinental link, yet isn't an unreasonable amount of memory for most systems.

Are there any major dangers in raising tcp_?mem[2] that I haven't considered?

Thanks,
-John
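P.S. A quick sanity check on the 4 MB figure. The numbers below are illustrative assumptions on my part (a 100 Mbps path, ~100 ms transcontinental RTT, a 256 MB machine), not measurements:

    BDP at 100 Mbit/s and 100 ms RTT:
        10^8 bit/s * 0.1 s = 10^7 bits ~= 1.25 MB

    So a 4 MB cap is roughly 3x the BDP, leaving headroom for
    auto-tuning (which wants a small multiple of the BDP) and
    for a certain amount of RTT inflation.

    4 MB as a fraction of system memory:
        4 MB / 256 MB ~= 1.6%  (inside the ~1-2% target;
        a 128 MB box would want a ~2 MB cap instead)

And in sysctl terms, all the proposal changes is the third (max) field of each triple; the min/default values below are just today's placeholders, not part of the proposal:

    net.ipv4.tcp_rmem = 4096  87380  4194304
    net.ipv4.tcp_wmem = 4096  16384  4194304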