I'm trying to see how many connections I can write to in under 1 second. Has anybody ever tried a benchmark like this?
I've made a simple Python handler that collects requests for a few seconds and then responds to all of them at once. This is not a test of raw hits/sec, but rather a test of delivery time when writing to already-open requests.

With an EC2 m1.xlarge, I can get about 7000 deliveries in under a second in the best case. However, I'm not able to get this 100% of the time. Every once in a while I'll see a lag of one to several seconds on some requests, probably due to dropped packets and TCP retransmissions. The lower the number of connections, the lower the probability of these lags. In tests with 7000 connections, lags on 25% of the writes are common. In tests with 3000, this drops to 5%.

Intentionally adding delays can help here. If I delay 0.5 ms per 10 writes (from the handler, using time.sleep(0.0005)), then reliability increases. I can get 6000 deliveries in about 700-800 ms this way, rather consistently, and if a lag does show up during a test then it might affect 2% of the connections.

This is my first time ever really benchmarking this sort of thing, and I'm sure a lot of factors play into how much throughput is possible. At least in my environment, my feeling is that the limit is at the network level. In other words, the ceilings I'm hitting here are likely not performance issues with the OS, Mongrel2, or even the fact that I'm using Python as the handler language, since adding delays improves the results rather than hurting them.

Anyway, just thought I'd share.

-- 
Justin Karneges
Fanout, Inc.
530-220-7222
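
For anyone who wants to try the pacing approach described above (a 0.5 ms pause after every 10 writes), here is a minimal sketch of the loop. It is not the actual handler code: `paced_deliver` and its `send` callable are hypothetical names, and `send` stands in for whatever call your handler uses to write to one open connection. The `sleep` parameter is also an assumption, added only so the pause can be swapped out in tests.

```python
import time


def paced_deliver(send, conn_ids, payload, batch_size=10, pause=0.0005,
                  sleep=time.sleep):
    """Write payload to every connection, pausing after each full batch.

    send       -- hypothetical callable send(conn_id, payload); a stand-in
                  for the handler's real write call
    batch_size -- number of writes between pauses (10 in the post)
    pause      -- pause length in seconds (0.0005 = 0.5 ms in the post)
    Returns the elapsed wall-clock time in seconds.
    """
    start = time.monotonic()
    for i, conn_id in enumerate(conn_ids, start=1):
        send(conn_id, payload)
        # After every batch_size writes, yield briefly so the network
        # stack is not handed all the writes in one burst.
        if i % batch_size == 0:
            sleep(pause)
    return time.monotonic() - start


if __name__ == "__main__":
    delivered = []
    elapsed = paced_deliver(lambda cid, data: delivered.append(cid),
                            range(6000), b"hello")
    print(len(delivered), "writes in", round(elapsed * 1000), "ms")
```

Note that with 6000 connections the pauses alone add up to at least 600 x 0.5 ms = 300 ms, and the real figure depends on how precisely the OS honors such short sleeps.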
