Hi all.

We've been doing some rather extreme JBoss/Tomcat stress-testing on both Windows and Linux, APR and non-APR. We played with maxThreads and acceptCount and tried to make sense of the various behaviors that we saw.
In a nutshell, we discovered that under Windows, the maximum size of the 
backlog on a listening socket is hard-limited to 200, and there is 
nothing you can do to overcome this at the O/S level. This has some 
implications for high-load (peaky) situations where a flood of new 
connection requests hits the server all at once. (Let's put to one side 
any SYN-flood DoS protection in the O/S that might also get in the way.)
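To make the mechanism concrete, here is a minimal sketch (class and method names are mine) of how the backlog value reaches the O/S from Java. The second argument to ServerSocket is only a hint passed down to listen(); Windows silently clamps the effective queue to its hard limit no matter how large a value you request, whereas Linux honours it up to net.core.somaxconn.

```java
import java.net.ServerSocket;

// Hypothetical demo class, not Tomcat code: shows where the
// listen backlog is requested from the application side.
public class BacklogDemo {

    // The backlog argument maps straight down to listen(fd, backlog)
    // in the native layer. The O/S treats it only as a hint: Windows
    // clamps the effective queue to its hard limit (~200) regardless
    // of the value requested here.
    public static ServerSocket bind(int port, int backlog) throws Exception {
        return new ServerSocket(port, backlog); // port 0 = any free port
    }

    public static void main(String[] args) throws Exception {
        try (ServerSocket ss = bind(0, 1000)) {
            System.out.println("bound on port " + ss.getLocalPort());
        }
    }
}
```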
Now unfortunately, APR does not appear to alleviate this problem: you 
can have thousands of open connections, all idling away with the CPU at 
zero, and still have lots of new clients turned away at the door when a 
burst of them arrives.
I dug into the AprEndpoint code to see how it worked. It would seem that 
when a new connection is accepted, it requires a thread from the worker 
thread pool to help process the new connection. This is the same pool of 
threads that is servicing requests from the existing connections. If 
there is a reasonable amount of activity across those existing 
connections, contention for these threads will be very high, and the 
rate at which new connection requests can be serviced is therefore 
quite low. So with a burst of new connection requests, the backlog 
queue fills quickly and (under Windows) the ones that don't make it into 
the queue get their connection unceremoniously and immediately reset. 
(Which is rather ugly in itself, since it does not allow for the TCP 
retry attempts on the connection request over the next 20 seconds or so 
that happen under Linux.)
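The starvation effect described above can be sketched as follows (my naming and a deliberately tiny pool, not the actual AprEndpoint code): when every worker thread is busy servicing existing connections, the "accept" task just sits in the queue, and meanwhile the kernel backlog fills.

```java
import java.util.concurrent.*;

// Hypothetical simulation of the shared-pool dispatch pattern:
// accepted connections compete for the SAME pool that services
// existing connections, so a busy pool starves the accept path.
public class SharedPoolAcceptSketch {

    // Returns true if the "accept" task managed to run while the
    // pool was saturated with existing-connection work (it can't).
    public static boolean acceptRunsWhilePoolBusy() throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(4);
        CountDownLatch busy = new CountDownLatch(1);

        // Saturate all four worker threads with "existing connection" work.
        for (int i = 0; i < 4; i++) {
            workers.submit(() -> { busy.await(); return null; });
        }

        // A "new connection" now needs a worker just to be accepted...
        Future<?> accept = workers.submit(() -> { /* process new socket */ });
        Thread.sleep(200);                          // give it ample chance to start
        boolean startedWhileBusy = accept.isDone(); // false: stuck behind existing work

        busy.countDown();                 // existing-connection work completes
        accept.get(5, TimeUnit.SECONDS);  // only now is the connection handled
        workers.shutdown();
        return startedWhileBusy;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("accept handled while pool busy: "
                + acceptRunsWhilePoolBusy());
    }
}
```

In the real connector the analogue of that queued task is a connection stranded in the kernel backlog, which under Windows overflows at 200 and resets everyone behind it.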
So what I was wondering is whether the acceptor could adopt a slightly 
different strategy. First, could it use a separate (perhaps internal) 
small pool of threads for handling new connection requests; and second, 
could it hand them straight over to the poller without processing the 
first request that might be associated with them. [I think that the 
relatively new connector setting deferAccept might have some relevance 
here?] Basically the idea would be to simply get the new connections out 
of the backlog as quickly as possible and over to the poller to deal 
with. This would mean the listen backlog is likely to be kept to an 
absolute minimum even in flooding situations. (Until, of course, you hit 
the other limit of how many sockets the poller can handle.)
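The proposed strategy boils down to something like this sketch (my naming, not a real Tomcat API): the acceptor does nothing but drain the listen backlog into an in-memory queue for the poller, so accepts complete at memory speed and no request-processing worker is borrowed.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch of the proposed acceptor/poller split:
// the acceptor only enqueues; the poller registers the socket for
// readiness events and involves a worker only once data arrives.
public class FastAcceptorSketch {
    private final BlockingQueue<Integer> pollerQueue = new LinkedBlockingQueue<>();

    // Acceptor loop body: no request processing here, just an O(1)
    // handoff, so the kernel backlog drains as fast as accept() runs.
    public void accept(int socketFd) {
        pollerQueue.offer(socketFd);
    }

    // Poller side: picks up connections at its leisure.
    public int pendingForPoller() { return pollerQueue.size(); }
    public Integer nextForPoller() { return pollerQueue.poll(); }
}
```

With this shape, a burst of thousands of connections sits in the (effectively unbounded) poller queue rather than in the 200-entry Windows backlog, which is exactly the point of the suggestion.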
I'd be interested in your thoughts.

Cheers,
MT

