Tomcat Native: APR Http Endpoint handler behavior under Windows
Hi all. We've been doing some fairly extreme JBoss/Tomcat stress testing on both Windows and Linux, with APR and without. We played with maxThreads and acceptCount and tried to make sense of the various behaviors we saw. In a nutshell, we discovered that under Windows the maximum size of the backlog on a listening socket is hard-limited to 200, and there is nothing you can do at the O/S level to overcome this. This has some implications for high-load (peaky) situations where a flood of new connection requests hits the server all at once. (Let's put to one side the SYN-flood DoS protection that the O/S might also have, which can get in the way of things as well.)

Now unfortunately, APR does not appear to help alleviate this problem...you can have thousands of open connections, all idling away with the CPU at zero, and still have lots of new clients turned away at the door if they arrive in a burst. I dug into the AprEndpoint code to see how it works. It would seem that when a new connection is accepted, a thread from the worker thread pool is required to process the new connection, and this is the same pool of threads that services requests on the existing connections. If there is a reasonable amount of activity on the existing connections, contention for these threads will be very high, and the rate at which new connection requests can be serviced is therefore quite low. Thus with a burst of new connection requests the backlog queue fills quickly, and (under Windows) the ones that don't make it into the queue get their connection unceremoniously and immediately reset. (Which is rather ugly in itself, since it does not allow for TCP retry attempts on the connection request over the next 20 seconds or so, as happens under Linux.)

So what I was wondering is whether the acceptor could adopt a slightly different strategy. Firstly, could it not use a separate (perhaps internal) small pool of threads for handling new connection requests, and in addition hand them straight over to the poller without processing the first request that might be associated with them? [I think the relatively new connector setting deferAccept might have some relevance here?] Basically the idea would be to simply get new connections out of the backlog as quickly as possible and over to the poller to deal with. This would mean that the listen backlog is likely to be kept to an absolute minimum even in flooding situations (until, of course, you hit the other limit of how many sockets the poller can handle).

I'd be interested in your thoughts. Cheers, MT
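P.S. To make the idea concrete, here is a very rough sketch of the sort of acceptor loop I have in mind. This is illustrative only - the names are loosely modelled on the AprEndpoint ones (Socket.accept, setSocketOptions, getPoller, Socket.destroy), and it is not the actual JBossWeb/Tomcat code:

    // Hypothetical acceptor loop: drain the O/S backlog as fast as possible and
    // park each new connection in the poller, without borrowing a worker thread.
    protected class Acceptor implements Runnable {
        public void run() {
            while (running) {
                try {
                    // Accept the next connection from the listening socket
                    long socket = Socket.accept(serverSock);
                    if (setSocketOptions(socket)) {
                        // Straight to the poller; the first request gets handled
                        // later, when the poller reports the socket readable
                        getPoller().add(socket);
                    } else {
                        // Couldn't set options: close and release the native socket
                        Socket.destroy(socket);
                    }
                } catch (Throwable t) {
                    // Log and keep accepting; one bad socket shouldn't kill the loop
                }
            }
        }
    }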
Re: Tomcat Native: APR Http Endpoint handler behavior under Windows
A few follow-ups on this. It's only now that I have realised that the Tomcat APR handler in JBoss is subtly different from the one in Tomcat, i.e. that whole piece of Tomcat has been forked into JBossWeb, and they're starting to diverge. Be that as it may, my comments cover the current design as it exists in both of them at trunk level.

> I think with Windows, APR will have scalability problems sooner or later, as poller performance is bad on that platform (the code has a hack to use many pollers as performance degrades quickly with size). There is a Vista+ solution to that somewhere in the future, but I'm not sure this whole thing will still be relevant then.

Why is poller performance bad in Windows? Is that a consequence of the way APR interfaces to WinSock? I'm guessing that APR uses a Unix-style approach to polling the sockets. Or is it to do with the performance of the poll inside Windows itself? Be that as it may, at our end we still have to make Windows work as well as possible, so if there are simple tweaks we can make to cause performance to degrade more gracefully under conditions of peaky load, can we not discuss it? Also, I couldn't see where it specifically sets a high number of pollers for Windows? (And in which fork?? :-) And could you please elaborate on that last statement: "I'm not sure this whole thing will still be relevant then"?

> DeferAccept on Unix makes accept return a socket only if it has data available. Of course, this is much faster, but I'm not sure about its support status on any OS. Setting the options is done in a regular thread due to possible SSL processing (and just to be safe overall). Maybe an option there to do that in the accept thread would be decent (obviously, only useful if there's no SSL and no deferAccept; in theory, although setSocketOptions is cheap, Poller.add does sync, which is a problem since it's bad if the accept thread is blocked for any reason, so I don't know if that would work better in the real world).

There seem to be two distinct aspects to this deferAccept thing. One is what happens with the socket options (and as I understand it, that option is only supported on Linux 2.6 anyway). The other, which is in the AprEndpoint code, concerns the processing of the new connection. Just on that note, I have a question about this bit of code:

    if (!deferAccept) {
        if (setSocketOptions(socket)) {
            getPoller().add(socket);
        } else {
            // Close socket and pool
            Socket.destroy(socket);
            socket = 0;
        }
    } else {
        // Process the request from this socket
        if (!setSocketOptions(socket)
                || handler.process(socket) == Handler.SocketState.CLOSED) {
            // Close socket and pool
            Socket.destroy(socket);
            socket = 0;
        }
    }

The default value of deferAccept is true, but on Windows this option is not supported by the TCP/IP stack, so there is code that falsifies the flag if this is the case. In which case the socket is added straight to the poller, and I'm happy with that approach anyway. But the act of getting it across to the poller - which should be a relatively quick operation(?) - requires the use of a worker thread from the common pool. This gets back to my original point. If the new connection could be pushed across to the poller ASAP (without handling the request) and without having to rely on the worker threads, then surely this is going to degrade more gracefully than the current situation, where a busy server is going to leave things in the backlog for quite some time. Which is a problem with a relatively small backlog.
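(For reference, the "falsifies the flag" bit I mentioned is, as far as I recall, something along these lines in the endpoint init - I'm quoting from memory rather than the exact source, so treat the exact names as approximate:)

    // If the O/S (e.g. Windows) reports the TCP_DEFER_ACCEPT socket option as
    // not implemented, quietly turn deferAccept off so the non-deferred branch
    // above is the one that runs.
    if (deferAccept) {
        if (Socket.optSet(serverSock, Socket.APR_TCP_DEFER_ACCEPT, 1) == Status.APR_ENOTIMPL) {
            deferAccept = false;
        }
    }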
In the Tomcat branch, there is code to have multiple acceptor threads, with a remark that it doesn't seem to work that well if you do. So that being the case, why not push it straight across to the poller in the context of the acceptor thread? ...MT
Re: Tomcat Native: APR Http Endpoint handler behavior under Windows
>> Why is poller performance bad in Windows? Is that a consequence of the
> I've been told it uses select, which works only on 64 sockets at a time. So if you have a large poller, then the poll call performance degrades. Mladen is really the only one who can answer Windows questions. (Personally, I think it is a lost cause for really high scalability.)

Hmm...I looked into this...the limit of 64 is an MS compile-time thing ... and in APR it appears that the constant is actually being re-defined to 1024 for WIN32 in the poller code. Hence I would not expect that under Windows this would actually be too much of an issue...1024 sockets per poller doesn't seem an unreasonable idea... in which case (if true) the performance implication of having a small number of poller threads is completely dwarfed by the DoS performance issue I have raised here. Be that as it may, this is a digression from my main issue...

> I think APR is by far the best technology for implementing Servlet 3.0, but I'm not so sure beyond that. The Vista+ enhancements need a new major version of APR, so it will take some time (too much IMO).

Yep - we have to work with what we've got... (Sidebar: to your knowledge, is there a working implementation of Servlet 3.0 yet in JBoss?)

>> The default value of deferAccept is true, but on Windows this option is not supported by the TCP/IP stack, so there is code that falsifies the flag if this is the case. In which case the socket is added straight to the poller, and I'm happy with that approach anyway. But the act of getting it across to the poller - which should be a relatively quick operation(?) - requires the use of a worker thread from the common pool. This gets back to my original point. If the new connection could be pushed across to the poller ASAP (without handling the request) and without having to rely on the worker threads, then surely this is going to degrade more gracefully than the current situation, where a busy server is going to leave things in the backlog for quite some time. Which is a problem with a relatively small backlog.
> Yes, but (as I said in my previous email):
> - Poller.add does sync. This is bad for the Acceptor thread, but it might be OK.
> - Socket options also does the SSL handshake, which is not doable in the Acceptor thread.
> So (as I was also saying) if there is no SSL and deferAccept is false, it is possible to have a configuration option to have the Acceptor thread set the options and put the socket in the poller without using a thread from the pool. That is, if you tested it and you found it worked better for you.

I might give that a try...I'm less concerned about SSL as I would not anticipate SSL connection floods in the scenarios I'm considering. Otherwise, even with Poller.add doing a sync, surely that is still generally an improved situation versus having to contend for worker threads to get the connection open?

> Also, you are supposed to have a large maxThreads in the pool if you want to scale. Otherwise, although it might work well 99% of the time since thread use is normally limited due to the poller for keepalive, it's very easy to DoS your server. BTW, it's rather cheap (on Linux, at least ;) ).

You're quite right...with a small number of worker threads, as things stand it would be easy to get DoS for /new connections/...that is exactly my point. I'm specifically looking at Comet situations here...large numbers of long-lived connections potentially.
But I wouldn't want maxThreads to be more than a few hundred to service several thousand connections...that was one major reason for using NIO/APR.

>> In the Tomcat branch, there is code to have multiple acceptor threads,
> That stuff wasn't really implemented. I don't think it's such a good idea to be too efficient at accepting and polling, if all it's going to do is blow up the app server (which would probably be even more challenged by the burst than the connector). The network stack is actually a decent place to smooth things out a little (or you could implement a backlog as part of the accept process).

Maybe. I agree the network stack is not a bad place to handle it, except for the fact that stupid Windows wants to send a reset if the (size-limited) backlog fills. If you implement an application backlog on top of the O/S backlog, to me that is much the same as moving it straight off to the poller...just a question of which queue it sits in. I'll see if I can hack the AprEndpoint to test out what would happen. Cheers, MT
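P.S. By "an application backlog on top of the O/S backlog" I mean something roughly like the sketch below. This is purely illustrative: the queue, its size, and the two helper methods are hypothetical, not anything that exists in AprEndpoint today, and the surrounding fields (running, serverSock, setSocketOptions, getPoller) are assumed from the existing endpoint:

    // A bounded in-JVM queue sitting between the acceptor and everything else.
    // The acceptor's only job becomes draining the (small, Windows-capped)
    // listen backlog into this queue, so a connection burst is absorbed here
    // instead of being reset by the TCP stack.
    private final java.util.concurrent.BlockingQueue<Long> acceptBacklog =
            new java.util.concurrent.ArrayBlockingQueue<Long>(10000);

    // Acceptor thread body: accept as fast as possible, queue, repeat.
    protected void acceptLoop() throws Exception {
        while (running) {
            long socket = Socket.accept(serverSock);
            if (!acceptBacklog.offer(Long.valueOf(socket))) {
                // Our own overflow policy, rather than a TCP reset from the stack.
                Socket.destroy(socket);
            }
        }
    }

    // Feeder thread body: drain the queue into the poller at its own pace.
    protected void feedPoller() throws InterruptedException {
        while (running) {
            long socket = acceptBacklog.take().longValue();
            if (setSocketOptions(socket)) {
                getPoller().add(socket);
            } else {
                Socket.destroy(socket);
            }
        }
    }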
Re: Tomcat Native: APR Http Endpoint handler behavior under Windows
Hi Mladen...I'd also be interested in your thoughts about the main aspect of my question concerning JBossWeb behaviour with APR in regard to new connection establishment. Specifically, because on Win32 the backlog queue is limited to 200, it's easy to get not-very-graceful DoS behavior from a relatively small flood of new connections if the worker threads are all rather busy, because a worker thread is needed to get the connection open. Assume that SSL is switched off. I proposed that the socket be moved over to the poller straight away (with a deferred accept on the actual request), and that this be done in the context of the acceptor thread. Cheers...MT

Mladen Turk wrote:
> Remy Maucherat wrote:
>>> Why is poller performance bad in Windows? Is that a consequence of the
>> I've been told it uses select, which works only on 64 sockets at a time. So if you have a large poller, then the poll call performance degrades. Mladen is really the only one who can answer Windows questions. (Personally, I think it is a lost cause for really high scalability.)
> Well, we are forcing it to 1024 by setting FD_SETSIZE before including the socket headers. Anyhow, this still requires multiple poller threads. Vista+ Windows has WSAPoll, which is equivalent to unix poll even at the API level, and it doesn't impose any limitation on the pollset size. I've tested it with 32K sockets, and there is no performance degradation caused by the size like there is with the select implementation. I've added the WSAPoll support to APR2, but there are some other things with APR2 that still have to be resolved before we can create tomcat-native 2. It merged with apr-util and introduced feature modules, so instead of one it might end up with tens of DLLs.
> Regards
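P.S. For anyone following along, the "multiple poller threads" workaround Mladen mentions amounts, on the Java side, to something like the sketch below. Illustrative only - the field and method names are approximations of the AprEndpoint ones, not the literal code:

    // Keep each pollset small so that select() on Windows never has to scan a
    // huge descriptor set; new sockets are simply spread across several pollers.
    private Poller[] pollers;        // sized from a pollerThreadCount-style setting
    private int pollerRoundRobin = 0;

    protected Poller getPoller() {
        // Round-robin new sockets across the pollers, keeping each individual
        // pollset (and hence each select() call) within the ~1024 limit.
        Poller poller = pollers[pollerRoundRobin];
        pollerRoundRobin = (pollerRoundRobin + 1) % pollers.length;
        return poller;
    }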