On Wed, Apr 6, 2011 at 11:16 PM, Mark Thomas <ma...@apache.org> wrote:
> On 05/04/2011 10:50, Tim Whittington wrote:
>> Is what's actually going on more like:
>>
>> APR: use maxConnections == pollerSize (smallest will limit, but if
>> pollerSize < maxConnections then the socket backlog effectively won't
>> be used as the poller will keep killing connections as they come in)
>>
>> NIO: use maxConnections to limit 'poller size'
>>
>> HTTP: use maxConnections. For keep alive situations, reduce
>> maxConnections to something closer to maxThreads (the default config
>> is 10,000 keepalive connections serviced by 200 threads with a 60
>> second keepalive timeout, which could lead to some large backlogs of
>> connected sockets that take 50 minutes to get serviced)
>
> This is indeed the case. There are a number of issues with the current
> BIO implementation.
>
> 1. Keep-alive timeouts
> As per the TODO comment in Http11Processor#process(), the keep-alive
> timeout needs to take account of the time spent in the queue.
>
> 2. The switch to a queue does result in the possibility of requests
> with data being delayed by requests without data in keep-alive.
>
> 3. HTTP pipe-lining is broken (this is bug 50957 [1]). The sequence is:
> - client sends 1 complete request and part of a second request
> - tomcat processes the first request
> - tomcat recycles the input buffer
> - client sends the remainder of the second request
> - tomcat sees an incomplete request and returns a 505
> There are variations of this depending on exactly how much of the
> second request has been read by Tomcat at the point the input buffer
> is recycled. Note that r1086349 [2] has protected against the worst of
> what could go wrong (mixed responses etc.) but has not fixed the
> underlying issue.
>
> The change that triggered all of the above issues is r822234 [3].
>
> Reverting r822234 isn't an option as the async code depends on
> elements of it.
>
>
> The fix for issue 1 is simple so I do not intend to discuss it further.
>
>
> The fix for issue 2 is tricky. The fundamental issue is that to
> resolve it and to keep maxConnections >> maxThreads we need NIO-like
> behaviour from a BIO socket, which just isn't possible.
> Fixing 1 will reduce the maximum length of delay that any one request
> might experience, which will help, but it won't address the
> fundamental issue.
> For sockets in keep-alive, I considered trying to fake NIO behaviour
> by using a read with a timeout of 1ms, catching the
> SocketTimeoutException and returning them to the back of the queue if
> there is no data. The overhead of that looks to be around 2-3ms for a
> 1ms timeout. I'm worried about CPU usage but for a single thread this
> doesn't seem to be noticeable. More testing with multiple threads is
> required. The timeout could be tuned by looking at the current number
> of active threads, size of the queue etc., but it is an ugly hack.
> Returning to the pre [3] approach of disabling keep-alive once
> connections > 75% of threads would fix this at the price of no longer
> being able to support maxConnections >> maxThreads.
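To make the read-with-timeout idea concrete, the kind of probe loop being
described is roughly the sketch below. It's illustrative only: the queue,
the helper names and the overall shape are invented for the example, not
the actual JIoEndpoint/Http11Processor code.

import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of faking NIO-style polling with blocking sockets: probe each
// keep-alive socket with a very short SO_TIMEOUT and, if nothing has
// arrived, push it to the back of the queue for a later retry.
public class KeepAlivePollSketch {

    private final BlockingQueue<Socket> keepAliveQueue =
            new LinkedBlockingQueue<Socket>();

    // One iteration of a worker thread's loop.
    void pollOnce() throws InterruptedException {
        Socket socket = keepAliveQueue.take();
        try {
            socket.setSoTimeout(1); // probe timeout, in milliseconds
            PushbackInputStream in =
                    new PushbackInputStream(socket.getInputStream(), 1);
            int b = in.read();      // blocks for at most ~1ms
            if (b == -1) {
                socket.close();     // client closed the connection
            } else {
                in.unread(b);       // keep the byte for the request parser
                process(socket, in);
            }
        } catch (SocketTimeoutException ste) {
            keepAliveQueue.offer(socket); // no data yet: back of the queue
        } catch (IOException ioe) {
            try { socket.close(); } catch (IOException ignored) { /* ignore */ }
        }
    }

    // Placeholder for handing the connection to a real request processor.
    private void process(Socket socket, InputStream in) {
        // parse and service the next keep-alive request here
    }
}

The 2-3ms overhead mentioned above is paid once per idle socket per pass
through the queue, which is why the choice of timeout matters so much.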
Yeah, I went down this track as well before getting to the "Just use
APR/NIO" state of mind. It is an ugly hack, but might be workable if the
timeout is large enough to stop it being a busy loop on the CPU. With 200
threads, even a 100ms timeout would give you a 'reasonable' throughput.

Even if we do this, I still think maxConnections should be somewhat
closer to maxThreads than it is now if the BIO connector is being used.

> I thought of two options for issue 3:
> a) Assign a processor (+ input buffer, output buffer etc.) to a socket
> and don't recycle it until the socket is closed.
> - Increases memory requirements.
> - Fixes issue 3
> - Retains current request processing order.
>
> b) Check the input buffer at the end of the loop in
> Http11Processor#process() and process the next request if there is any
> data in the input buffer.
> - No increase in memory requirements.
> - Fixes issue 3
> - Pipelined requests will get processed earlier (before they would
> have been placed at the back of the request processing queue)
>
> I think option b) is the way to go to fix issue 3.

+1

It's unfair scheduling, but given issue 2 that's a fairly moot point
with the BIO connector.

> The fixes for 1 & 3 seem fairly straightforward and unless anyone
> objects, I'll go ahead (when I get a little time) and implement those.
> I think the fix for 2 needs some further discussion. What do folks
> think?
>
> Mark
>
> [1] https://issues.apache.org/bugzilla/show_bug.cgi?id=50957
> [2] http://svn.apache.org/viewvc?rev=1086349&view=rev
> [3] http://svn.apache.org/viewvc?view=rev&rev=823234
>
>> cheers
>> tim
>>
>> On Tue, Apr 5, 2011 at 8:51 PM, Tim Whittington <t...@apache.org> wrote:
>>> In the AJP standard implementation docs, the following are not
>>> mentioned, although they're properties of AbstractEndpoint and
>>> probably should work:
>>> - bindOnInit
>>> - maxConnections
>>> Am I right in assuming these should be possible in the AJP connector
>>> (my reading of the code indicates they are - just wanted to check if
>>> something arcane was going on)?
>>>
>>> If so I'll update the docs.
>>>
>>> cheers
>>> tim
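PS: to make option b) a bit more concrete, here's a toy illustration of
the "keep processing while the input buffer still has data" check at the
end of the process() loop. The InputBuffer class below is a stand-in
invented for the example (it is not Tomcat's InternalInputBuffer); only
the decision at the bottom of the loop is the point.

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Toy sketch of option b): after servicing a request, check whether the
// connection's read-ahead buffer already contains bytes of the next
// pipelined request and, if so, carry on processing instead of recycling
// the buffer and returning the socket to the queue.
public class PipelineSketch {

    // Toy input buffer: reads ahead from the stream and remembers leftovers.
    static class InputBuffer {
        private final InputStream in;
        private final byte[] buf = new byte[8192];
        private int pos = 0, end = 0;

        InputBuffer(InputStream in) { this.in = in; }

        boolean hasUnreadData() { return pos < end; }

        // Consume one "request"; a real parser may read past the end of the
        // current request into buf, which is exactly what pipelining causes.
        void parseAndServiceRequest() throws IOException {
            if (!hasUnreadData()) {
                end = in.read(buf);
                pos = 0;
                if (end < 0) throw new IOException("connection closed");
            }
            // pretend each request is terminated by '\n'
            while (pos < end && buf[pos++] != '\n') { /* consume */ }
            System.out.println("serviced one request");
        }
    }

    public static void main(String[] args) throws IOException {
        // Two pipelined "requests" arriving in a single read.
        InputBuffer buffer = new InputBuffer(
                new ByteArrayInputStream("GET /a\nGET /b\n".getBytes("US-ASCII")));

        boolean keepAlive = true;
        while (keepAlive) {
            buffer.parseAndServiceRequest();
            // Option b): only hand the socket back to the queue when no bytes
            // of a pipelined request are already sitting in the buffer.
            keepAlive = buffer.hasUnreadData();
        }
        System.out.println("no buffered data left - socket goes back to the queue");
    }
}

Running it services both pipelined "requests" from a single read before
the connection would be handed back to the queue, which is the behaviour
option b) is after.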