Update: issue resolved! Cranking up maxThreads did the trick. The default is 200; I went with 2500 for grins and giggles and things work great. Now, even if I overwhelm the box with too many requests, it continues to respond once the requests back off. And when I slam the server right after a restart (without any warmup queries), it behaves the way I wanted: queries are slow to respond (upwards of 30s) for the first couple of minutes, then they all drop under 25ms and response times settle down quickly (obviously as the cache warms up).

Christopher, I could have sworn I tried upping acceptCount, maxConnections and maxThreads in my testing, but with your prodding I tried it again, and that was the solution.

I have a couple of quick follow-up questions:

- What is the downside of setting maxThreads, acceptCount and maxConnections really high? Obviously the defaults are there for a reason, and I'd like to know what the reasoning is.
- Any reason I shouldn't use Tomcat? I just went with it because I figured it was extremely mature and was easy to install with apt-get :)

I'll probably toy with APR as suggested by Michael, as I like the idea of a non-blocking connector.

--
Nate Fox
Sr Systems Engineer
o: 310.658.5775
m: 714.248.5350

On Tue, Mar 26, 2013 at 5:56 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:

> : * When I set solrmeter to run 4000 queries/min, it will handle a few
> :   hundred queries and then tomcat will stop responding completely to
> :   requests (even though according to lsof -i it is still listening and
> :   the java process is still running).
>
> have you tried using jstack to generate a thread dump of the server to
> see what it's doing?
>
> : * When I set solrmeter to run 1000 queries/min it runs fine. I can stop
> :   solrmeter after a couple of minutes at that pace and then run at
> :   4000/min without issue.
> :
> : It's as if it needs a ramp up time? Also, I noticed (regardless of ramp
> : up) that my setup cannot handle 8000/min. The reaction at 8k/min is the
> : same as if I were to run 4k/min without the ramp up. Of note, only the
> : shard that solrmeter is pointed to stops responding. The other shard
> : hums along without incident.
>
> Just to clarify: you're running a 2 node SolrCloud cluster, where each
> node contains a unique shard, and pointing solrmeter at a single node
> for the queries -- correct?
>
> Here's my hunch: you are probably hitting the limit on the number of
> concurrent connections tomcat will allow (whatever it may be configured
> to in your setup).
>
> In the 8000/min case, you are probably maxing out that limit with the
> direct connections you issue from solrmeter to that single node.
>
> In the 4000/min case, each request you issue causes that single node to
> fire off multiple requests to each shard, and since each shard exists on
> only one node, you are guaranteeing that you double the number of
> concurrent requests hitting that first node.
>
> In the case where you start w/ 1000/min, and then later ramp up to
> 4000/min, you are probably causing enough of the queries to be warmed up
> that they are in the caches on both nodes, so they can be served really
> fast and return their results before you reach that max number of
> concurrent connections after you ramp up.
>
> I'm no tomcat expert, but skimming the docs, you may want to look at
> settings like acceptCount, maxConnections, maxThreads, etc...
>
> -Hoss
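
For anyone finding this thread later, here is a rough back-of-the-envelope check of Hoss's hunch using Little's law (requests in flight is roughly arrival rate times the time each request takes). It is only a sketch: the 4000/min rate, the 200-thread default, and the 30s / 25ms latencies are the figures from this thread, while the fan-out factor of 2 is an assumption based on the "double the number of concurrent requests" remark, and the function name is made up for illustration. Treating the worst-case 30s as the cold-cache latency is also a simplification.

```python
def concurrent_requests(queries_per_min, latency_s, fan_out=1):
    """Little's law: requests in flight = arrival rate (per second) * latency (seconds).

    fan_out is a hypothetical multiplier for the extra per-shard requests
    that a distributed query causes on the node being queried.
    """
    return queries_per_min / 60.0 * latency_s * fan_out


DEFAULT_MAX_THREADS = 200  # Tomcat default as quoted in this thread

scenarios = [
    ("cold caches, 4000/min", 4000, 30.0),   # "upwards of 30s" per query
    ("warm caches, 4000/min", 4000, 0.025),  # "under 25ms" per query
]

for label, rate, latency in scenarios:
    # fan_out=2 is an assumption taken from the "double" estimate above
    in_flight = concurrent_requests(rate, latency, fan_out=2)
    verdict = "exceeds" if in_flight > DEFAULT_MAX_THREADS else "fits within"
    print(f"{label}: ~{in_flight:.1f} in flight, {verdict} {DEFAULT_MAX_THREADS} threads")
```

Under these assumptions the cold case needs thousands of requests in flight at once while the warm case needs only a handful, which would be consistent with the box locking up at 4000/min from a cold start but coping fine after a ramp-up at 1000/min.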