On Wed, 2011-09-28 at 12:58 +0200, Frederik Kraus wrote: > - 10 shards per server (needed for response times) running in a single tomcat > instance
Have you tested that sharding actually decreases response times in your case? I see the idea in decreasing response times with sharding at the cost of decreasing throughput, but the added overhead of merging is non-trivial. > - each query queries all 20 shards (distributed search) > > - each shard holds about 1.5 mio documents (small shards are needed due to > rather complex queries) > - all caches are warmed / high cache hit rates (99%) etc. > Now for some reason we cannot seem to fully utilize all CPU power (no disk > IO), ie. increasing concurrent users doesn't increase CPU-Load at a point, > decreases throughput and increases the response times of the individual > queries. It sounds as if there's a hard limit on the number of concurrent users somewhere. I am no expert in httpclient, but the blocked threads in your thread dump seems to indicate that they wait for connections to be established rather than for results to be produced. I seem to remember that tomcat has a default limit on 200 concurrent connections and with 10 shards/search, that is just 200 / (10 shard_connections + 1 incoming_connection) = 18 concurrent searches. > Also 1-2% of the queries take significantly longer: avg somewhere at 100ms > while 1-2% take 1.5s or longer. Could be garbage collection, especially since it shows under high load which might result in more old objects and thereby trigger full gc.