Azazel K <[email protected]> wrote:
> We have solr cluster with 2 shards running 2 nodes on each shard.
> They are beefy physical boxes with index size of 162 GB , RAM of
> about 96 GB and around 153M documents.

There is a non-trivial overhead for sharding: Using a single shard increases 
throughput. Have you tried with 1 shard to see if the latency is acceptable for 
that?

> Two times this week we have seen the thread usage spike from the
> usual 1000 to 4000 on all nodes at the same time and bring down
> the cluster.

First guess: You are updating too frequently and hitting multiple overlapping 
searchers, deteriorating performance which leads to more overlapping searchers 
and so on. Try looking in the log:
https://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F


Anyway, 1000 threads sounds high. How many CPUs are on your machines? 32 on 
each? That is a total of 128 CPUs for your 4 machines, meaning that each CPU is 
working on about 10 concurrent requests. They might be competing for resources: 
Have you tried limiting the amount of concurrent request and using a queue? 
That might give you better performance (and lower heap requirements a bit).

- Toke Eskildsen

Reply via email to