On 4/8/2014 5:30 PM, Utkarsh Sengar wrote:
> I see a sudden drop in throughput once every 3-4 days. The "downtime" lasts
> about 2-6 minutes and things stabilize after that.
> 
> But I am not sure what is causing the problem.
> 
> I have 3 shards with 20GB of data on each shard.
> Solr dashboard: http://i.imgur.com/6RWT2Dj.png
> Newrelic graphs during the downtime of about 4 hours:
> http://i.imgur.com/9vhKiB2.png
> JVM memory graph says it's normal: http://i.imgur.com/pAycgdC.png
> 
> I thought it was GC pauses, but those should show up in the newrelic logs.
> 
> How can I go about investigating this problem? I am running solr 4.4.0, and I
> don't see a strong reason to upgrade yet.

Lots of questions:

How many total machines?  What is your replicationFactor?  Does each
machine have one shard replica and therefore 20GB of total index data,
or if you add up all the index directories for the cores on each
machine, is there more than 20GB of data?

What options are you passing to your JVM when you start the servlet
container that runs Solr?

The dashboard says that this machine has 24GB of RAM and a 9GB heap.  Is
this the case for all machines?  Is there any software other than Solr
on the machine?

If it's a linux/unix machine, can you run top, press shift-M to sort by
memory, and grab a screenshot?  If it's a Windows machine, a similar
list should be available in the task manager, but it must include all
processes for all users on the whole machine, and it would be best if it
showed virtual memory as well as private.
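
If a screenshot is awkward to grab on a Linux machine, a small script
along these lines should give roughly the same information as top sorted
by memory. This is only a sketch: it assumes the usual Linux
/proc/<pid>/status layout (Name, VmRSS and VmSize lines), and the function
names are made up for illustration.

```python
import os


def parse_status(text):
    """Pull the process name plus VmRSS and VmSize (in kB) out of the
    text of a /proc/<pid>/status file. Missing fields stay at 0."""
    info = {"Name": "?", "VmRSS": 0, "VmSize": 0}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            info["Name"] = value.strip()
        elif key in ("VmRSS", "VmSize"):
            # The value looks like "   123456 kB"; keep the number only.
            info[key] = int(value.split()[0])
    return info


def processes_by_memory(proc_root="/proc"):
    """Return (pid, name, rss_kb, virt_kb) tuples for every readable
    process, largest resident size first."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                info = parse_status(f.read())
        except OSError:
            continue  # process exited or is not readable
        rows.append((int(entry), info["Name"], info["VmRSS"], info["VmSize"]))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


if __name__ == "__main__":
    # Print the 15 largest processes by resident memory.
    for pid, name, rss, virt in processes_by_memory()[:15]:
        print(f"{pid:>7} {name:<20} RES={rss // 1024:>6} MB  VIRT={virt // 1024:>7} MB")
```

Run it as root so that processes owned by every user are visible.
Kernel threads have no VmRSS entry and will show up as 0.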

Thanks,
Shawn
