On 8/1/2014 3:17 PM, Ethan wrote:
> Our SolrCloud setup: 3 nodes with Zookeeper, 2 running SolrCloud.
>
> Current dataset size is 97GB. The JVM heap is 10GB, but only 6GB is
> used (to keep garbage collection times down). RAM is 96GB.
>
> Our softcommit is set to 2 secs and hardcommit is set to 1 hour.
>
> We are suddenly seeing high disk and network IOs. During a search the
> leader usually logs one more query with its node name and shard
> information -
>
> "{NOW=1406911121656&shard.url=
> chexjvassoms006.ch.expeso.com:52158/solr/Main......
> ids=-9223372036371158536,-9223372036373602680,-9223372036618637568,-9223372036371157736......&distrib=false&timeAllowed=2000&wt=javabin&isShard=true"
>
> The actual query didn't have any of this information. This started
> just today and is causing a lot of latency issues. We have had nodes
> go down several times today.
That query is from distributed search -- it's the query that actually retrieves the documents from the shards after the results of the initial query have been tabulated to determine which documents are needed. The "ids" parameter is what tells me this.

Do you know how long those autoSoftCommit operations take? If you are indexing frequently enough and the commits are taking longer than the configured interval of two seconds, you may have multiple commits happening at the same time. Soft commits are faster and use fewer resources than hard commits, but they aren't even close to free -- they're still going to hit the disk and memory very hard.

One thing to note: an hour may be too long for the hard commit interval. A hard commit starts a new transaction log, so on restart, Solr will replay all of the updates that occurred in the last hour. If your update rate is low, that might be acceptable, but if the update rate is high, that could be a LOT of updates, making Solr restarts *very* slow.

Thanks,
Shawn
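P.S. For reference, the commit intervals live in the updateHandler section of solrconfig.xml. The numbers below are only illustrative -- a much shorter hard commit with openSearcher=false and a less aggressive soft commit -- and are not a recommendation for your exact indexing rate:

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- Hard commit: flushes to disk and rolls over the transaction
         log; openSearcher=false keeps it from opening a new searcher,
         so it stays relatively cheap even when run often. -->
    <autoCommit>
      <maxTime>60000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <!-- Soft commit: controls how soon newly indexed documents become
         visible to searches. -->
    <autoSoftCommit>
      <maxTime>30000</maxTime>
    </autoSoftCommit>
  </updateHandler>

With openSearcher=false, frequent hard commits keep the transaction log small (so restart replay stays short) without paying the cost of opening a new searcher; document visibility is then governed entirely by the soft commit interval.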