A couple of things I've learned along the way ...

I had a similar architecture where we used fairly low thresholds for
auto-commits with openSearcher=false. This keeps the tlog at a
reasonable size. Since those commits don't make new documents visible,
you'll still need something on the client side to send a commit that
opens a new searcher every N docs or M minutes.
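
For illustration only (the actual thresholds depend on your indexing
rate, so treat the numbers as placeholders), the autoCommit block in
solrconfig.xml for this kind of setup might look something like:

  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>25000</maxDocs>            <!-- placeholder value -->
      <maxTime>60000</maxTime>            <!-- 60 seconds -->
      <openSearcher>false</openSearcher>  <!-- hard commit, but no new searcher -->
    </autoCommit>
  </updateHandler>

The client then sends its own commit (e.g. an update request with
commit=true) on whatever schedule makes new documents visible quickly
enough for your application.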

Be careful with raising the ZK timeout, as that also determines how
quickly ZooKeeper can detect that a node has crashed (afaik). In other
words, it takes zkClientTimeout seconds for ZooKeeper to consider a
crashed node's ephemeral znodes "gone", so I'd caution against
increasing this value too much.
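
For reference, the setting in question is zkClientTimeout. In a
4.x-era legacy-format solr.xml it's an attribute on <cores>, roughly
like this (the value shown is just an example):

  <cores adminPath="/admin/cores" defaultCoreName="collection1"
         zkClientTimeout="${zkClientTimeout:15000}">
    <core name="collection1" instanceDir="collection1" />
  </cores>

Whatever value you pick is also roughly how long a dead node can go
unnoticed, since ZooKeeper has to wait that long without a heartbeat
before it expires the session and drops the ephemeral znodes.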

The other thing to be aware of is the leaderVoteWait safety mechanism
... you might see log messages that look like:

2013-06-24 18:12:40,408 [coreLoadExecutor-4-thread-1] INFO
solr.cloud.ShardLeaderElectionContext  - Waiting until we see more
replicas up: total=2 found=1 timeoutin=139368

From Mark M: This is a safety mechanism - you can turn it off by
configuring leaderVoteWait to 0 in solr.xml. This is meant to protect
the case where you stop a shard or it fails and then the first node to
get started back up has stale data - you don't want it to just become
the leader. So we wait to see everyone we know about in the shard up
to 3 or 5 min by default. Then we know all the shards participate in
the leader election and the leader will end up with all updates it
should have. You can lower that wait or turn it off with 0.

NOTE: I tried setting it to 0 and my cluster went haywire, so consider
just lowering it but not making it zero ;-)
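
If it helps, in the legacy-format solr.xml I believe leaderVoteWait is
also an attribute on <cores>, e.g. (60s here is just an example of
lowering it rather than disabling it):

  <!-- wait up to 60s for the other known replicas before electing a
       leader; 0 would disable the wait entirely -->
  <cores adminPath="/admin/cores" defaultCoreName="collection1"
         leaderVoteWait="60000">
    <core name="collection1" instanceDir="collection1" />
  </cores>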

A max heap of 8GB seems overly large to me for 8M docs per shard,
especially since you're using MMapDirectory to cache the primary data
structures of your index in OS cache. I have run shards with 40M docs
on a 6GB max heap and chose more aggressive cache eviction by using a
smallish LFU filter cache. This approach seems to spread the cost of GC
out over time vs. massive amounts of clean-up when a new searcher is
opened. With 8M docs, each cached filter is essentially a bitset with
one bit per document, so it only needs about 1 MB of memory; it seems
like you could run with a smaller heap. I'm not a GC expert, but I
found that a smaller heap and more aggressive cache eviction reduced
full GCs (and how long they run) on my Solr instances.
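
If you want a concrete starting point, the smallish LFU filter cache I
mean is just the filterCache entry in solrconfig.xml using
solr.LFUCache with a low size; the numbers below are illustrative, not
a recommendation:

  <filterCache class="solr.LFUCache"
               size="64"
               initialSize="64"
               autowarmCount="16" />

At roughly 1 MB per cached filter on an 8M-doc shard, a size of 64
caps the filter cache around 64 MB, which is what makes the smaller
heap workable.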

On Mon, Jul 22, 2013 at 8:09 AM, Shawn Heisey <s...@elyograg.org> wrote:
> On 7/22/2013 6:45 AM, Markus Jelsma wrote:
>> You should increase your ZK time out, this may be the issue in your case. 
>> You may also want to try the G1GC collector to keep STW under ZK time out.
>
> When I tried G1, the occasional stop-the-world GC actually got worse.  I
> tried G1 after trying CMS with no other tuning parameters.  The average
> GC time went down, but when it got into a place where it had to do a
> stop-the-world collection, it was worse.
>
> Based on the GC statistics in jvisualvm and jstat, I didn't think I had
> a problem.  The way I discovered that I had a problem was by looking at
> my haproxy load balancer -- sometimes requests would be sent to a backup
> server instead of my primary, because the ping request handler was
> timing out on the LB health check.  The LB was set to time out after
> five seconds.  When I went looking deeper with the GC log and some other
> tools, I was seeing 8-10 second GC pauses.  G1 was showing me pauses of
> 12 seconds.
>
> Now I use a heavily tuned CMS config, and there are no more LB switches
> to a backup server.  I've put some of my own information about my GC
> settings on my personal Solr wiki page:
>
> http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning
>
> I've got an 8GB heap on my systems running 3.5.0 (one copy of the index)
> and a 6GB heap on those running 4.2.1 (the other copy of the index).
>
> Summary: Just switching to the G1 collector won't solve GC pause
> problems.  There's not a lot of G1 tuning information out there yet.  If
> someone can come up with a good set of G1 tuning parameters, G1 might
> become better than CMS.
>
> Thanks,
> Shawn
>
