Check whether the OOM killer script was called. If it was, there will
be log files in the Solr logs directory that clearly relate to it. I've
seen nodes mysteriously disappear because of this, with no message in
the regular Solr logs. If that's the case, you need to increase your heap.
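
A minimal sketch of that check, in Python. The file name pattern
"solr_oom_killer-*.log" is my assumption about what the stock
bin/oom_solr.sh script writes, and the function name is made up for
illustration, so adjust both to match your install:

```python
from pathlib import Path


def find_oom_killer_logs(logs_dir, pattern="solr_oom_killer-*.log"):
    """Return files in logs_dir matching pattern, sorted by name.

    The default pattern is an assumption about how the OOM killer
    script names its log files; pass a different one if yours differs.
    """
    directory = Path(logs_dir)
    if not directory.is_dir():
        # A missing logs directory simply means nothing was found.
        return []
    return sorted(p for p in directory.glob(pattern) if p.is_file())
```

Call it with your Solr logs directory, for example
find_oom_killer_logs("/var/solr/logs") (that path is also an
assumption). An empty list means no matching file was found there,
which does not by itself rule out an out-of-memory kill.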

Erick

On Wed, Sep 18, 2019 at 8:21 AM Shawn Heisey <apa...@elyograg.org> wrote:
>
> On 9/18/2019 6:11 AM, Shawn Heisey wrote:
> > On 9/17/2019 9:35 PM, Hongxu Ma wrote:
> >> My questions:
> >>
> >>    *   Could this error be caused by a "long GC pause"? My Solr
> >> zkClientTimeout=60000
> >
> > It's possible.  I can't say for sure that this is the issue, but it
> > might be.
>
> A followup.  I was thinking about the interactions here.  It looks like
> Solr only waits four seconds for the leader election, and both of the
> pauses you mentioned are longer than that.
>
> Four seconds is probably too short a time to wait, and I do not think
> that timeout is configurable anywhere.
>
> > What version of Solr do you have, and what is your max heap?  The CMS
> > garbage collection tuning that Solr 5.0 and later include by default
> > is pretty good.  My G1 settings might do slightly better, but the
> > improvement won't be dramatic unless your existing command line has
> > absolutely no GC tuning at all.
>
> That question will be important.  If you already have our CMS GC tuning,
> switching to G1 probably is not going to solve this.  Lowering the max
> heap might be the only viable solution in that case, and depending on
> what you're dealing with, it will either be impossible or it will
> require more servers.
>
> Thanks,
> Shawn
