In addition to the insightful pointers from Zisis and Erick, I'd like to
mention the approach in the link below, which I generally use to pinpoint
exactly which threads are causing a CPU spike. Knowing that, you can tell
which part of Solr (a search thread, GC, an update thread, etc.) is taking
the CPU and work out a mitigation strategy accordingly (e.g., if it's a GC
thread, try tuning the GC params or switching to G1 GC). It just helps take
the guesswork out of the many possible causes. Of course, the suggestions
you've already received are best practices and should be taken into
consideration regardless.

https://backstage.forgerock.com/knowledge/kb/article/a39551500

The hex number the author talks about in the link above is the native
thread ID (it shows up as the "nid" field in a jstack thread dump).
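
For reference, the workflow from that article boils down to roughly the
following (a rough sketch assuming a Linux host with the JDK's jstack on
the PATH; the angle-bracket values are placeholders, and exact flags and
paths may differ on your setup):

  # 1. Show the native threads of the Solr JVM, sorted by CPU usage
  top -H -p <solr-pid>

  # 2. Convert the busy thread's TID (decimal) to hex
  printf '0x%x\n' <tid>

  # 3. Take a thread dump and find that hex value in the "nid" field
  jstack <solr-pid> > /tmp/solr-threads.txt
  grep 'nid=<hex-tid>' /tmp/solr-threads.txt

The stack trace attached to that thread then tells you whether it is a GC
thread, a searcher thread, an update/commit thread, and so on.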

Best,
Rahul


On Wed, Oct 14, 2020 at 8:00 AM Erick Erickson <erickerick...@gmail.com>
wrote:

> Zisis makes good points. One other thing is I’d look to
> see if the CPU spikes coincide with commits. But GC
> is where I’d look first.
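>
> A quick, rough way to check the commit correlation (log location and
> format vary by Solr version and logging config) is to pull the commit
> timestamps out of solr.log and line them up with your CPU graphs:
>
>   grep -i "commit" server/logs/solr.log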
>
> Continuing on with the theme of caches, yours are far too large
> at first glance. The default is, indeed, size=512. Every time
> you open a new searcher, you’ll be executing 128 queries
> for autowarming the filterCache and another 128 for the queryResultCache.
> Autowarming alone might account for it. I’d reduce the
> size back to 512, set the autowarm count nearer 16,
> and monitor the cache hit ratio. There’s little or no benefit
> in squeezing the last few percent from the hit ratio. If your
> hit ratio is small even with the settings you have, then your caches
> don’t do you much good anyway so I’d make them much smaller.
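>
> For concreteness, that would look something like this in the <query>
> section of solrconfig.xml (illustrative values following the numbers
> above; the cache class depends on your Solr version: older configs ship
> with solr.FastLRUCache or solr.LRUCache, newer ones with
> solr.CaffeineCache):
>
>   <filterCache class="solr.CaffeineCache"
>                size="512"
>                initialSize="512"
>                autowarmCount="16"/>
>   <queryResultCache class="solr.CaffeineCache"
>                     size="512"
>                     initialSize="512"
>                     autowarmCount="16"/>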
>
> You haven’t told us how often your indexes are
> updated, which will be a significant CPU hit due to
> your autowarming.
>
> Once you’re done with that, I’d then try reducing the heap. Most
> of the actual searching is done in Lucene via MMapDirectory,
> which resides in the OS memory space. See:
>
> https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
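>
> If you do experiment with a smaller heap, it is usually just a matter of
> setting SOLR_HEAP in solr.in.sh and watching GC behavior afterwards
> (illustrative value only; size it against your GC logs rather than any
> rule of thumb):
>
>   SOLR_HEAP="8g"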
>
> Finally, if it is GC, consider G1GC if you’re not using that
> already.
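>
> If you do try G1, the JVM flags go in GC_TUNE in solr.in.sh; a minimal,
> illustrative starting point (tune the pause target to your latency needs):
>
>   GC_TUNE="-XX:+UseG1GC -XX:MaxGCPauseMillis=250"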
>
> Best,
> Erick
>
>
> > On Oct 14, 2020, at 7:37 AM, Zisis T. <zist...@runbox.com> wrote:
> >
> > The values you have for the caches and maxWarmingSearchers do not look
> > like the defaults. Cache sizes are 512 for the most part, and
> > maxWarmingSearchers is 2 (if yours isn't, limit it to 2).
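> >
> > For reference, that setting lives in the <query> section of
> > solrconfig.xml next to the caches (minimal sketch):
> >
> >   <maxWarmingSearchers>2</maxWarmingSearchers>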
> >
> > Sudden CPU spikes probably indicate GC issues. The # of documents you
> > have is small; are they huge documents? The # of collections is OK in
> > general, but since they are crammed into 5 Solr nodes the memory
> > requirements might be bigger, especially if the filter and other caches
> > get populated with 50K entries.
> >
> > I'd first go through the GC activity to make sure that this is not
> > causing the issue. The fact that you lose some Solr servers is also an
> > indicator of large GC pauses, which can create problems when Solr
> > communicates with ZooKeeper.
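> >
> > A rough way to eyeball the pause times (assuming Solr's default GC log
> > under server/logs; the file name and line format vary by Solr and JDK
> > version, so adjust the pattern as needed):
> >
> >   grep -h "Pause" server/logs/solr_gc.log*
> >
> > If the longest pauses approach your zkClientTimeout, that would explain
> > the servers dropping out.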
> >
> >
> >
>
>
