Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

Erick Erickson Thu, 11 Oct 2018 07:36:12 -0700

I have to echo what others have said. An 80G heap is way outside the norm,
especially when you consider the size of your indexes and the number of docs.

Understanding why you think you need that much heap should be your top
priority. As has already been suggested, ensuring docValues are set for all
fields that are used for sorting, faceting and grouping is a must. Deep paging
can hurt too.
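
A rough way to audit the docValues point is to scan the schema for the fields
you sort, facet or group on. The sketch below is only an illustration: the
schema fragment and field names are made up, and it checks only the explicit
docValues attribute on each <field>, so a default inherited from the fieldType
or a dynamicField match would not be seen.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment; field names and types are illustrative only.
SCHEMA = """
<schema name="example" version="1.6">
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="category" type="string" indexed="true" stored="false" docValues="true"/>
  <field name="price" type="pfloat" indexed="true" stored="true"/>
  <field name="brand" type="string" indexed="true" stored="true" docValues="false"/>
</schema>
"""


def fields_without_docvalues(schema_xml, field_names):
    """Return the requested fields that do not explicitly set docValues="true".

    Fields that are not declared at all are reported too, since they need a
    closer look (they may be covered by a dynamicField).
    """
    root = ET.fromstring(schema_xml)
    declared = {f.get("name"): f for f in root.iter("field")}
    missing = []
    for name in field_names:
        field = declared.get(name)
        if field is None or field.get("docValues") != "true":
            missing.append(name)
    return missing


# Assumed list of fields used for sorting, faceting or grouping.
print(fields_without_docvalues(SCHEMA, ["category", "price", "brand"]))
```

Anything this reports is a candidate for docValues="true" followed by a full
reindex.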

In addition, I'd check the cache settings. Do you have a huge filterCache?
What about the other caches? One common mistake is to set the cache sizes
very high; in your setup I'd stick with 512 to start.
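
To make the cache check concrete, here is a small sketch that flags caches
larger than 512 entries in a solrconfig.xml fragment. The fragment and its
sizes are invented for illustration, not taken from your setup. The second
function gives a crude upper bound for the filterCache, assuming every entry
is a full bitset of maxDoc bits; real entries for small result sets are
cheaper, so treat it as a worst case only.

```python
import xml.etree.ElementTree as ET

# Hypothetical solrconfig.xml fragment; the sizes are made up.
SOLRCONFIG = """
<config>
  <query>
    <filterCache class="solr.FastLRUCache" size="16384" initialSize="4096" autowarmCount="1024"/>
    <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
    <documentCache class="solr.LRUCache" size="4096" initialSize="512" autowarmCount="0"/>
  </query>
</config>
"""

CACHE_TAGS = ("filterCache", "queryResultCache", "documentCache")


def oversized_caches(config_xml, limit=512):
    """Map cache name to configured size for every cache above `limit`."""
    root = ET.fromstring(config_xml)
    result = {}
    for tag in CACHE_TAGS:
        for element in root.iter(tag):
            size = int(element.get("size", "0"))
            if size > limit:
                result[tag] = size
    return result


def filter_cache_worst_case_bytes(size, max_doc):
    """Upper bound: `size` entries, each a bitset of `max_doc` bits."""
    return size * (max_doc // 8)


print(oversized_caches(SOLRCONFIG))
# max_doc below is an assumed figure, not one from this thread.
print(filter_cache_worst_case_bytes(512, 100_000_000) / 1e9, "GB worst case")
```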

Without _data_ it's hard to say, so unless some of those settings help, the
next thing I'd do is take a heap dump or put a profiler on the JVM and see
where the heap is actually allocated.
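
For the heap dump, one option is the JDK's jmap tool. The sketch below only
builds the command line; the pid and output file name are placeholders. Keep
in mind that the "live" option forces a full GC first, and that dumping a heap
of this size pauses the JVM and produces a very large file, so I'd do it on a
node that is out of rotation.

```python
import shlex


def heap_dump_command(pid, out_file="solr-heap.hprof", live_only=True):
    """Build a jmap invocation that writes a binary heap dump for `pid`."""
    options = ["live"] if live_only else []
    options += ["format=b", "file=" + out_file]
    return ["jmap", "-dump:" + ",".join(options), str(pid)]


# 12345 is a placeholder pid; use the pid of the Solr JVM.
command = heap_dump_command(12345)
print(" ".join(shlex.quote(part) for part in command))

# To actually run it (as the same OS user that owns the Solr process):
#   import subprocess
#   subprocess.run(command, check=True)
```

The resulting .hprof file can then be opened in a heap analyzer to see which
object types dominate.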

It's quite possible that you arrived at 80G based on some mistaken
assumptions, and once those are cleared up you can reduce your heap a lot.
You say "through a lot of trial and error"; what exactly happens when you
use, say, a 32G heap? OOMs? Slowdowns?

A heap this large is also starving your OS cache, where most of the Lucene
index data is stored, see:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
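
As back-of-the-envelope arithmetic for the OS cache point: whatever RAM the
JVM heap takes is RAM the OS cannot use to cache index files. In the sketch
below only the 80G and 32G heap sizes come from this thread; the 128 GB of RAM,
the 100 GB index and the 2 GB allowance for everything else are assumptions for
illustration.

```python
def page_cache_budget_gb(total_ram_gb, heap_gb, other_gb=2.0):
    """RAM left for the OS page cache after the heap and other processes."""
    return max(total_ram_gb - heap_gb - other_gb, 0.0)


def cached_fraction(index_gb, total_ram_gb, heap_gb, other_gb=2.0):
    """Fraction of the on-disk index that could fit in the page cache."""
    if index_gb <= 0:
        return 1.0
    budget = page_cache_budget_gb(total_ram_gb, heap_gb, other_gb)
    return min(budget / index_gb, 1.0)


# Assumed machine: 128 GB RAM holding a 100 GB index.
for heap_gb in (80, 32):
    fraction = cached_fraction(100, 128, heap_gb)
    print(f"{heap_gb}G heap: about {fraction:.0%} of the index fits in OS cache")
```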

Best,
Erick

On Thu, Oct 11, 2018 at 4:42 AM yasoobhaider <[email protected]> wrote:
>
> Hi Shawn, thanks for the inputs.
>
> I have uploaded the gc logs of one of the slaves here:
> https://ufile.io/ecvag (should work till 18th Oct '18)
>
> I uploaded the logs to gceasy as well and it says that the problem is
> consecutive full GCs. According to the solution they have mentioned,
> increasing the heap size is a solution. But I am already running on a pretty
> big heap, so don't think increasing the heap size is going to be a long term
> solution.
>
> From what I understood from a bit more looking around, this is Concurrent
> Mode Failure for CMS. I found an old blog mentioning the use of
> -XX:CMSFullGCsBeforeCompaction=1 to make sure that compaction is done prior
> to next collection trigger. So if it is a fragmentation problem, this will
> solve it I hope.
>
> I will also try out using docValues as suggested by Ere on a couple of
> fields on which we make a lot of faceting queries to reduce memory usage on
> the slaves.
>
> Please share any ideas that you may have from the gc logs analysis.
>
> Thanks
> Yasoob
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
