On 6/10/2019 3:24 AM, vishal patel wrote:
We have 27 collections, each with many schema fields. In production we receive many 
search requests and many index create/update requests, and most of the search 
requests involve sorting, faceting, grouping, and long queries.
Average heap usage is roughly 40GB, so we gave the JVM 80GB of memory.

Unless you've been watching an actual *graph* of heap usage over a significant amount of time, you can't learn anything useful from it.

And it's very possible that you can't get anything useful even from a graph, unless that graph is generated by analyzing a lengthy garbage collection log.
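Solr's stock startup scripts normally enable GC logging out of the box. If it has been turned off, something like the following in solr.in.sh would capture a long-running, rotated log; this is a sketch using Java 8 HotSpot flags (which match the log format in this thread), and the file counts and sizes are just illustrative:

```shell
# Sketch for solr.in.sh (Java 8 HotSpot flags; adjust sizes to taste).
# Rotation keeps several days of history without unbounded disk use.
GC_LOG_OPTS="-verbose:gc \
  -XX:+PrintGCDetails \
  -XX:+PrintGCDateStamps \
  -XX:+PrintGCTimeStamps \
  -XX:+PrintGCApplicationStoppedTime \
  -XX:+UseGCLogFileRotation \
  -XX:NumberOfGCLogFiles=9 \
  -XX:GCLogFileSize=20M"
```

With rotation sized like this, you can let Solr run for days and still have the full pause history available for analysis.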

Our directory factory in solrconfig.xml:
<directoryFactory name="DirectoryFactory"
                     class="${solr.directoryFactory:solr.MMapDirectoryFactory}">
</directoryFactory>

When using MMAP, one of the memory columns should show a total that's approximately equal to the max heap plus the size of all indexes being handled by Solr. None of the columns in your Resource Monitor memory screenshot go over 400GB, even though that is roughly what I would expect based on what you said about the index size.

MMapDirectoryFactory is a decent choice, but Solr's default of NRTCachingDirectoryFactory is probably better. Switching to NRT will not help whatever is causing your performance problems, though.
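For reference, going back to the default is a one-line change; the stock setting looks like this (it falls through to NRTCachingDirectoryFactory when the solr.directoryFactory system property is unset):

```xml
<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/>
```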

Here are our schema file, solrconfig.xml, and GC log; please review them. Is 
anything wrong, or do you have suggestions for improvement?
https://drive.google.com/drive/folders/1wV9bdQ5-pP4s4yc8jrYNz77YYVRmT7FG

That GC log covers a grand total of three and a half minutes, so it's useless. Heap usage is nearly constant at about 30GB for the entire interval. Without a much more comprehensive log, I cannot offer any useful advice. I'm looking for logs that last several hours, and a few DAYS would be better.

Your caches are commented out, so that is not contributing to heap usage. Another reason to drop the heap size, maybe.

2019-06-06T11:55:53.456+0100: 1053797.556: Total time for which application threads were stopped: 42.4594545 seconds, Stopping threads took: 26.7301882 seconds

Part of the problem here is that stopping threads took 26 seconds. I have never seen anything that high before. It should only take a *small* fraction of a second to stop all threads. Something seems to be going very wrong here. One thing that it *might* be is something called "the four month bug", which is fixed by adding -XX:+PerfDisableSharedMem to the JVM options. Here's a link to the blog post about that problem:
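If you want to try that workaround, the flag can be appended to the JVM options Solr starts with. A sketch for solr.in.sh, assuming the stock bin/solr startup scripts:

```shell
# Sketch for solr.in.sh: add the "four month bug" workaround to Solr's
# GC tuning. GC_TUNE is the variable the stock start scripts pass to the JVM.
GC_TUNE="$GC_TUNE -XX:+PerfDisableSharedMem"
```

The flag disables the memory-mapped hsperfdata file, which also means tools like jps/jstat will no longer see the process, so be aware of that trade-off.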

https://www.evanjones.ca/jvm-mmap-pause.html

It's not clear whether the 42 seconds *includes* the 26 seconds, or whether there was 42 seconds of pause AFTER the threads were stopped. I would imagine that the larger number includes the smaller number. Might need to ask Oracle engineers. Pause times like this do not surprise me with a heap this big, but 26 seconds to stop threads sounds like a major issue, and I am not sure about what might be causing it. My guess about the four month bug above is a shot in the dark that might be completely wrong.
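Before eyeballing a multi-day log by hand, it can help to summarize the stopped-time lines. A minimal awk sketch, with two sample lines embedded via a here-doc; in practice you would feed it the real gc.log instead:

```shell
# Summarize "application threads were stopped" lines from a Java 8 GC log.
# Field 11 (whitespace-split) is the pause length in seconds on these lines.
# Sample input is embedded below; in practice run the awk program over gc.log.
awk '/Total time for which application threads were stopped/ {
       total += $11; if ($11 > max) max = $11; n++
     }
     END { printf "pauses=%d total=%.1fs max=%.1fs\n", n, total, max }' <<'EOF'
2019-06-06T11:55:53.456+0100: 1053797.556: Total time for which application threads were stopped: 42.4594545 seconds, Stopping threads took: 26.7301882 seconds
2019-06-06T11:56:10.001+0100: 1053814.101: Total time for which application threads were stopped: 0.0412345 seconds, Stopping threads took: 0.0001882 seconds
EOF
```

On this sample it prints `pauses=2 total=42.5s max=42.5s`; run it over a log covering days to see how often pauses this long actually occur.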

Thanks,
Shawn
