I have to echo what others have said. An 80G heap is waaaaaaay outside the norm, especially when you consider the size of your indexes and the number of docs.
Understanding why you think you need that much heap should be your top priority. As has already been suggested, ensuring docValues are set for all fields that are used for sorting, faceting and grouping is a must. Deep paging can hurt too.

In addition I'd check the cache settings. Do you have a huge filterCache? What about the other caches? One common mistake is to have very high cache settings; in your setup I'd stick with 512 to start.

Without _data_ it's hard to say, so if none of those changes help, the next thing I'd do is take a heap dump or put a profiler on the JVM and see where the heap is actually allocated.

It's quite possible that you arrived at 80G with some mistaken assumptions, and once those are cleared up you can reduce your heap a lot. You say "through a lot of trial and error"; what exactly happens when you use, say, a 32G heap? OOMs? Slowdowns?

A heap that large is also starving your OS cache, which is where most of the Lucene index data is stored, see:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

Best,
Erick

On Thu, Oct 11, 2018 at 4:42 AM yasoobhaider <[email protected]> wrote:
>
> Hi Shawn, thanks for the inputs.
>
> I have uploaded the gc logs of one of the slaves here:
> https://ufile.io/ecvag (should work till 18th Oct '18)
>
> I uploaded the logs to gceasy as well and it says that the problem is
> consecutive full GCs. According to the solution they have mentioned,
> increasing the heap size is a solution. But I am already running on a pretty
> big heap, so don't think increasing the heap size is going to be a long term
> solution.
>
> From what I understood from a bit more looking around, this is Concurrent
> Mode Failure for CMS. I found an old blog mentioning the use of
> XX:CMSFullGCsBeforeCompaction=1 to make sure that compaction is done prior
> to next collection trigger. So if it is a fragmentation problem, this will
> solve it I hope.
>
> I will also try out using docValues as suggested by Ere on a couple of
> fields on which we make a lot of faceting queries to reduce memory usage on
> the slaves.
>
> Please share any ideas that you may have from the gc logs analysis
>
> Thanks
> Yasoob
>
>
>
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
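
P.S. To put rough numbers on the cache-size point above, here is a back-of-the-envelope sketch. It assumes the worst case, where every filterCache entry is held as a bitset of one bit per document (maxDoc/8 bytes); small result sets are stored more compactly, so real usage is usually lower. The document count is a made-up placeholder, not your actual index size, and the function name is just for illustration.

```python
import math


def filter_cache_worst_case_bytes(max_doc: int, cache_size: int) -> int:
    """Worst-case heap for a filterCache: one max_doc-bit bitset per entry."""
    bytes_per_entry = math.ceil(max_doc / 8)  # one bit per document
    return bytes_per_entry * cache_size


if __name__ == "__main__":
    max_doc = 100_000_000  # hypothetical core size, substitute your own maxDoc
    for size in (512, 16_384):  # 512 as suggested above vs. a "very high" setting
        gib = filter_cache_worst_case_bytes(max_doc, size) / 2**30
        print(f"filterCache size={size}: up to ~{gib:.1f} GiB on this core")
```

This is per core, and it only bounds the filterCache; a heap dump is still the way to see where the memory actually goes.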
