Hi Matthew, Erick! Thank you very much for the feedback; I'll try to convince them to reduce the heap size.
Current GC settings:

-XX:+CMSParallelRemarkEnabled -XX:+CMSScavengeBeforeRemark
-XX:+ParallelRefProcEnabled -XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=50 -XX:CMSMaxAbortablePrecleanTime=6000
-XX:ConcGCThreads=4 -XX:MaxTenuringThreshold=8 -XX:NewRatio=3
-XX:ParallelGCThreads=4 -XX:PretenureSizeThreshold=64m
-XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90

Kind regards,
Karol

On Tue, 6 Oct 2020 at 16:52, Erick Erickson <erickerick...@gmail.com> wrote:
>
> 12G is not that huge, it’s surprising that you’re seeing this problem.
>
> However, there are a couple of things to look at:
>
> 1> If you’re saying that you have 16G total physical memory and are
> allocating 12G to Solr, that’s an anti-pattern. See:
> https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
> If at all possible, you should allocate between 25% and 50% of your physical
> memory to Solr...
>
> 2> what garbage collector are you using? G1GC might be a better choice.
>
> > On Oct 6, 2020, at 10:44 AM, matthew sporleder <msporle...@gmail.com> wrote:
> >
> > Your index is so small that it should easily get cached into OS memory
> > as it is accessed. Having a too-big heap is a known problem
> > situation.
> >
> > https://cwiki.apache.org/confluence/display/SOLR/SolrPerformanceProblems#SolrPerformanceProblems-HowmuchheapspacedoIneed?
> >
> > On Tue, Oct 6, 2020 at 9:44 AM Karol Grzyb <grz...@gmail.com> wrote:
> >>
> >> Hi Matthew,
> >>
> >> Thank you for the answer. I cannot reproduce the setup locally, so I'll
> >> try to convince them to reduce Xmx; I guess they won't agree
> >> to 1GB, but to something less than 12G for sure.
> >> We also need a proper dev setup, because for now we can only test prod
> >> or stage, which are difficult to adjust.
> >>
> >> Is being stuck in GC common behaviour when the index is small compared
> >> to available heap during bigger load?
> >> I was more worried about the
> >> ratio of heap to total host memory.
> >>
> >> Regards,
> >> Karol
> >>
> >> On Tue, 6 Oct 2020 at 14:39, matthew sporleder <msporle...@gmail.com> wrote:
> >>>
> >>> You have a 12G heap for a 200MB index? Can you just try changing Xmx
> >>> to, like, 1g ?
> >>>
> >>> On Tue, Oct 6, 2020 at 7:43 AM Karol Grzyb <grz...@gmail.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> I'm involved in investigating an issue of huge GC overhead that
> >>>> happens during performance tests on Solr nodes. The Solr version is
> >>>> 6.1. The last tests were done on the staging env, and we ran into
> >>>> problems at fewer than 100 requests/second.
> >>>>
> >>>> The size of the index itself is ~200MB, ~50K docs.
> >>>> The index has small updates every 15 min.
> >>>>
> >>>> Queries involve sorting and faceting.
> >>>>
> >>>> I've gathered some heap dumps; I can see from them that most of the
> >>>> heap memory is retained by objects of the following classes:
> >>>>
> >>>> - org.apache.lucene.search.grouping.term.TermSecondPassGroupingCollector
> >>>>   (>4G, 91% of heap)
> >>>> - org.apache.lucene.search.grouping.AbstractSecondPassGroupingCollector$SearchGroupDocs
> >>>> - org.apache.lucene.search.FieldValueHitQueue$MultiComparatorsFieldValueHitQueue
> >>>> - org.apache.lucene.search.TopFieldCollector$SimpleFieldCollector
> >>>>   (>3.7G, 76% of heap)
> >>>>
> >>>> Based on the information above, is there anything generic that can be
> >>>> looked at as a source of potential improvement without diving deeply
> >>>> into the schema and queries (which may be very difficult to change at
> >>>> this moment)? I don't see docValues being enabled; if I read the docs
> >>>> correctly, they are specifically helpful when there is a lot of
> >>>> sorting/grouping/faceting, so could enabling them help?
> >>>>
> >>>> Additionally, I see that many threads are blocked on LRUCache.get;
> >>>> should I recommend switching to FastLRUCache?
> >>>>
> >>>> Also, I wonder whether -Xmx12288m for the Java heap is not too much
> >>>> for 16G of memory? I see some (~5/s) page faults in Dynatrace during
> >>>> the biggest traffic.
> >>>>
> >>>> Thank you very much for any help,
> >>>> Kind regards,
> >>>> Karol
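On the docValues and FastLRUCache questions in the quoted mail, a minimal sketch of what the two changes could look like. The field name `category` and the cache sizes are hypothetical placeholders, and enabling docValues on an existing field requires a full reindex:

```xml
<!-- schema.xml: hypothetical sort/facet field with docValues enabled,
     so sorting/grouping/faceting read column-oriented on-disk structures
     instead of building the FieldCache on the heap. -->
<field name="category" type="string" indexed="true" stored="false" docValues="true"/>

<!-- solrconfig.xml: FastLRUCache variant of the filter cache; unlike LRUCache,
     its get() path is lock-free. Sizes here are guesses to be tuned. -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="128"/>
```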
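For reference, Erick's two suggestions (smaller heap, G1GC) could be sketched as a solr.in.sh fragment. The 4g heap and the pause target are illustrative assumptions, not values tested on this workload; tune them against real GC logs:

```shell
# Hypothetical solr.in.sh fragment: shrink the heap and switch from CMS to G1GC.
# 4g is an assumption (25% of the 16G host, per the 25-50% guideline); adjust as needed.
SOLR_HEAP="4g"

# GC_TUNE replaces the default GC flags entirely; do not mix these with the
# existing CMS flag set, since CMS and G1 flags conflict at JVM startup.
GC_TUNE="-XX:+UseG1GC \
 -XX:+ParallelRefProcEnabled \
 -XX:MaxGCPauseMillis=250 \
 -XX:+PerfDisableSharedMem"
```

Note that the ~5/s page faults mentioned above are consistent with the 12G heap crowding out the OS page cache, which is exactly what the MMapDirectory article warns about.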