I agree that the 80GB heap is a possible problem. A GC is essentially a linear scan of memory, so more memory means a longer scan.
We run with an 8GB heap. I’d try that. Test it by replaying logs from production against a test instance. You can use JMeter and the Apache access log sampler.

https://jmeter.apache.org/usermanual/jmeter_accesslog_sampler_step_by_step.pdf

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/

On Sep 12, 2014, at 7:10 AM, Shawn Heisey <s...@elyograg.org> wrote:

> On 9/12/2014 7:36 AM, YouPeng Yang wrote:
>> We built our SolrCloud cluster using solr4.6.0 and jdk1.7.60. The
>> cluster contains 360G*3 data (one core with 2 replicas).
>> Our cluster becomes unstable: occasionally it runs a very long full GC.
>> This is awful; the full GC takes so long that SolrCloud considers the
>> node down.
>> Normally a full GC happens when the Old Generation reaches 70%, and that
>> is OK. In the awful condition, however, the percentage climbs well above
>> 70% and reaches 99%, so a long full GC happens and the node is
>> considered down.
>> We set the JVM parameters by referring to
>> https://wiki.apache.org/solr/ShawnHeisey#GC_Tuning; the only difference
>> is that we changed -Xms48009m -Xmx48009m to -Xms49152M -Xmx81920M.
>> The appendix[1] is the jstat output from when the awful full GC
>> happens. I marked the important part in red font, hoping it would be
>> helpful.
>> By the way, I have noticed that the Eden part of the Young Generation is
>> always at 100% while the awful condition lasts, which I think is an
>> important indication.
>> SolrCloud will be used to support our applications as a very
>> important part.
>> Would you please give me any suggestions? Do I need to change the JDK
>> version?
>
> My GC parameter page is getting around. :)
>
> Do you really need an 80GB heap? I realize that your index is 360GB ...
> but if you really do need a heap that large, you may need to adjust your
> configuration so you use a lot less heap memory.
>
> The red font you mentioned did not make it through, so I cannot tell
> what lines you highlighted.
>
> I pulled your jstat output into a spreadsheet and calculated the length
> of each GC. The longest GC in there took 1.903 seconds. It's the one
> that had a GCT of 4450.332. For an 80GB heap, you couldn't hope for
> anything better. Based on what I see here, I don't think GC is your
> problem. If I read the other numbers on that 1.903 second GC line
> correctly (not sure that I am), it dropped your Eden size from 100% to
> 0% ... suggesting that you really don't need an 80GB heap.
>
> How much RAM does this machine have? For ideal performance, you'll need
> your index size plus your heap size, which for you right now is 440 GB.
> Normally you don't need the ideal memory size ... but you do need a
> *significant* portion of it. I don't think I'd try running this index
> with less than 256GB of RAM, and that's assuming a much lower heap size
> than 80GB.
>
> Here's some general info about performance problems and possible solutions:
>
> http://wiki.apache.org/solr/SolrPerformanceProblems
>
> Thanks,
> Shawn
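
For anyone who wants to repeat the spreadsheet step Shawn describes without a spreadsheet, here is a minimal sketch in Python. It assumes the input looks like `jstat -gcutil` output with a header row containing a cumulative GCT column, and it treats each increase in GCT between consecutive samples as the time spent in GC during that interval. The function name gc_time_deltas and the SAMPLE rows are made up for illustration; they are not the numbers from the original post.

```python
def gc_time_deltas(jstat_text):
    """Return (row_index, seconds) for every sample where cumulative GCT grew.

    Assumes jstat -gcutil style text: one header row, then whitespace-separated
    samples. If the sampling interval is long, one delta may cover more than
    one collection, so treat each value as an upper bound on a single pause.
    """
    lines = [ln.split() for ln in jstat_text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    gct_col = header.index("GCT")  # cumulative total GC time, in seconds

    deltas = []
    prev = None
    for i, row in enumerate(rows):
        gct = float(row[gct_col])
        if prev is not None and gct > prev:
            deltas.append((i, round(gct - prev, 3)))
        prev = gct
    return deltas


# Hypothetical sample, NOT the data from this thread.
SAMPLE = """\
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
  0.00  12.50  40.00  70.10  59.90   100    10.000    2    1.000   11.000
  0.00  12.50 100.00  70.10  59.90   100    10.000    2    1.000   11.000
 15.00   0.00   0.00  70.30  59.90   101    10.250    2    1.000   11.250
 15.00   0.00  35.00  99.00  59.90   101    10.250    3    2.750   13.000
"""

if __name__ == "__main__":
    deltas = gc_time_deltas(SAMPLE)
    longest = max(deltas, key=lambda d: d[1])
    print("per-sample GC time:", deltas)
    print("longest:", longest)
```

Sorting or taking the max of the deltas points at the sample line worth reading closely, which is how the 1.903 second line above was singled out.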