Ciao Shawn, thanks for your reply.

> The oom script just kills Solr with the KILL signal (-9) and logs the kill.

I know. But my feeling is that not even this "happens", i.e. the script is not
being executed. At least I see no solr_oom_killer-$SOLR_PORT-$NOW.log file ...

Btw: who re-starts Solr after it has been killed?

> FYI, the stacktrace on the OOM error, especially in a multi-threaded app like
> Solr, will frequently be completely useless in tracking down the problem.

I agree.

> I don't know if a heap dump on OOM is compatible with the OOM script. If Java
> chooses to run the OOM script before the heap dump is done, the process will
> be killed before the heap finishes dumping.

What if I did a jmap call in the oom script before killing the process?
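Something along these lines, i.e. a variant of the script that dumps the heap
before the kill? Just a rough, untested sketch: I'm assuming the port and the
log directory are the two arguments the script is invoked with (as in the
command line quoted below), that jmap from the JDK is on the PATH, and that
there is enough free disk space for a dump roughly the size of the 16G heap:

#!/bin/bash
# sketch: oom_solr.sh variant that dumps the heap before killing Solr
SOLR_PORT=$1
SOLR_LOGS_DIR=$2
NOW=$(date +"%F_%H-%M-%S")
# find the Solr/Jetty process listening on the given port
SOLR_PID=$(ps auxww | grep start.jar | grep "jetty.port=$SOLR_PORT" | grep -v grep | awk '{print $2}')
if [ -z "$SOLR_PID" ]; then
  echo "No Solr process found on port $SOLR_PORT" >> "$SOLR_LOGS_DIR/solr_oom_killer-$SOLR_PORT-$NOW.log"
  exit 1
fi
# dump the heap first; with a 16G heap this can take quite a while
jmap -dump:format=b,file="$SOLR_LOGS_DIR/solr_oom_heap-$SOLR_PORT-$NOW.hprof" "$SOLR_PID"
# then do what the stock script does: kill hard and log the kill
kill -9 "$SOLR_PID"
echo "Killed Solr process $SOLR_PID (port $SOLR_PORT) after OOM" >> "$SOLR_LOGS_DIR/solr_oom_killer-$SOLR_PORT-$NOW.log"

I'm not sure, though, whether jmap can still attach and dump reliably once the
JVM is already out of heap space.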
-Clemens

-----Original Message-----
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Wednesday, June 3, 2015 09:16
To: solr-user@lucene.apache.org
Subject: Re: Solr OutOfMemory but no heap and dump and oo_solr.sh is not triggered

On 6/3/2015 12:20 AM, Clemens Wyss DEV wrote:
> Context: Lucene 5.1, Java 8 on Debian. 24G of RAM, of which 16G is available
> for Solr.
>
> I am seeing the following OOMs:
> ERROR - 2015-06-03 05:17:13.317; [ customer-1-de_CH_1]
> org.apache.solr.common.SolrException; null:java.lang.RuntimeException:
> java.lang.OutOfMemoryError: Java heap space
<snip>
> Caused by: java.lang.OutOfMemoryError: Java heap space
> WARN - 2015-06-03 05:17:13.319; [ customer-1-de_CH_1]
> org.eclipse.jetty.servlet.ServletHandler; Error for
> /solr/customer-1-de_CH_1/suggest_phrase
> java.lang.OutOfMemoryError: Java heap space
>
> The full command line is
> /usr/local/java/bin/java -server -Xss256k -Xms16G -Xmx16G
> -XX:NewRatio=3 -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90
> -XX:MaxTenuringThreshold=8 -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
> -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4
> -XX:+CMSScavengeBeforeRemark -XX:PretenureSizeThreshold=64m
> -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSInitiatingOccupancyFraction=50
> -XX:CMSMaxAbortablePrecleanTime=6000 -XX:+CMSParallelRemarkEnabled
> -XX:+ParallelRefProcEnabled -verbose:gc -XX:+PrintHeapAtGC
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime
> -Xloggc:/opt/solr/logs/solr_gc.log -Djetty.port=8983 -DSTOP.PORT=7983
> -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> -Dsolr.solr.home=/opt/solr/data -Dsolr.install.dir=/usr/local/solr
> -Dlog4j.configuration=file:/opt/solr/log4j.properties
> -jar start.jar -XX:OnOutOfMemoryError=/usr/local/solr/bin/oom_solr.sh
> 8983 /opt/solr/logs OPTIONS=default,rewrite
>
> So I'd expect /usr/local/solr/bin/oom_solr.sh to be triggered, but this does
> not seem to "happen". What am I missing? Is it ok to pull a heap dump from
> Solr before killing/rebooting in oom_solr.sh?
>
> Also I would like to know what query parameters were sent to
> /solr/customer-1-de_CH_1/suggest_phrase (which may be the reason for the
> OOM) ...

The oom script just kills Solr with the KILL signal (-9) and logs the kill.
That's it. It does not attempt to make a heap dump. If you *want* to dump the
heap on OOM, you can, with some additional options:

http://stackoverflow.com/questions/542979/using-heapdumponoutofmemoryerror-parameter-for-heap-dump-for-jboss/20496376#20496376
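In short, that boils down to something like the following two options. The
dump file will be roughly the size of the heap, so point HeapDumpPath (here
just using your existing log directory as an example) at a filesystem with
enough free space:

-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/opt/solr/logs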
I don't know if a heap dump on OOM is compatible with the OOM script. If Java
chooses to run the OOM script before the heap dump is done, the process will
be killed before the heap finishes dumping.

FYI, the stacktrace on the OOM error, especially in a multi-threaded app like
Solr, will frequently be completely useless in tracking down the problem. The
thread that makes the triggering memory allocation may be completely
unrelated. This error happened on a suggest handler ... but the large memory
allocations may be happening in a completely different part of the code.

We have not had any recent indications of a memory leak in Solr. Memory leaks
in Solr *do* happen, but they are usually caught by the tests, which run in a
minimal memory space. The project has continuous integration servers set up
that run all the tests many times per day.

If you are running out of heap with 16GB allocated, then either your Solr
installation is enormous or you've got a configuration that's not tuned
properly. With a very large Solr installation, you may need to simply allocate
more memory to the heap ... which may mean that you'll need to install more
memory in the server. The alternative would be figuring out where you can
change your configuration to reduce memory requirements. Here's some
incomplete info on settings and situations that can require a very large heap:

https://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

To provide much help, we'll need lots of details about your system ... the
number of documents in all cores, total index size on disk, your config,
possibly your schema, and maybe a few other things I haven't thought of yet.

Thanks,
Shawn