On 1/25/2013 4:49 AM, Harish Verma wrote:
We are testing Solr 4.1 running inside Tomcat 7 on Java 7 with the following
options:

JAVA_OPTS="-Xms256m -Xmx2048m -XX:MaxPermSize=1024m -XX:+UseConcMarkSweepGC
-XX:+CMSIncrementalMode -XX:+ParallelRefProcEnabled
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/home/ubuntu/OOM_HeapDump"

Our source code looks like the following:
/**** START *****/
int noOfSolrDocumentsInBatch = 0;
for (int i = 0; i < 5000; i++) {
    SolrInputDocument solrInputDocument = getNextSolrInputDocument();
    server.add(solrInputDocument);
    noOfSolrDocumentsInBatch += 1;
    if (noOfSolrDocumentsInBatch == 10) {
        server.commit();
        noOfSolrDocumentsInBatch = 0;
    }
}
/**** END *****/
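[Editor's note: a common way to reduce the overhead of a loop like the one above is to accumulate documents and send them in larger batches, committing once at the end (or relying on autoCommit) instead of calling commit() every 10 documents. The sketch below shows only the batching logic; the batch size of 500 is an illustrative assumption, and the commented-out server calls stand in for SolrJ's add(Collection) and commit().]

```java
import java.util.ArrayList;
import java.util.List;

// Sketch (not the original code): accumulate documents and send them in
// larger batches, committing once at the end instead of every 10 documents.
// The batch size of 500 is an illustrative assumption.
public class BatchingSketch {

    // Number of batches needed to cover `total` items at `batchSize` per batch.
    static int numBatches(int total, int batchSize) {
        return (total + batchSize - 1) / batchSize;
    }

    public static void main(String[] args) {
        int total = 5000;
        int batchSize = 500;
        List<Object> batch = new ArrayList<>();
        int batchesSent = 0;
        for (int i = 0; i < total; i++) {
            batch.add(new Object());      // stands in for getNextSolrInputDocument()
            if (batch.size() == batchSize) {
                // server.add(batch);     // SolrJ supports add(Collection<SolrInputDocument>)
                batch.clear();
                batchesSent++;
            }
        }
        if (!batch.isEmpty()) {
            // server.add(batch);         // flush any partial final batch
            batchesSent++;
        }
        // server.commit();               // one commit, or rely on autoCommit
        System.out.println(batchesSent);  // 10
    }
}
```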

The method "getNextSolrInputDocument()" generates a Solr document with 100
fields on average. Around 50 of the fields are of type "text_general".
Some of the "text_general" fields contain approximately 1000 words; the rest
contain only a few. Of the total fields, around 35-40 are multivalued
(not of type "text_general").
We are indexing all of the fields but storing only 8 of them. Of these 8
fields, two are string type, five are long, and one is boolean. So our index
size is only 394 MB, but the RAM occupied at the time of the OOM is around 2.5 GB.
Why is memory usage so high even though the index size is small?
What is being stored in memory? Our understanding is that after every
commit, documents are flushed to disk, so nothing should remain in RAM
after a commit.

We are using the following settings:

server.commit() is called with waitFlush=true and waitSearcher=true
solrconfig.xml has the following properties set:
directoryFactory = solr.MMapDirectoryFactory
maxWarmingSearchers = 1
The text_general field type is used as supplied in the schema.xml that
ships with Solr.
maxIndexingThreads = 8 (default)
<autoCommit><maxTime>15000</maxTime><openSearcher>false</openSearcher></autoCommit>
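[Editor's note: for reference, a common solrconfig.xml pattern in Solr 4.x pairs a hard autoCommit with openSearcher=false, as above, with a soft commit for search visibility, which removes the need for per-batch commit() calls from the client. The soft-commit interval below is an illustrative value, not a recommendation from this thread.]

```xml
<!-- Hard commit: flush to disk without opening a searcher (as in the settings above) -->
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
<!-- Soft commit: make new documents visible to searches; interval is illustrative -->
<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>
```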

We get a Java heap Out Of Memory error after committing around 3990 Solr
documents. Some snapshots of the memory dump from the profiler are uploaded
at the following links:
http://s9.postimage.org/w7589t9e7/memorydump1.png
http://s7.postimage.org/p3abs6nuj/memorydump2.png

Can somebody please suggest what we should do to minimize/optimize memory
consumption in our case, with reasons?
Also, what would be optimal values (and why) for the following
solrconfig.xml parameters?
useColdSearcher - true/false?
     maxWarmingSearchers - number?
     spellcheck - on/off?
     omitNorms - true/false?
     omitTermFreqAndPositions?
     mergeFactor? (we are using the default value of 10)
     Java garbage collection tuning parameters?

Additional information is needed. What OS platform? Is the OS 64-bit? Is Java 64-bit? How much total RAM? We'll need your solrconfig.xml file, in particular the query and indexConfig sections. Use your favorite paste site (pastie.org, pastebin.com for example) to link the solrconfig.xml file.

General thoughts without the above information:

You are allowing a permanent generation half the size of your max heap. I have a Solr installation where Java has a max heap of 8GB, about 5GB of which is currently committed - actually allocated at the OS level. My perm gen space is 65908KB. This server handles a total index size of nearly 70GB. I doubt you need 1GB for your perm gen size.

A 2GB heap is fairly small in the Solr world. If you are using a 32-bit Java, that's about the biggest heap you can create, so 64-bit on both Java and the OS is the way to go. You can reduce memory requirements a small amount by using Jetty instead of Tomcat, but the difference is probably not big enough to really matter.

For the questions you asked at the end, most of them are personal preference, but maxWarmingSearchers should normally be kept low. I think I have a value of 2 in my config. Here are the GC tuning parameters that I am currently testing. I have been having problems with long GC pauses that I am trying to fix:

-Xms1024M
-Xmx8192M
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:NewRatio=3
-XX:MaxTenuringThreshold=8
-XX:+CMSParallelRemarkEnabled

You should only use CMSIncrementalMode if you have just one or two processor cores. My reading has suggested that it is not beneficial when you have more.

So far my GC parameters seem to be working really well, but I need to do a full reindex which should force usage of the entire 8GB heap and push garbage collection to its limits.

I have a question of my own for someone familiar with the code. Does Solr extensively use weak references? If so, ParallelRefProcEnabled might be a win.
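[Editor's note: for context on that question, the following is a minimal illustration of weak-reference behavior - not Solr code. Each Reference object must be processed by the collector during GC, and -XX:+ParallelRefProcEnabled parallelizes that phase, which is why heavy use of weak references would make the flag relevant.]

```java
import java.lang.ref.WeakReference;

// Minimal illustration (not Solr code): a weakly referenced object becomes
// collectible once no strong references remain, and the GC must process
// each Reference object during collection.
public class WeakRefDemo {
    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> ref = new WeakReference<>(strong);

        // While a strong reference exists, get() returns the object.
        System.out.println(ref.get() == strong);  // true

        strong = null;  // drop the strong reference
        System.gc();    // a hint only; after an actual collection,
                        // ref.get() may return null
        System.out.println(ref.get());  // often null after GC, but not guaranteed
    }
}
```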

Thanks,
Shawn
