Hi, out of curiosity, why did you set ramBufferSizeMB to 6?
Ahmet

On Friday, April 4, 2014 3:27 AM, Candygram For Mongo <candygram.for.mo...@gmail.com> wrote:

*Main issue: Full Indexing is Causing a Java Heap Out of Memory Exception*

*SOLR/Lucene version:* 4.2.1

*JVM version:*
Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)

*Indexer startup command:*

set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx6144m -Xms6144m

java %JVMARGS% ^
  -Dcom.sun.management.jmxremote.port=1092 ^
  -Dcom.sun.management.jmxremote.ssl=false ^
  -Dcom.sun.management.jmxremote.authenticate=false ^
  -jar start.jar

*SOLR indexing HTTP request parameters:*

webapp=/solr path=/dataimport params={clean=false&command=full-import&wt=javabin&version=2}

We are getting a Java heap OOM exception when indexing (updating) 27 million records. If we increase the Java heap memory settings the problem goes away, but we believe the underlying problem has not been fixed and that we will eventually hit the same OOM exception. Other processes on the server also require resources, so we cannot keep increasing the memory settings to resolve the OOM issue. We are trying to find a way to configure the SOLR instance that reduces, or preferably eliminates, the possibility of an OOM exception.

We can reproduce the problem on a test machine. There we set the Java heap size to 64MB to accelerate the exception; if we increase this setting, the same problem occurs, just hours later. In the test environment we use the following parameters:

JVMARGS=-XX:MaxPermSize=64m -Xss256K -Xmx64m -Xms64m

Normally we use the default solrconfig.xml file with only the following jar file references added:

<lib path="../../../../default/lib/common.jar" />
<lib path="../../../../default/lib/webapp.jar" />
<lib path="../../../../default/lib/commons-pool-1.4.jar" />

With these values, trying to index 6 million records from the database throws the Java heap Out of Memory exception very quickly. We were able to complete a successful indexing run by further modifying solrconfig.xml and removing all, or all but one, of the <copyField> tags from the schema.xml file. The following solrconfig.xml values were modified:

<ramBufferSizeMB>6</ramBufferSizeMB>

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">2</int>
  <int name="maxMergeAtOnceExplicit">2</int>
  <int name="segmentsPerTier">10</int>
  <int name="maxMergedSegmentMB">150</int>
</mergePolicy>

<autoCommit>
  <maxDocs>15000</maxDocs> <!-- This tag was maxTime before we changed it -->
  <openSearcher>false</openSearcher>
</autoCommit>

With our customized schema.xml file and two or more <copyField> tags, the OOM exception is always thrown. Based on the errors, the problem occurs while the process is trying to do the merge.
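Before getting to the error itself, one aside on the <autoCommit> block quoted above: its comment notes that we changed it from <maxTime> to <maxDocs>. For reference, the time-based form of the same block would look like the sketch below; the 15000 ms interval is an illustrative value, not something we actually ran:

<autoCommit>
  <!-- illustrative 15-second interval, not a value from our setup -->
  <maxTime>15000</maxTime>
  <!-- hard-commit without opening a new searcher, same as the maxDocs variant above -->
  <openSearcher>false</openSearcher>
</autoCommit>

Either form bounds how much uncommitted work accumulates between hard commits; as the test results further down show, changing these values only postponed the crash for us.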
The error is provided below:

Exception in thread "Lucene Merge Thread #156" org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:541)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:514)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:180)
        at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:146)
        at org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)
        at org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:259)
        at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:233)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:137)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3693)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3296)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:401)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:478)

Mar 12, 2014 12:17:40 AM org.apache.solr.common.SolrException log
SEVERE: auto commit error...:java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
        at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:3971)
        at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2744)
        at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2827)
        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2807)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:536)
        at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

We think, but are not 100% sure, that the problem is related to the merge.

Normally our schema.xml contains a lot of field specifications like the ones in the fragment below:

<copyField source="ADDRESS.RECORD_ID_abc" dest="ADDRESS.RECORD_ID.case_abc" />
<copyField source="ADDRESS.RECORD_ID_abc" dest="ADDRESS.RECORD_ID.case.soundex_abc" />
<copyField source="ADDRESS.RECORD_ID_abc" dest="ADDRESS.RECORD_ID.case_nvl_abc" />

In tests using the default schema.xml file and no <copyField> tags, indexing completed successfully; 6 million records produced a 900 MB data directory. When I included just one <copyField> tag, indexing also completed successfully; 6 million records produced a 990 MB data directory (90 MB bigger). When I included just two <copyField> tags, indexing crashed with an OOM exception. Changing parameters like maxMergedSegmentMB or maxDocs only postponed the crash.
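One detail worth calling out: the "Caused by" frames above show the merge dying in SegmentMerger.mergeNorms while loading norms, and Lucene keeps roughly one norm byte per document for every indexed field that carries norms, so each <copyField> destination multiplies that cost across all 27 million documents. A minimal schema.xml sketch of one way to take norms out of the equation for a copy target is below; the field name comes from the fragment above, but the "text_general" type and the indexed/stored flags are placeholder assumptions about our schema, and dropping norms also gives up length normalization in scoring for that field:

<!-- Hypothetical destination-field declaration (type and flags are assumptions).
     omitNorms="true" drops the per-document norm byte for this field, so a merge
     no longer loads a norms array sized to the whole index for this copy target. -->
<field name="ADDRESS.RECORD_ID.case_abc" type="text_general"
       indexed="true" stored="false" omitNorms="true" />

<copyField source="ADDRESS.RECORD_ID_abc" dest="ADDRESS.RECORD_ID.case_abc" />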
The net of our test results is as follows:

*solrconfig.xml*                    *schema.xml*                        *result*
default plus only jar references    default (no copyField tags)         success
default plus only jar references    modified with one copyField tag     success
default plus only jar references    modified with two copyField tags    crash
additional modified settings        default (no copyField tags)         success
additional modified settings        modified with one copyField tag     success
additional modified settings        modified with two copyField tags    crash

Our question is: what can we do to eliminate these OOM exceptions?