Hi,

I had a similar problem before. We were trying to do the same thing as you, fetching
a very large number of small documents from Oracle with DIH. We were getting:

Caused by: java.sql.SQLException: ORA-01652: unable to extend temp segment by 128 in tablespace TS_TEMP
ORA-06512: at "IZCI.GET_FEED_KEYWORDS", line 20
        at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:450)
        at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:399)
        at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:837)
        at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:459)
        at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:193)
        at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531)
        at oracle.jdbc.driver.T4CStatement.doOall8(T4CStatement.java:197)
        at oracle.jdbc.driver.T4CStatement.fetch(T4CStatement.java:1348)
        at oracle.jdbc.driver.OracleResultSetImpl.close_or_fetch_from_next(OracleResultSetImpl.java:635)
        at oracle.jdbc.driver.OracleResultSetImpl.next(OracleResultSetImpl.java:514)
        at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.hasnext(JdbcDataSource.java:334)
        ... 12 more


Our DB admins explained the cause, but I don't remember the details. In the end we
sliced our SQL statement and ran smaller imports with clean=false.
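
A rough sketch of what I mean, applied to your table (the id column and the
startId/endId request parameters are only placeholders, not from your actual
config). The entity query takes a range from the request, so each run indexes
one slice:

<entity name="full-index" query="
    select ...same columns as before...
    from ORCL.ADDRESS_ACCT_ALL
    where RECORD_ID between ${dataimporter.request.startId}
                        and ${dataimporter.request.endId}
">
    <!-- same <field ... /> mappings as before -->
</entity>

Each slice is then triggered separately, something like:

/solr/dataimport?command=full-import&clean=false&startId=0&endId=1000000
/solr/dataimport?command=full-import&clean=false&startId=1000000&endId=2000000

clean=false keeps the documents indexed by the previous slices.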


Ahmet


On Monday, April 7, 2014 11:00 PM, Candygram For Mongo <> wrote:
I wanted to take a moment to say thank you for your help.  We haven't
solved the problem yet, but it seems like we may be on the right path.

Responses to your questions below:

1) We are using settings of 6GB for -Xmx and -Xms on a production server
where this process is failing on about 30 million relatively small records.
We need to execute the same processes on much larger data sets (10x or
more).  The memory requirement seems to grow roughly linearly with the data
size, which is not sustainable.

2) We do not use the MDSolrDIHTransformer.jar.  That jar is some legacy
code that is commented out.  We are using the following jars:
common.jar, webapp.jar, commons-pool-1.4.jar.
The first two contain our custom code, which includes filters.  The last one
is from Apache.

3) We have Solr configured to switch the directory implementation it uses
based on the environment.  Looking at the INFOSTREAM.txt file, it is using
MMapDirectory in the environment in question.
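
For reference, one way to express that kind of environment-based switch in
solrconfig.xml is a system-property default, along the lines of the sketch
below (the property name is illustrative, not necessarily the one our startup
scripts set):

<!-- Falls back to MMapDirectoryFactory unless -Dsolr.directoryFactory=... overrides it -->
<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:solr.MMapDirectoryFactory}"/>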

4) Increasing the batchSize to 5,000 or 10,000 accelerates the OOM error
(using the 64MB heap size) and it is not able to execute the query.  See
the error below:



java.sql.SQLException: Protocol violation: [2]
        at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:527)
        at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227)
        at oracle.jdbc.driver.T4C7Ocommoncall.doOLOGOFF(T4C7Ocommoncall.java:61)
        at oracle.jdbc.driver.T4CConnection.logoff(T4CConnection.java:574)
        at oracle.jdbc.driver.PhysicalConnection.close(PhysicalConnection.java:4011)
        at org.apache.solr.handler.dataimport.JdbcDataSource.closeConnection(JdbcDataSource.java:410)
        at org.apache.solr.handler.dataimport.JdbcDataSource.close(JdbcDataSource.java:395)
        at org.apache.solr.handler.dataimport.DocBuilder.closeEntityProcessorWrappers(DocBuilder.java:284)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:273)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:422)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:487)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:468)

Apr 07, 2014 11:11:54 AM org.apache.solr.common.SolrException log
SEVERE: Full Import failed:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.solr.handler.dataimport.Data
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:266)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:422)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:487)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:468)
Caused by: java.lang.RuntimeException: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.OutOfMemor
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:406)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:319)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:227)
        ... 3 more
Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.OutOfMemoryError: Java heap space
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:535)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)



We also suspect that the copyField definitions may be the culprit.  We are
trying the CSV process now.
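
For the CSV attempt we are roughly following the UpdateCSV wiki page you
linked; something like the command below, where the core URL and the file
name are placeholders:

curl 'http://localhost:8983/solr/update/csv?commit=true' \
     --data-binary @address_acct_all.csv \
     -H 'Content-type:text/plain; charset=utf-8'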





On Sat, Apr 5, 2014 at 3:16 AM, Ahmet Arslan <iori...@yahoo.com> wrote:

> Hi,
>
> Now we have a more informative error:
> org.apache.solr.handler.dataimport.DataImportHandlerException:
> java.lang.OutOfMemoryError: Java heap space
>
> Caused by: org.apache.solr.handler.dataimport.DataImportHandlerException:
> java.lang.OutOfMemoryError: Java heap space
>         at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:535)
>         at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)
>
> 1) Does this happen when you increase -Xmx64m -Xms64m?
>
> 2) I see you use custom jars ("MDSolrDIHTransformer JARs inside"), but I don't
> see any Transformers used in database.xml. Why is that? I would remove them
> just to be sure.
>
> 3) I see you have org.apache.solr.core.StandardDirectoryFactory declared
> in solrconfig.xml. Assuming you are using 64-bit Windows, it is recommended
> to use MMapDirectory:
> http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>
>
> 4) In your previous mail you had batchSize set, but now there is no
> batchSize defined in database.xml. For MySQL it is recommended to use -1.
> I am not sure about Oracle; I personally used 10,000 once for Oracle.
> http://wiki.apache.org/solr/DataImportHandlerFaq#I.27m_using_DataImportHandler_with_a_MySQL_database._My_table_is_huge_and_DataImportHandler_is_going_out_of_memory._Why_does_DataImportHandler_bring_everything_to_memory.3F
>
> You have a lot of copyFields defined. There can be some gotchas when handling
> an unusually large number of copyFields. I would really try the CSV option
> here, given that you only have a full-import SQL query defined, it is not a
> complex one, and it queries only one table. I believe Oracle has some tool to
> export a table to a CSV file efficiently.
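>
> For example, SQL*Plus can spool a query straight to a delimited file. A rough
> sketch (the spool file name and formatting settings are illustrative; values
> containing commas would still need quoting):
>
> set heading off
> set feedback off
> set pagesize 0
> set colsep ','
> spool address_acct_all.csv
> select RECORD_ID, ADDR_TYPE_CD, LONGITUDE, LATITUDE, ADDR_NAME, CITY,
>        STATE, EMAIL_ADDR from ORCL.ADDRESS_ACCT_ALL;
> spool off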
>
> On Saturday, April 5, 2014 3:05 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
>
> Does this user list allow attachments?  I have four files attached
> (database.xml, error.txt, schema.xml, solrconfig.xml).  We just ran the
> process again using the parameters you suggested, but not to a csv file.
>  It errored out quickly.  We are working on the csv file run.
>
> Removed both <autoCommit> and <autoSoftCommit> parts/definitions from
> solrconfig.xml
>
> Disabled tlog by removing
>    <updateLog>
>       <str name="dir">${solr.ulog.dir:}</str>
>    </updateLog>
> from solrconfig.xml
>
> Used the commit=true parameter:
> ?commit=true&command=full-import
>
>
>
>
> On Fri, Apr 4, 2014 at 3:29 PM, Ahmet Arslan <iori...@yahoo.com> wrote:
>
> Hi,
> >
> >This may not solve your problem, but generally it is recommended to disable
> >auto commit and transaction logs for bulk indexing and to issue one commit
> >at the very end. Do you have tlogs enabled? I see "commit failed" in the
> >error message; that's why I am suggesting this.
> >
> >Regarding comma separated values: with this approach you focus on just the
> >Solr import process and separate out the data acquisition phase. It is also
> >a very fast way to load even big CSV files:
> >http://wiki.apache.org/solr/UpdateCSV
> >I have never experienced OOM during indexing, so I suspect data acquisition
> >plays a role in it.
> >
> >Ahmet
> >
> >
> >On Saturday, April 5, 2014 1:18 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
> >
> >We would be happy to try that.  That sounds counterintuitive given the high
> >volume of records we have.  Can you help me understand how that might solve
> >our problem?
> >
> >
> >
> >
> >On Fri, Apr 4, 2014 at 2:34 PM, Ahmet Arslan <iori...@yahoo.com> wrote:
> >
> >Hi,
> >>
> >>Can you remove auto commit for bulk import. Commit at the very end?
> >>
> >>Ahmet
> >>
> >>
> >>
> >>
> >>On Saturday, April 5, 2014 12:16 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
> >>In case the attached database.xml file didn't show up, I have pasted the
> >>contents below:
> >>
> >><dataConfig>
> >><dataSource
> >>name="org_only"
> >>type="JdbcDataSource"
> >>driver="oracle.jdbc.OracleDriver"
> >>url="jdbc:oracle:thin:@test2.abc.com:1521:ORCL"
> >>user="admin"
> >>password="admin"
> >>readOnly="false"
> >>batchSize="100"
> >>/>
> >><document>
> >>
> >>
> >><entity name="full-index" query="
> >>select
> >>
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.RECORD_ID as varchar2(100)), 'null')
> >>as SOLR_ID,
> >>
> >>'ORCL.ADDRESS_ACCT_ALL'
> >>as SOLR_CATEGORY,
> >>
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.RECORD_ID as varchar2(255)), ' ') as
> >>ADDRESSALLROWID,
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.ADDR_TYPE_CD as varchar2(255)), ' ') as
> >>ADDRESSALLADDRTYPECD,
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.LONGITUDE as varchar2(255)), ' ') as
> >>ADDRESSALLLONGITUDE,
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.LATITUDE as varchar2(255)), ' ') as
> >>ADDRESSALLLATITUDE,
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.ADDR_NAME as varchar2(255)), ' ') as
> >>ADDRESSALLADDRNAME,
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.CITY as varchar2(255)), ' ') as
> >>ADDRESSALLCITY,
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.STATE as varchar2(255)), ' ') as
> >>ADDRESSALLSTATE,
> >>NVL(cast(ORCL.ADDRESS_ACCT_ALL.EMAIL_ADDR as varchar2(255)), ' ') as
> >>ADDRESSALLEMAILADDR
> >>
> >>from ORCL.ADDRESS_ACCT_ALL
> >>" >
> >>
> >><field column="SOLR_ID" name="id" />
> >><field column="SOLR_CATEGORY" name="category" />
> >><field column="ADDRESSALLROWID" name="ADDRESS_ACCT_ALL.RECORD_ID_abc" />
> >><field column="ADDRESSALLADDRTYPECD"
> >>name="ADDRESS_ACCT_ALL.ADDR_TYPE_CD_abc" />
> >><field column="ADDRESSALLLONGITUDE"
> name="ADDRESS_ACCT_ALL.LONGITUDE_abc" />
> >><field column="ADDRESSALLLATITUDE" name="ADDRESS_ACCT_ALL.LATITUDE_abc"
> />
> >><field column="ADDRESSALLADDRNAME" name="ADDRESS_ACCT_ALL.ADDR_NAME_abc"
> />
> >><field column="ADDRESSALLCITY" name="ADDRESS_ACCT_ALL.CITY_abc" />
> >><field column="ADDRESSALLSTATE" name="ADDRESS_ACCT_ALL.STATE_abc" />
> >><field column="ADDRESSALLEMAILADDR"
> name="ADDRESS_ACCT_ALL.EMAIL_ADDR_abc"
> >>/>
> >>
> >></entity>
> >>
> >>
> >>
> >><!-- Variables -->
> >><!-- '${dataimporter.last_index_time}' -->
> >></document>
> >></dataConfig>
> >>
> >>
> >>
> >>
> >>
> >>
> >>On Fri, Apr 4, 2014 at 11:55 AM, Candygram For Mongo <
> >>candygram.for.mo...@gmail.com> wrote:
> >>
> >>> In this case we are indexing an Oracle database.
> >>>
> >>> We do not include the data-config.xml in our distribution.  We store the
> >>> database information in the database.xml file.  I have attached the
> >>> database.xml file.
> >>>
> >>> When we use the default merge policy settings, we get the same results.
> >>>
> >>>
> >>>
> >>> We have not tried to dump the table to a comma separated file.  We think
> >>> that dumping a table of this size to disk will introduce other memory
> >>> problems with big file management.  We have not tested that case.
> >>>
> >>>
> >>> On Fri, Apr 4, 2014 at 7:25 AM, Ahmet Arslan <iori...@yahoo.com>
> wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Which database are you using? Can you send us data-config.xml?
> >>>>
> >>>> What happens when you use default merge policy settings?
> >>>>
> >>>> What happens when you dump your table to a comma separated file and feed
> >>>> that file to Solr?
> >>>>
> >>>> Ahmet
> >>>>
> >>>> On Friday, April 4, 2014 5:10 PM, Candygram For Mongo <
> >>>> candygram.for.mo...@gmail.com> wrote:
> >>>>
> >>>> The ramBufferSizeMB was set to 6MB only on the test system to make the
> >>>> system crash sooner.  In production that tag is commented out, which
> >>>> I believe causes the default value to be used.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Apr 3, 2014 at 5:46 PM, Ahmet Arslan <iori...@yahoo.com>
> wrote:
> >>>>
> >>>> Hi,
> >>>> >
> >>>> >out of curiosity, why did you set ramBufferSizeMB to 6?
> >>>> >
> >>>> >Ahmet
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> >On Friday, April 4, 2014 3:27 AM, Candygram For Mongo <
> >>>> candygram.for.mo...@gmail.com> wrote:
> >>>> >*Main issue: Full Indexing is Causing a Java Heap Out of Memory
> Exception
> >>>> >
> >>>> >SOLR/Lucene version: 4.2.1
> >>>> >
> >>>> >
> >>>> >JVM version:
> >>>> >
> >>>> >Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
> >>>> >
> >>>> >Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
> >>>> >
> >>>> >
> >>>> >
> >>>> >Indexer startup command:
> >>>> >
> >>>> >set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx6144m -Xms6144m
> >>>> >
> >>>> >
> >>>> >
> >>>> >java " %JVMARGS% ^
> >>>> >
> >>>> >-Dcom.sun.management.jmxremote.port=1092 ^
> >>>> >
> >>>> >-Dcom.sun.management.jmxremote.ssl=false ^
> >>>> >
> >>>> >-Dcom.sun.management.jmxremote.authenticate=false ^
> >>>> >
> >>>> >-jar start.jar
> >>>> >
> >>>> >
> >>>> >
> >>>> >SOLR indexing HTTP parameters request:
> >>>> >
> >>>> >webapp=/solr path=/dataimport
> >>>> >params={clean=false&command=full-import&wt=javabin&version=2}
> >>>> >
> >>>> >
> >>>> >
> >>>> >We are getting a Java heap OOM exception when indexing (updating) 27
> >>>> >million records.  If we increase the Java heap memory settings the
> >>>> problem
> >>>> >goes away but we believe the problem has not been fixed and that we
> will
> >>>> >eventually get the same OOM exception.  We have other processes on
> the
> >>>> >server that also require resources so we cannot continually increase
> the
> >>>> >memory settings to resolve the OOM issue.  We are trying to find a
> way to
> >>>> >configure the SOLR instance to reduce or preferably eliminate the
> >>>> >possibility of an OOM exception.
> >>>> >
> >>>> >
> >>>> >
> >>>> >We can reproduce the problem on a test machine.  We set the Java heap
> >>>> >memory size to 64MB to accelerate the exception.  If we increase this
> >>>> >setting the same problems occurs, just hours later.  In the test
> >>>> >environment, we are using the following parameters:
> >>>> >
> >>>> >
> >>>> >
> >>>> >JVMARGS=-XX:MaxPermSize=64m -Xss256K -Xmx64m -Xms64m
> >>>> >
> >>>> >
> >>>> >
> >>>> >Normally we use the default solrconfig.xml file with only the
> following
> >>>> jar
> >>>> >file references added:
> >>>> >
> >>>> >
> >>>> >
> >>>> ><lib path="../../../../default/lib/common.jar" />
> >>>> >
> >>>> ><lib path="../../../../default/lib/webapp.jar" />
> >>>> >
> >>>> ><lib path="../../../../default/lib/commons-pool-1.4.jar" />
> >>>> >
> >>>> >
> >>>> >
> >>>> >Using these values and trying to index 6 million records from the
> >>>> database,
> >>>> >the Java Heap Out of Memory exception is thrown very quickly.
> >>>> >
> >>>> >
> >>>> >
> >>>> >We were able to complete a successful indexing by further modifying
> the
> >>>> >solrconfig.xml and removing all or all but one <copyfield> tags from
> the
> >>>> >schema.xml file.
> >>>> >
> >>>> >
> >>>> >
> >>>> >The following solrconfig.xml values were modified:
> >>>> >
> >>>> >
> >>>> >
> >>>> ><ramBufferSizeMB>6</ramBufferSizeMB>
> >>>> >
> >>>> >
> >>>> >
> >>>> ><mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
> >>>> >
> >>>> ><int name="maxMergeAtOnce">2</int>
> >>>> >
> >>>> ><int name="maxMergeAtOnceExplicit">2</int>
> >>>> >
> >>>> ><int name="segmentsPerTier">10</int>
> >>>> >
> >>>> ><int name="maxMergedSegmentMB">150</int>
> >>>> >
> >>>> ></mergePolicy>
> >>>> >
> >>>> >
> >>>> >
> >>>> ><autoCommit>
> >>>> >
> >>>> ><maxDocs>15000</maxDocs>  <!-- This tag was maxTime, before this -->
> >>>> >
> >>>> ><openSearcher>false</openSearcher>
> >>>> >
> >>>> ></autoCommit>
> >>>> >
> >>>> >
> >>>> >
> >>>> >Using our customized schema.xml file with two or more <copyfield> tags,
> >>>> >the OOM exception is always thrown.  Based on the errors, the problem
> >>>> >occurs when the process is trying to do the merge.  The error is
> >>>> >provided below:
> >>>> >
> >>>> >
> >>>> >
> >>>> >Exception in thread "Lucene Merge Thread #156"
> >>>> >org.apache.lucene.index.MergePolicy$MergeException: java.lang.OutOfMemoryError: Java heap space
> >>>> >        at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:541)
> >>>> >        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:514)
> >>>> >Caused by: java.lang.OutOfMemoryError: Java heap space
> >>>> >        at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:180)
> >>>> >        at org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:146)
> >>>> >        at org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)
> >>>> >        at org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:259)
> >>>> >        at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:233)
> >>>> >        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:137)
> >>>> >        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3693)
> >>>> >        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3296)
> >>>> >        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:401)
> >>>> >        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:478)
> >>>> >
> >>>> >Mar 12, 2014 12:17:40 AM org.apache.solr.common.SolrException log
> >>>> >SEVERE: auto commit error...:java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit
> >>>> >        at org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:3971)
> >>>> >        at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2744)
> >>>> >        at org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2827)
> >>>> >        at org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2807)
> >>>> >        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:536)
> >>>> >        at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
> >>>> >        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>> >        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> >>>> >        at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> >>>> >        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
> >>>> >        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
> >>>> >        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> >>>> >        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> >>>> >        at java.lang.Thread.run(Thread.java:722)
> >>>> >
> >>>> >
> >>>> >
> >>>> >We think but are not 100% sure that the problem is related to the
> merge.
> >>>> >
> >>>> >
> >>>> >
> >>>> >Normally our schema.xml contains a lot of field specifications (like
> the
> >>>> >ones seen in the file fragment below):
> >>>> >
> >>>> >
> >>>> >
> >>>> ><copyField source="ADDRESS.RECORD_ID_abc"
> >>>> dest="ADDRESS.RECORD_ID.case_abc"
> >>>> >/>
> >>>> >
> >>>> ><copyField source="ADDRESS.RECORD_ID_abc"
> >>>> >dest="ADDRESS.RECORD_ID.case.soundex_abc" />
> >>>> >
> >>>> ><copyField source="ADDRESS.RECORD_ID_abc"
> >>>> >dest="ADDRESS.RECORD_ID.case_nvl_abc" />
> >>>> >
> >>>> >
> >>>> >
> >>>> >In tests using the default file schema.xml and no <copyfield> tags,
> >>>> >indexing completed successfully.  6 million records produced a 900 MB
> >>>> data
> >>>> >directory.
> >>>> >
> >>>> >
> >>>> >
> >>>> >When I included just one <copyfield> tag, indexing completed
> >>>> successfully.  6
> >>>> >million records produced a 990 MB data directory (90 MB bigger).
> >>>> >
> >>>> >
> >>>> >
> >>>> >When I included just two <copyfield> tags, the index crashed with an
> OOM
> >>>> >exception.
> >>>> >
> >>>> >
> >>>> >
> >>>> >Changing parameters like maxMergedSegmentMB or maxDocs, only
> postponed
> >>>> the
> >>>> >crash.
> >>>> >
> >>>> >
> >>>> >
> >>>> >The net of our test results is as follows:
> >>>> >
> >>>> >
> >>>> >
> >>>> >solrconfig.xml                   | schema.xml                       | result
> >>>> >---------------------------------+----------------------------------+--------
> >>>> >default plus only jar references | default (no copyfield tags)      | success
> >>>> >default plus only jar references | modified with one copyfield tag  | success
> >>>> >default plus only jar references | modified with two copyfield tags | crash
> >>>> >additional modified settings     | default (no copyfield tags)      | success
> >>>> >additional modified settings     | modified with one copyfield tag  | success
> >>>> >additional modified settings     | modified with two copyfield tags | crash
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> >
> >>>> >Our question is, what can we do to eliminate these OOM exceptions?
> >>>> >
> >>>> >
> >>>>
> >>>
> >>>
> >>
> >>
> >
>
