I have lots of log files and other files to support this issue (some are referenced in the text below), but I am not sure of the best way to submit them. I don't want to overwhelm the list, and I am not sure whether this email will accept graphs and charts. Please provide direction and I will send them.
*Issue Description*

We are getting Out Of Memory errors when we try to execute a full import using the Data Import Handler. The error originally occurred on a production environment with a database containing 27 million records; heap memory was configured at 6GB and the server had 32GB of physical memory. We have been able to replicate the error on a local system with 6 million records, setting the heap size to 64MB to accelerate the replication. The indexing process has been failing in different scenarios, and we have 9 test cases documented. In some of the test cases we increased the heap size to 128MB, and in our first test runs we set heap memory to 512MB, which also failed (see Test Case 2).

*Environment Values Used*

SOLR/Lucene version: 4.2.1

JVM version: Java(TM) SE Runtime Environment (build 1.7.0_07-b11), Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)

Indexer startup command:

    set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx128m -Xms128m
    java %JVMARGS% ^
        -Dcom.sun.management.jmxremote.port=1092 ^
        -Dcom.sun.management.jmxremote.ssl=false ^
        -Dcom.sun.management.jmxremote.authenticate=false ^
        -jar start.jar

SOLR indexing HTTP request parameters:

    webapp=/solr path=/dataimport params={clean=false&command=full-import&wt=javabin&version=2}

The data source we use for the database retrieval with the Data Import Handler is defined as follows:

    <dataSource name="org_only" type="JdbcDataSource"
                driver="oracle.jdbc.OracleDriver"
                url="jdbc:oracle:thin:@{server name}:1521:{database name}"
                user="{username}" password="{password}"
                readOnly="false" />

*The Query (simple, single table)*

    select
        NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(100)), 'null') as SOLR_ID,
        'STU.ACCT_ADDRESS_ALL' as SOLR_CATEGORY,
        NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(255)), ' ') as ADDRESSALLRID,
        NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_TYPE as varchar2(255)), ' ') as ADDRESSALLADDRTYPECD,
        NVL(cast(STU.ACCT_ADDRESS_ALL.LONGITUDE as varchar2(255)), ' ') as ADDRESSALLLONGITUDE,
        NVL(cast(STU.ACCT_ADDRESS_ALL.LATITUDE as varchar2(255)), ' ') as ADDRESSALLLATITUDE,
        NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_NAME as varchar2(255)), ' ') as ADDRESSALLADDRNAME,
        NVL(cast(STU.ACCT_ADDRESS_ALL.CITY as varchar2(255)), ' ') as ADDRESSALLCITY,
        NVL(cast(STU.ACCT_ADDRESS_ALL.STATE as varchar2(255)), ' ') as ADDRESSALLSTATE,
        NVL(cast(STU.ACCT_ADDRESS_ALL.EMAIL_ADDR as varchar2(255)), ' ') as ADDRESSALLEMAILADDR
    from STU.ACCT_ADDRESS_ALL

You can see this information in the database.xml file.
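For completeness, the dataSource definition and the query above sit inside database.xml in the standard DIH data-config layout. A minimal sketch of that wiring follows; the entity name and the explicit field mappings here are illustrative (DIH also maps columns to schema fields implicitly by name), and the full query is elided:

    <dataConfig>
        <dataSource name="org_only" type="JdbcDataSource"
                    driver="oracle.jdbc.OracleDriver"
                    url="jdbc:oracle:thin:@{server name}:1521:{database name}"
                    user="{username}" password="{password}" readOnly="false" />
        <document>
            <!-- illustrative entity name; the query attribute holds the full select above -->
            <entity name="acct_address_all" dataSource="org_only"
                    query="select ... from STU.ACCT_ADDRESS_ALL">
                <field column="SOLR_ID" name="SOLR_ID" />
                <field column="ADDRESSALLCITY" name="ADDRESSALLCITY" />
            </entity>
        </document>
    </dataConfig>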
Our main solrconfig.xml file differs from a freshly downloaded solrconfig.xml (the original default content is attached as "solrconfig (default content).xml") as follows:

    <config>
        <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
        <!-- Our libraries containing customized filters -->
        <lib path="../../../../default/lib/common.jar" />
        <lib path="../../../../default/lib/webapp.jar" />
        <lib path="../../../../default/lib/commons-pool-1.4.jar" />
        <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>
        <directoryFactory name="DirectoryFactory" class="org.apache.solr.core.StandardDirectoryFactory" />
        <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
            <lst name="defaults">
                <str name="config">database.xml</str>
            </lst>
        </requestHandler>
    </config>

*Custom Libraries*

The common.jar contains customized TokenFilterFactory implementations that we use for indexing; they apply some special treatment to the fields read from the database, and how they are used is described in the schema.xml file. The webapp.jar file contains other related classes, and commons-pool-1.4.jar is an Apache library used for instance reuse. The logic used in the filter factories is contained in the following source files (attached):

    ConcatFilterFactory.java
    ConcatFilter.java
    MDFilterSchemaFactory.java
    MDFilter.java
    MDFilterPoolObjectFactory.java
    NullValueFilterFactory.java
    NullValueFilter.java

We have been experimenting with the following configuration values:

    maxIndexingThreads
    ramBufferSizeMB
    maxBufferedDocs
    mergePolicy (maxMergeAtOnce, segmentsPerTier, maxMergedSegmentMB)
    autoCommit (maxDocs, maxTime)
    autoSoftCommit (maxTime)

Using numerous combinations of these values, the indexing still fails.

*IMPORTANT NOTE*

When we disable all of the copyfield tags contained in the schema.xml file, or all but relatively few of them, the indexing completes successfully (see Test Case 1).
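For reference, the copyfield tags in question are ordinary schema.xml declarations of this shape; the source/dest names below are illustrative, borrowed from field names mentioned in the test cases, and the real declarations are in the attached schema.xml:

    <!-- copyField copies the raw source value into the destination field
         before analysis, so the destination's analyzer chain runs on it too;
         every additional tag adds per-document analysis work and index size. -->
    <copyField source="ADDRESSALLCITY" dest="percentmatchwordonly" />
    <copyField source="ADDRESSALLSTATE" dest="percentmatchanyword.soundex" />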
*TEST CASES*

All of the test cases have been analyzed with the VisualVM tool. All SOLR configuration files and indexer log content are in the test case directories included in a zip file. We have included the most relevant screenshots. Test Case 2 is the only one that includes a thread dump.

*Test Case 1*
JVM arguments = -XX:MaxPermSize=364m -Xss256K -Xmx512m -Xms512m
Results:
    Indexing status: Completed
    Time taken: 1:08:32.519
    Error detail: NO ERROR
    Index data directory size: 995 MB

*Test Case 2*
JVM arguments = -XX:MaxPermSize=364m -Xss256K -Xmx512m -Xms512m
Results:
    Indexing status: Failed
    Time taken: 1:30 hours
    Error detail: GC overhead limit exceeded (details in the attached log.txt)
    Index data directory size: 3.48 GB
    Memory dump file: heapdump-1397500800627.hprof (attached)

The indexer was running with the javamelody profiling tool configured, but VisualVM is where the crash was captured; I mention this because there are some messages in the log referring to that tool. In the first screenshot you can see that CPU usage increases before the crash. [image: overview] The following figures show the memory used per thread. [image: thread dump - summary] [image: threads summary]

*Test Case 3*
JVM arguments = -XX:MaxPermSize=364m -Xss256K -Xmx128m -Xms128m
Results:
    Indexing status: Failed (ROLLBACK, see log)
    Error detail: java.lang.OutOfMemoryError: Java heap space (details in the attached log.txt)
    Time taken: 20 minutes
    Index data directory size: 647 MB
    Memory dump file: heapdump-1397579134605.hprof (attached)

*Test Case 4 (default solrconfig with 1 copyfield tag in the schema.xml)*
Results:
    Indexing status: Completed successfully
    Error detail: No errors
    Time taken: 1:19:54
    Index data directory size: 929 MB

*Test Case 5 (default solrconfig with 2 copyfield tags in the schema.xml)*
Results:
    Indexing status: Completed successfully
    Error detail: No errors
    Time taken: 1:08:52.291
    Index data directory size: 1.81 GB

*Test Case 6 (default solrconfig with 3 copyfield tags in the schema.xml)*
Results:
    Indexing status: Completed successfully
    Error detail: No errors
    Time taken: 1:00:52.624
    Index data directory size: 953 MB

*Test Case 7 (default solrconfig with 18 copyfield tags in the schema.xml)*
Results:
    Indexing status: Failed (rollback)
    Error detail: java.lang.OutOfMemoryError: Java heap space
    Time taken: approx. 43 minutes
    Index data directory size: 1.30 GB

*Test Case 8 (default solrconfig with 16 copyfield tags in the schema.xml)*
Results:
    Indexing status: Completed
    Error detail: No error
    Time taken: 1:10:48.641
    Index data directory size: 2.11 GB

*Test Case 9 (default solrconfig with 15 copyfield tags in the schema.xml, including the percentmatchwordonly+soundex)*
Results:
    Indexing status: Failed (rollback)
    Error detail: java.lang.OutOfMemoryError: Java heap space
    Time taken: approx. 1:04
    Index data directory size: 1.90 GB

*Test Case 9b (default solrconfig with 2 copyfield tags in the schema.xml, including the percentmatchwordonly_abc + percentmatchanyword.soundex_abc)*
Results:
    Indexing status: Completed
    Error detail: None
    Time taken: approx. 1:04
    Index data directory size: 0.99 GB

The graphic is not included because it looks very similar to the others.

*Conclusions*

Two threads consume the most memory. We saw these values in the Per Thread Allocations view of Java VisualVM:

    commitScheduler thread
    Lucene Merge thread

Note how the 'Allocated Bytes/sec' needed exceeds the configured heap size. We understand that the memory shown in that column doesn't mean that all of that memory exists in the heap at once, but it appears that Solr or Lucene can handle more memory than what is assigned to the process. With 128MB of heap size configured, there were times when the commitScheduler thread was allocating 400MB, while at other times the Lucene Merge thread was allocating around 1.5 GB. So why the heap memory problem? At the time the memory error was thrown, neither thread was at its maximum memory consumption.
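One clarification on why 'Allocated Bytes/sec' can exceed the heap, supporting the point above: allocation is cumulative, while the heap only holds live objects at any moment, so a thread can allocate far more than the heap size without error as long as the garbage is collected in time. A trivial standalone illustration (hypothetical, unrelated to Solr internals):

    // Allocates ~1 GB in total while never holding more than ~1 MB live,
    // so it runs fine with e.g. java -Xmx64m AllocationDemo.
    public class AllocationDemo {
        public static void main(String[] args) {
            for (int i = 0; i < 1024; i++) {
                byte[] chunk = new byte[1024 * 1024]; // 1 MB, garbage after each pass
                chunk[0] = 1; // touch it so the allocation is not optimized away
            }
            System.out.println("Done: ~1 GB allocated cumulatively in a 64 MB heap.");
        }
    }

This is why the column can exceed the configured heap without, by itself, explaining the OutOfMemoryError.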