I have uploaded several files, including the problem description with graphics, to this link on Google Drive:
https://drive.google.com/folderview?id=0B7UpFqsS5lSjWEhxRE1NN2tMNTQ&usp=sharing

I shared it with the address "solr-user@lucene.apache.org", so I am hoping it can be accessed by people in the group.

On Fri, Apr 18, 2014 at 5:15 PM, Candygram For Mongo <
candygram.for.mo...@gmail.com> wrote:

> I have lots of log files and other files to support this issue (sometimes
> referenced in the text below), but I am not sure of the best way to submit
> them. I don't want to overwhelm the list, and I am not sure whether this
> email will accept graphs and charts. Please provide direction and I will
> send them.
>
> *Issue Description*
>
> We are getting Out Of Memory errors when we try to execute a full import
> using the Data Import Handler. This error originally occurred in a
> production environment with a database containing 27 million records. Heap
> memory was configured at 6GB and the server had 32GB of physical memory.
> We have been able to replicate the error on a local system with 6 million
> records. We set the heap size to 64MB to accelerate the error replication.
> The indexing process has been failing in different scenarios; we have 9
> test cases documented. In some of the test cases we increased the heap
> size to 128MB. In our first test case we set heap memory to 512MB, which
> also failed.
>
> *Environment Values Used*
>
> *SOLR/Lucene version:* 4.2.1
>
> *JVM version:*
>
> Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
> Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
>
> *Indexer startup command:*
>
> set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx128m -Xms128m
>
> java %JVMARGS% ^
>     -Dcom.sun.management.jmxremote.port=1092 ^
>     -Dcom.sun.management.jmxremote.ssl=false ^
>     -Dcom.sun.management.jmxremote.authenticate=false ^
>     -jar start.jar
>
> *SOLR indexing HTTP request parameters:*
>
> webapp=/solr path=/dataimport
> params={clean=false&command=full-import&wt=javabin&version=2}
>
> The information we use for the database retrieval using the Data Import
> Handler is as follows:
>
> <dataSource
>     name="org_only"
>     type="JdbcDataSource"
>     driver="oracle.jdbc.OracleDriver"
>     url="jdbc:oracle:thin:@{server name}:1521:{database name}"
>     user="{username}"
>     password="{password}"
>     readOnly="false"
> />
>
> *The Query (simple, single table)*
>
> select
>
>     NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(100)), 'null') as SOLR_ID,
>
>     'STU.ACCT_ADDRESS_ALL' as SOLR_CATEGORY,
>
>     NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(255)), ' ') as ADDRESSALLRID,
>     NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_TYPE as varchar2(255)), ' ') as ADDRESSALLADDRTYPECD,
>     NVL(cast(STU.ACCT_ADDRESS_ALL.LONGITUDE as varchar2(255)), ' ') as ADDRESSALLLONGITUDE,
>     NVL(cast(STU.ACCT_ADDRESS_ALL.LATITUDE as varchar2(255)), ' ') as ADDRESSALLLATITUDE,
>     NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_NAME as varchar2(255)), ' ') as ADDRESSALLADDRNAME,
>     NVL(cast(STU.ACCT_ADDRESS_ALL.CITY as varchar2(255)), ' ') as ADDRESSALLCITY,
>     NVL(cast(STU.ACCT_ADDRESS_ALL.STATE as varchar2(255)), ' ') as ADDRESSALLSTATE,
>     NVL(cast(STU.ACCT_ADDRESS_ALL.EMAIL_ADDR as varchar2(255)), ' ') as ADDRESSALLEMAILADDR
>
> from STU.ACCT_ADDRESS_ALL
>
> You can see this information in the database.xml file.
>
> Our main solrconfig.xml file contains the following differences compared
> to a freshly downloaded solrconfig.xml file (the original content):
>
> <config>
>     <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
>
>     <!-- Our libraries containing customized filters -->
>     <lib path="../../../../default/lib/common.jar" />
>     <lib path="../../../../default/lib/webapp.jar" />
>     <lib path="../../../../default/lib/commons-pool-1.4.jar" />
>
>     <abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>
>
>     <directoryFactory name="DirectoryFactory"
>         class="org.apache.solr.core.StandardDirectoryFactory" />
>
>     <requestHandler name="/dataimport"
>         class="org.apache.solr.handler.dataimport.DataImportHandler">
>         <lst name="defaults">
>             <str name="config">database.xml</str>
>         </lst>
>     </requestHandler>
> </config>
>
> *Custom Libraries*
>
> The common.jar contains customized TokenFilterFactory implementations
> that we use for indexing. These classes do some special treatment of the
> fields read from the database. The webapp.jar file contains other related
> classes. The commons-pool-1.4.jar is an Apache API used for instance
> reuse.
>
> The logic used in the TokenFilterFactory classes is contained in the
> following files:
>
> ConcatFilterFactory.java
> ConcatFilter.java
> MDFilterSchemaFactory.java
> MDFilter.java
> MDFilterPoolObjectFactory.java
> NullValueFilterFactory.java
> NullValueFilter.java
>
> How we use them is described in the schema.xml file.
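> For reference, the way such custom filter factories and copyField tags
> combine in a schema.xml typically looks like the sketch below. The type,
> field, and package names here are hypothetical placeholders, not our
> actual definitions (those are in the attached files):
>
> <!-- Hypothetical sketch only; real names are in the attached schema.xml -->
> <fieldType name="text_custom" class="solr.TextField">
>     <analyzer>
>         <tokenizer class="solr.KeywordTokenizerFactory"/>
>         <filter class="com.example.NullValueFilterFactory"/>
>         <filter class="com.example.ConcatFilterFactory"/>
>     </analyzer>
> </fieldType>
>
> <field name="ADDRESSALLCITY" type="text_custom" indexed="true" stored="true"/>
> <field name="percentmatchwordonly" type="text_custom" indexed="true" stored="false"/>
>
> <!-- Each copyField duplicates the source value into another analyzed
>      field, so every extra copyField adds per-document indexing work -->
> <copyField source="ADDRESSALLCITY" dest="percentmatchwordonly"/>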
>
> We have been experimenting with the following configuration values:
>
> maxIndexingThreads
> ramBufferSizeMB
> maxBufferedDocs
> mergePolicy
>     maxMergeAtOnce
>     segmentsPerTier
>     maxMergedSegmentMB
> autoCommit
>     maxDocs
>     maxTime
> autoSoftCommit
>     maxTime
>
> Using numerous combinations of these values, the indexing fails.
>
> *IMPORTANT NOTE*
>
> When we disable all of the copyField tags contained in the schema.xml
> file, or all but relatively few, the indexing completes successfully (see
> Test Case 1).
>
> *TEST CASES*
>
> All of the test cases have been analyzed with the Visual VM tool. All
> SOLR configuration files and indexer log content are in the test case
> directories included in a zip file. We have included the most relevant
> screenshots. Test Case 2 is the only one that includes the thread dump.
>
> *Test Case 1*
>
> JVM arguments = -XX:MaxPermSize=364m -Xss256K -Xmx512m -Xms512m
>
> Results:
>     Indexing status: Completed
>     Time taken: 1:8:32.519
>     Error detail: NO ERROR
>     Index data directory size = 995 MB
>
> *Test Case 2*
>
> JVM arguments = -XX:MaxPermSize=364m -Xss256K -Xmx512m -Xms512m
>
> Results:
>     Indexing status: Failed
>     Time taken: 1:30 hours
>     Error detail: GC overhead limit exceeded (see log file for details)
>     Index data directory size = 3.48 GB
>     Memory dump file: heapdump-1397500800627.hprof
>
> The indexer was running with the javamelody profiling tool configured,
> but Visual VM is where the crash was captured. I mention this because
> there are some messages in the log referring to that tool. In the
> following picture, you can see that the CPU processing increases before
> the crash.
>
> [image: overview]
>
> In the following figure we can see the memory used per thread.
>
> [image: thread dump - summary]
>
> [image: threads summary]
>
> *Test Case 3*
>
> JVM arguments = -XX:MaxPermSize=364m -Xss256K -Xmx128m -Xms128m
>
> Results:
>     Indexing status: Failed (ROLLBACK, see log)
>     Error detail: java.lang.OutOfMemoryError: Java heap space (see log file for details)
>     Time taken: 20 minutes
>     Index data directory size = 647MB
>     Memory dump file: heapdump-1397579134605.hprof
>
> *Test Case 4 (default solrconfig with 1 copyField tag in the schema.xml)*
>
> Results:
>     Indexing status: Completed successfully
>     Error detail: No errors
>     Time taken: 1:19:54
>     Index data directory size = 929MB
>
> *Test Case 5 (default solrconfig with 2 copyField tags in the schema.xml)*
>
> Results:
>     Indexing status: Completed successfully
>     Error detail: No errors
>     Time taken: 1:8:52.291
>     Index data directory size = 1.81 GB
>
> *Test Case 6 (default solrconfig with 3 copyField tags in the schema.xml)*
>
> Results:
>     Indexing status: Completed successfully
>     Error detail: No errors
>     Time taken: 1:0:52.624
>     Index data directory size = 953 MB
>
> *Test Case 7 (default solrconfig with 18 copyField tags in the schema.xml)*
>
> Results:
>     Indexing status: Failed (rollback)
>     Error detail: java.lang.OutOfMemoryError: Java heap space
>     Time taken: 43 minutes approx.
>     Index data directory size = 1.30 GB
>
> *Test Case 8 (default solrconfig with 16 copyField tags in the schema.xml)*
>
> Results:
>     Indexing status: Completed
>     Error detail: No error
>     Time taken: 1:10:48.641
>     Index data directory size = 2.11 GB
>
> *Test Case 9 (default solrconfig with 15 copyField tags in the schema.xml,
> but including the percentmatchwordonly+soundex)*
>
> Results:
>     Indexing status: Failed (rollback)
>     Error detail: java.lang.OutOfMemoryError: Java heap space
>     Time taken: 1:04 approx.
>     Index data directory size = 1.90 GB
>
> *Test Case 10 (default solrconfig with 2 copyField tags in the schema.xml,
> but including the percentmatchwordonly_abc +
> percentmatchanyword.soundex_abc)*
>
> Results:
>     Indexing status: Completed
>     Error detail: None
>     Time taken: 1:04 approx.
>     Index data directory size = 0.99 GB
>
> The graphic is not included because it looks very similar to the others.
>
> *Conclusions*
>
> Two threads are consuming the most memory. We saw these values in the
> Per Thread Allocations view in Java Visual VM:
>
> commitScheduler thread
> Lucene Merge thread
>
> Note how the 'Allocated Bytes/sec' needed exceeds the configured heap
> size. We understand that the memory shown in that column doesn't mean
> that all of that memory exists in the heap at once, but it appears that
> Solr or Lucene can handle more memory than what is assigned to the
> process.
>
> With 128MB of heap size configured, there were times when the
> commitScheduler thread was allocating 400MB, while at other times the
> Lucene Merge thread was allocating around 1.5 GB of memory. So why the
> heap memory problem? At the time the memory error was thrown, neither
> thread's allocation was at its maximum memory consumption.
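> One note on reading those numbers: Visual VM's per-thread "Allocated
> Bytes" is cumulative allocation, most of which is short-lived garbage
> that the collector reclaims, so it can legitimately exceed -Xmx without
> causing an OOM. The standalone sketch below (not Solr code) illustrates
> the difference between cumulative allocation and the live heap:

```java
import java.lang.management.ManagementFactory;

public class AllocationVsHeap {

    // Allocate `chunks` short-lived 1 MB arrays; each becomes garbage
    // immediately, so cumulative allocation can exceed -Xmx while the
    // live heap stays small. Returns total bytes allocated.
    static long allocate(int chunks) {
        long total = 0;
        for (int i = 0; i < chunks; i++) {
            byte[] chunk = new byte[1024 * 1024];
            chunk[0] = 1; // touch the array so the allocation is real
            total += chunk.length;
        }
        return total;
    }

    public static void main(String[] args) {
        long allocated = allocate(1024); // 1 GB allocated in total
        long liveHeap = ManagementFactory.getMemoryMXBean()
                .getHeapMemoryUsage().getUsed();
        System.out.println("cumulative allocation: "
                + allocated / (1024 * 1024) + " MB");
        System.out.println("live heap afterwards:  "
                + liveHeap / (1024 * 1024) + " MB");
    }
}
```

> An OOM is thrown only when the *live* set (for example, documents being
> buffered across many copyField targets plus merge buffers) cannot fit in
> the heap, which would be consistent with the copyField correlation the
> test cases show.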