Hi,

I am having trouble with a DIH full-import.

I have a database with 3 million records, which will translate into an index
size of 6 GB.

When I try to run a full import, I get an out-of-memory error like this:

INFO: Starting Full Import
May 10, 2010 11:44:06 PM org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
WARNING: Unable to read: dataimport.properties
May 10, 2010 11:44:06 PM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
May 10, 2010 11:44:06 PM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=1
commit{dir=/home/search/SOLR/solr/data/index,segFN=segments_1,version=1273549043650,generation=1,filenames=[segments_1]
May 10, 2010 11:44:06 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 1273549043650
May 10, 2010 11:44:06 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Creating a connection for entity offer with URL: jdbc:mysql://domU-12-31-39-10-59-01.compute-1.internal/jounce1
May 10, 2010 11:44:07 PM org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO: Time taken for getConnection(): 301



Exception in thread "Timer-1" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.newValueIterator(HashMap.java:843)
    at java.util.HashMap$Values.iterator(HashMap.java:910)
    at org.mortbay.jetty.servlet.HashSessionManager.scavenge(HashSessionManager.java:180)
    at org.mortbay.jetty.servlet.HashSessionManager.access$000(HashSessionManager.java:36)
    at org.mortbay.jetty.servlet.HashSessionManager$1.run(HashSessionManager.java:144)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
May 10, 2010 11:54:54 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: java.lang.OutOfMemoryError: Java heap space
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:424)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at com.mysql.jdbc.MysqlIO.nextRowFast(MysqlIO.java:1621)
    at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1398)
    at com.mysql.jdbc.MysqlIO.readSingleRowSet(MysqlIO.java:2816)
    at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:467)
    at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:2510)
    at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1746)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2135)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2536)
    at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2465)
    at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:734)
    at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246)
    at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
    at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:71)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357)
    ... 5 more
May 10, 2010 11:54:54 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback
May 10, 2010 11:54:54 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: end_rollback




I tried allocating 4 GB of memory to the VM, but no luck.
Are the records cached before indexing, or are they streamed?
Any pointers to documentation?
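
One thing the trace seems to point at: the innermost frames are in
com.mysql.jdbc.MysqlIO.readAllResults, which suggests the MySQL driver is
reading the whole result set into the heap before DIH sees a single row. As
far as I understand, JdbcDataSource accepts batchSize="-1" and passes it to
the driver as a fetch size of Integer.MIN_VALUE, which asks Connector/J to
stream rows one at a time. An untested sketch of what that might look like in
data-config.xml follows. The URL and entity name are taken from the log above;
the user, password, and query are placeholders, not my real config.

```xml
<dataConfig>
  <!-- batchSize="-1" is the only setting in question here;
       USER and PASSWORD are placeholders -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://domU-12-31-39-10-59-01.compute-1.internal/jounce1"
              user="USER"
              password="PASSWORD"
              batchSize="-1"/>
  <document>
    <!-- entity name comes from the log; the query is a hypothetical stand-in -->
    <entity name="offer" query="SELECT * FROM offer"/>
  </document>
</dataConfig>
```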

Thanks in anticipation,
umar
