Does this help? http://wiki.apache.org/solr/DataImportHandlerFaq#I.27m_using_DataImportHandler_with_a_MySQL_database._My_table_is_huge_and_DataImportHandler_is_going_out_of_memory._Why_does_DataImportHandler_bring_everything_to_memory.3F
On Wed, Oct 28, 2009 at 12:38 AM, William Pierce <evalsi...@hotmail.com> wrote:

> Hi, Gilbert:
>
> Thanks for your tip! I just tried it. Unfortunately, it does not work for
> me. I still get the OOM exception.
>
> How large was your dataset? And what were your machine specs?
>
> Cheers,
>
> - Bill
>
> --------------------------------------------------
> From: "Gilbert Boyreau" <gboyr...@andevsol.com>
> Sent: Tuesday, October 27, 2009 11:54 AM
> To: <solr-user@lucene.apache.org>
> Subject: Re: DIH out of memory exception
>
>> Hi,
>>
>> I got the same problem using DIH with a large dataset in a MySQL database.
>>
>> Following
>> http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-implementation-notes.html
>> and looking at the Java code, it appears that DIH uses a PreparedStatement in
>> the JdbcDataSource.
>>
>> I set the batchSize parameter to -1 and it solved my problem.
>>
>> Regards,
>> Gilbert
>>
>> William Pierce wrote:
>>
>>> Folks:
>>>
>>> My db contains approx 6M records -- on average each is approx 1K bytes.
>>> When I use the DIH, I reliably get an OOM exception. The machine has 4 GB
>>> RAM, and my Tomcat is set to a max heap of 2 GB.
>>> The option of increasing memory is not tenable because, as the number of
>>> documents grows, I will be back in this situation.
>>> Is there a way to batch the documents? I tried setting the batchSize
>>> parameter to 500 on the <dataSource> tag where I specify the JDBC
>>> parameters. This had no effect.
>>>
>>> Best,
>>>
>>> - Bill
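
For reference, a minimal data-config.xml sketch of what the FAQ and Gilbert's fix describe is shown below. The driver class, URL, credentials, and query are illustrative placeholders, not taken from Bill's setup. With batchSize="-1", DIH's JdbcDataSource requests streaming from the MySQL Connector/J driver (it sets the statement fetch size to Integer.MIN_VALUE), so rows are read one at a time instead of the whole result set being buffered in memory; this behavior is specific to the MySQL driver.

<dataConfig>
  <!-- batchSize="-1" makes DIH ask the MySQL driver to stream rows
       (fetch size Integer.MIN_VALUE) rather than buffer the full result set.
       Driver, url, user, password, and query are placeholder values. -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              user="solr"
              password="secret"
              batchSize="-1"/>
  <document>
    <entity name="item" query="SELECT id, title, body FROM items">
      <field column="id"    name="id"/>
      <field column="title" name="title"/>
      <field column="body"  name="body"/>
    </entity>
  </document>
</dataConfig>

Note that the attribute name is batchSize (capital S); a misspelled attribute such as batchsize is silently ignored, which may be why setting it to 500 appeared to have no effect.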