This is a bug in MySQL. Try setting the fetch size of the Statement on
the connection to Integer.MIN_VALUE.
See http://forums.mysql.com/read.php?39,137457 amongst a host of other
discussions on the subject. Basically, the driver tries to load all the
rows into memory; the only alternative is to set the fetch size to
Integer.MIN_VALUE so that it fetches one row at a time. I've hit this
one myself, and it isn't caused by the DataImportHandler but by the
MySQL JDBC driver.
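For anyone who wants to see the workaround in plain JDBC, here is a minimal sketch. It assumes MySQL Connector/J, which as far as I know only streams when the statement is forward-only and read-only with the fetch size set to Integer.MIN_VALUE. The class and method names (StreamingFetch, configure, streamingStatement) are made up for illustration and are not part of Solr or the driver.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper, not part of Solr or Connector/J.
final class StreamingFetch {

    private StreamingFetch() {}

    // Asks the driver to stream rows instead of buffering the whole result set.
    static Statement configure(Statement stmt) throws SQLException {
        // Integer.MIN_VALUE is the special value Connector/J reads as "row by row".
        stmt.setFetchSize(Integer.MIN_VALUE);
        return stmt;
    }

    // Creates a forward-only, read-only statement (assumed to be required
    // for streaming) and applies the fetch size hint to it.
    static Statement streamingStatement(Connection conn) throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        return configure(stmt);
    }
}
```

With a real connection you would call StreamingFetch.streamingStatement(conn), run executeQuery on the result, and iterate the ResultSet as usual. Keep in mind that a streaming result set should be read to the end or closed before the same connection is used for another query.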
-Grant
On Jun 24, 2008, at 8:23 PM, wojtekpia wrote:
I'm trying to load ~10 million records into Solr using the
DataImportHandler. I'm running out of memory
(java.lang.OutOfMemoryError: Java heap space) as soon as I try loading
more than about 5 million records.
Here's my configuration:
I'm connecting to a SQL Server database using the sqljdbc driver. I've
given my Solr instance 1.5 GB of memory. I have set the dataSource
batchSize to 10000. My SQL query is "select top XXX field1, ... from
table1". I have about 40 fields in my Solr schema.
I thought the DataImportHandler would stream data from the DB rather
than loading it all into memory at once. Is that not the case? Any
thoughts on how to get around this (aside from getting a machine with
more memory)?
--
View this message in context:
http://www.nabble.com/DataImportHandler-running-out-of-memory-tp18102644p18102644.html
--------------------------
Grant Ingersoll
http://www.lucidimagination.com
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ