DIH does not modify the SQL. This value is used as a connection property. --Noble
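To illustrate the point: at the JDBC level, a batch-size hint like DIH's ends up as a call to `Statement.setFetchSize()` rather than being spliced into the SQL text. The sketch below is illustrative only (the class and helper names are invented, not DIH's actual code); it also encodes the MySQL Connector/J 5.x quirk discussed later in this thread, where only `Integer.MIN_VALUE` enables row-by-row streaming.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeSketch {

    // Maps a configured batch size to the fetch size handed to the driver.
    // MySQL Connector/J 5.x streams row-by-row only when the fetch size is
    // exactly Integer.MIN_VALUE; -1 does not work.
    static int effectiveFetchSize(int batchSize) {
        return batchSize > 0 ? batchSize : Integer.MIN_VALUE;
    }

    // Illustrative helper (not DIH's real implementation): the batch size
    // is applied as a Statement property, leaving the SQL untouched.
    static Statement streamingStatement(Connection conn, int batchSize)
            throws SQLException {
        Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
        stmt.setFetchSize(effectiveFetchSize(batchSize));
        return stmt;
    }
}
```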
On Wed, Jun 25, 2008 at 4:40 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
> I'm assuming, of course, that the DIH doesn't automatically modify the SQL
> statement according to the batch size.
>
> -Grant
>
> On Jun 25, 2008, at 7:05 AM, Grant Ingersoll wrote:
>
>> I think it's a bit different. I ran into this exact problem about two
>> weeks ago on a 13 million record DB. MySQL doesn't honor the fetch size
>> for its v5 JDBC driver.
>>
>> See
>> http://www.databasesandlife.com/reading-row-by-row-into-java-from-mysql/
>> or do a search for MySQL fetch size.
>>
>> You actually have to do setFetchSize(Integer.MIN_VALUE) (-1 doesn't work)
>> in order to get streaming in MySQL.
>>
>> -Grant
>>
>> On Jun 24, 2008, at 10:35 PM, Shalin Shekhar Mangar wrote:
>>
>>> Setting the batchSize to 10000 would mean that the JDBC driver will keep
>>> 10000 rows in memory *for each entity* which uses that data source (if
>>> correctly implemented by the driver). Not sure how well the SQL Server
>>> driver implements this. Also keep in mind that Solr also needs memory to
>>> index documents. You can probably try setting the batch size to a lower
>>> value.
>>>
>>> The regular memory tuning stuff should apply here too -- try disabling
>>> autoCommit and turning off autowarming and see if it helps.
>>>
>>> On Wed, Jun 25, 2008 at 5:53 AM, wojtekpia <[EMAIL PROTECTED]> wrote:
>>>
>>>> I'm trying to load ~10 million records into Solr using the
>>>> DataImportHandler. I'm running out of memory
>>>> (java.lang.OutOfMemoryError: Java heap space) as soon as I try loading
>>>> more than about 5 million records.
>>>>
>>>> Here's my configuration: I'm connecting to a SQL Server database using
>>>> the sqljdbc driver. I've given my Solr instance 1.5 GB of memory. I
>>>> have set the dataSource batchSize to 10000. My SQL query is
>>>> "select top XXX field1, ... from table1". I have about 40 fields in my
>>>> Solr schema.
>>>>
>>>> I thought the DataImportHandler would stream data from the DB rather
>>>> than loading it all into memory at once. Is that not the case? Any
>>>> thoughts on how to get around this (aside from getting a machine with
>>>> more memory)?
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/DataImportHandler-running-out-of-memory-tp18102644p18102644.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
>>
>> --------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com
>>
>> Lucene Helpful Hints:
>> http://wiki.apache.org/lucene-java/BasicsOfPerformance
>> http://wiki.apache.org/lucene-java/LuceneFAQ
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>
> Lucene Helpful Hints:
> http://wiki.apache.org/lucene-java/BasicsOfPerformance
> http://wiki.apache.org/lucene-java/LuceneFAQ

--
--Noble Paul
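For anyone hitting the same OutOfMemoryError with MySQL: a data-config.xml dataSource along these lines is the conventional fix, since DIH is generally understood to translate batchSize="-1" into setFetchSize(Integer.MIN_VALUE) on the statement. This is a sketch; the URL, credentials, and driver class are placeholders to adapt to your setup.

```
<!-- Sketch of a DIH dataSource for streaming from MySQL.
     batchSize="-1" requests row-by-row streaming; url, user, and
     password below are placeholders, not working values. -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost/mydb"
            user="user"
            password="password"
            batchSize="-1"/>
```

For SQL Server's sqljdbc driver the right value is less clear-cut; lowering batchSize from 10000 (and disabling autoCommit and autowarming, as suggested above) is the safer first step.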