We have tried using fetchSize and we still got the same out of memory
errors.


On Fri, Apr 18, 2014 at 9:39 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 4/18/2014 6:15 PM, Candygram For Mongo wrote:
> > We are getting Out Of Memory errors when we try to execute a full import
> > using the Data Import Handler.  This error originally occurred on a
> > production environment with a database containing 27 million records.
> > Heap memory was configured for 6GB and the server had 32GB of physical
> > memory.  We have been able to replicate the error on a local system with
> > 6 million records.  We set the memory heap size to 64MB to accelerate the
> > error replication.  The indexing process has been failing in different
> > scenarios.  We have 9 test cases documented.  In some of the test cases
> > we increased the heap size to 128MB.  In our first test case we set heap
> > memory to 512MB which also failed.
>
> One characteristic of a JDBC connection is that unless you tell it
> otherwise, it will try to retrieve the entire resultset into RAM before
> any results are delivered to the application.  It's not Solr doing this,
> it's JDBC.
>
> In this case, there are 27 million rows in the resultset.  It's highly
> unlikely that this much data (along with the rest of Solr's memory
> requirements) will fit in 6GB of heap.
>
> JDBC has a built-in way to deal with this.  It's called fetchSize.  By
> using the batchSize parameter on your JdbcDataSource config, you can set
> the JDBC fetchSize.  Set it to something small, between 100 and 1000,
> and you'll probably get rid of the OOM problem.
>
> http://wiki.apache.org/solr/DataImportHandler#Configuring_JdbcDataSource
>
> If you had been using MySQL, I would have recommended that you set
> batchSize to -1.  This sets fetchSize to Integer.MIN_VALUE, which tells
> the MySQL driver to stream results instead of trying to either batch
> them or return everything.  I'm pretty sure that the Oracle driver
> doesn't work this way -- you would have to modify the dataimport source
> code to use their streaming method.
>
> Thanks,
> Shawn
>
>
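
For reference, here is a minimal sketch of what the batchSize advice above
amounts to at the plain JDBC level. It is not the Data Import Handler's
actual code; the class and method names (FetchSizeSketch, toFetchSize,
countRows) are made up for illustration, and the only behaviour taken from
the thread is that a batchSize of -1 becomes a fetchSize of
Integer.MIN_VALUE. Whether a given driver honours the hint (or accepts a
negative value at all) depends on the driver.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class FetchSizeSketch {

    // Hypothetical helper: turn a DIH-style batchSize into a JDBC fetchSize.
    // -1 is mapped to Integer.MIN_VALUE, the value the thread describes as
    // the MySQL driver's signal to stream rows. Other drivers may reject
    // negative fetch sizes, so use -1 only where the driver documents it.
    static int toFetchSize(int batchSize) {
        return batchSize == -1 ? Integer.MIN_VALUE : batchSize;
    }

    // Hypothetical helper: walk a query result one row at a time and return
    // the row count. The fetch size is only a hint for how many rows the
    // driver pulls from the database per round trip.
    static long countRows(Connection conn, String sql, int batchSize)
            throws SQLException {
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            stmt.setFetchSize(toFetchSize(batchSize));
            long rows = 0;
            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    rows++; // a real importer would build a document here
                }
            }
            return rows;
        }
    }
}
```

If the OOM persists with a small fetch size, as reported at the top of this
thread, that suggests the memory is being consumed somewhere other than the
JDBC result set buffering.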
