On 3/17/2014 12:39 PM, solr2020 wrote:
Previously we faced OOM errors when we tried to index 1.2M records in a single pass. We have now divided the data into two chunks and index them separately. We no longer get OOM, but heap usage is still high, so we are analyzing it to find the cause and make sure we don't hit OOM again.
How are you indexing? A previous message you sent to the mailing list indicates that your source is a DB table.
If that's true, can you share the dataSource section(s) from your dataimport handler configuration? You might be running into a situation where DIH is retrieving the entire dataset via JDBC.
For the MySQL JDBC driver, you can avoid this by setting the batchSize parameter to -1. This causes the driver to stream results from the server rather than reading the entire result set into memory. Other JDBC drivers may need different settings.
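As a rough sketch of what that looks like in the DIH config (the driver class, URL, and credentials here are placeholders; adjust them for your setup):

```xml
<!-- data-config.xml: batchSize="-1" tells the MySQL JDBC driver
     to stream rows instead of buffering the whole result set -->
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb"
            user="dbuser"
            password="dbpass"
            batchSize="-1"/>
```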
http://mysolr.com/tips/dataimporthandler-runs-out-of-memory-on-large-table/

Thanks,
Shawn