On 10/14/2012 5:45 PM, Erick Erickson wrote:
About your second point. Try committing more often with openSearcher
set to false.
There's a bit here:
http://wiki.apache.org/solr/SolrConfigXml

     <autoCommit>
       <maxDocs>10000</maxDocs> <!-- maximum uncommitted docs before autocommit triggered -->
       <maxTime>15000</maxTime> <!-- maximum time (in MS) after adding a doc before an autocommit is triggered -->
       <openSearcher>false</openSearcher> <!-- SOLR 4.0.  Optionally don't open a searcher on hard commit.  This is useful to minimize the size of transaction logs that keep track of uncommitted updates. -->
     </autoCommit>


That should keep the size of the transaction log down to reasonable levels...

I have autocommit turned completely off -- both values (maxDocs and maxTime) set to zero. The DIH import from MySQL, over 12 million rows per shard, is done in one go on all my build cores at once, then I swap cores. It takes a little over three hours and produces a 22GB index. I have batchSize set to -1 so that the JDBC driver streams the records.
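
For reference, a minimal sketch of what the streaming part of such a DIH data-config.xml might look like. The host, database name, credentials, entity name, and query are placeholders, not the actual setup; the only part taken from the description above is batchSize="-1" on the data source, which with the MySQL driver asks it to stream rows rather than buffer the full result set.

```xml
<dataConfig>
  <!-- batchSize="-1" is the setting described above; everything else
       here (URL, user, password) is a hypothetical placeholder -->
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://dbhost/exampledb"
              user="solr"
              password="secret"
              batchSize="-1"/>
  <document>
    <!-- hypothetical entity and query, for illustration only -->
    <entity name="doc" query="SELECT id, title, body FROM documents"/>
  </document>
</dataConfig>
```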

When I first set this up back on 1.4.1, I had some kind of severe problem when autocommit was turned on. I can no longer remember what it caused, but it was a huge showstopper of some kind.

Thanks,
Shawn
