Typically, people set their (hard) autocommit settings in solrconfig.xml and 
forget about them. I usually use a time-based trigger rather than a 
document-count trigger.

If you were waiting until the end of your batch run (all 46M docs) to issue a 
commit, that’s an anti-pattern. Until you do a hard commit, all the incoming 
documents are held in the transaction log, see: 
https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
Setting the autocommit interval to, say, 15 seconds should give a flatter 
response time.
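
As a rough sketch, a time-based hard autocommit in the updateHandler section 
of solrconfig.xml might look like the fragment below. The 15000 ms value is 
just the 15-second example figure; openSearcher=false is an assumption on my 
part that you don't need newly indexed documents to be searchable after every 
hard commit during the bulk load. Adjust both to your setup.

```xml
<!-- Sketch only: goes inside <updateHandler class="solr.DirectUpdateHandler2"> -->
<autoCommit>
  <!-- Hard commit at most 15 seconds after the first uncommitted update (value in ms) -->
  <maxTime>15000</maxTime>
  <!-- Assumption: do not open a new searcher on each hard commit -->
  <openSearcher>false</openSearcher>
</autoCommit>
```

With something like this in place, the indexing client can drop commit=true 
from the individual batch requests and let Solr commit on its own schedule.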

For the Solr mailing list archives, see: 
http://lucene.apache.org/solr/community.html#mailing-lists-irc

Best,
Erick

> On Feb 18, 2019, at 10:03 AM, David '-1' Schmid <gdk...@gmail.com> wrote:
> 
> Hello!
> 
> Another question I could not find an answer to:
> is there a best-practice / recommendation for pushing several million
> documents into a new index?
> 
> I'm currently splitting my documents into batches of 10,000 json-line
> payloads into the update request handler, with commit set to 'true'
> (yes, for each of the batches).
> I'm using commit since that got me stable 'QTime' around ~2100, without
> committing every batch, the QTime will degrade ten-fold by the time I
> sent somewhere around 1,000,000 documents.
> This will steadily climb, so after I sent all 46M documents I end up
> with QTime values about 40,000 in case I don't commit every batch
> immediately.
> 
> Since I cannot find anything in my mails, I wanted to search the
> solr-user archives but, as far as I can tell: there is no such thing.
> Maybe I can't see it or just glossed over it, but is there no searchable
> index of solr-user? Any hints?
> 
> regards,
> -1
