Typically, people set their (hard) autocommit settings in solrconfig.xml and forget about them. I usually use a time-based trigger and don’t use a document count as a trigger.
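
A time-based hard autocommit in solrconfig.xml might look roughly like the sketch below. The 15000 ms interval is only an illustrative value, and openSearcher=false is an assumption (it keeps hard commits from opening a new searcher); adjust both to your setup.

```xml
<!-- Sketch only: goes inside the <config> element of solrconfig.xml -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <!-- Time-based trigger in milliseconds; 15000 is an illustrative value -->
    <maxTime>15000</maxTime>
    <!-- Assumption: do not open a new searcher on each hard commit -->
    <openSearcher>false</openSearcher>
    <!-- No <maxDocs> element: the document count is not used as a trigger -->
  </autoCommit>
</updateHandler>
```

With something like this in place, the indexing client would not need to pass commit=true on each batch.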
If you were waiting until the end of your batch run (all 46M docs) to issue a commit, that’s an anti-pattern. Until you do a hard commit, all the incoming documents are held in the transaction log, see: https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Setting the autocommit interval to, say, 15 seconds should give a flatter response time.

For the Solr mailing list archives, see: http://lucene.apache.org/solr/community.html#mailing-lists-irc

Best,
Erick

> On Feb 18, 2019, at 10:03 AM, David '-1' Schmid <gdk...@gmail.com> wrote:
>
> Hello!
>
> Another question I could not find an answer to:
> is there a best-practice / recommendation for pushing several million
> documents into a new index?
>
> I'm currently splitting my documents into batches of 10,000 json-line
> payloads into the update request handler, with commit set to 'true'
> (yes, for each of the batches).
> I'm using commit since that got me stable 'QTime' around ~2100; without
> committing every batch, the QTime will degrade ten-fold by the time I
> sent somewhere around 1,000,000 documents.
> This will steadily climb, so after I sent all 46M documents I end up
> with QTime values about 40,000 in case I don't commit every batch
> immediately.
>
> Since I cannot find anything in my mails, I wanted to search the
> solr-user archives but, as far as I can tell: there is no such thing.
> Maybe I can't see it or just glossed over it, but is there no searchable
> index of solr-user? Any hints?
>
> regards,
> -1