Hi all,

I am looking to improve indexing speed when loading many documents as part of an import. I am using the SolrJ client and currently add the documents one by one using HttpSolrClient and its method add(SolrInputDocument doc, int commitWithinMs).
My first step would be to switch to add(Collection<SolrInputDocument> docs, int commitWithinMs), which I expect would already improve performance, since many documents would then be sent per request instead of one.

Does it matter which method I use? Besides the method taking a Collection<SolrInputDocument>, there is also one that takes an Iterator<SolrInputDocument>. And what about ConcurrentUpdateSolrClient? Should I use it for bulk indexing instead of HttpSolrClient?

We are currently on Solr 5.5.0 and do not run SolrCloud, i.e. there is only a single instance. Indexing 39657 documents (which results in a core size of approx. 127 MB) took about 10 minutes with the one-by-one approach, i.e. roughly 66 documents per second.

Best regards and thanks for any suggestions,
Sebastian Riemer
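
P.S. In case it helps the discussion, here is a minimal sketch of the batching I have in mind. BatchSender and sendInBatches are hypothetical names of mine and not part of SolrJ, and the sink is kept generic so that the sketch compiles without SolrJ on the classpath. It only illustrates the grouping logic and says nothing about how much faster it would actually be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not part of SolrJ): hands items to a sink in fixed-size batches. */
final class BatchSender {
    private BatchSender() {}

    /**
     * Passes the items to the sink in batches of at most batchSize
     * and returns the number of batches that were sent.
     */
    static <T> int sendInBatches(Iterable<T> items, int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int batches = 0;
        for (T item : items) {
            batch.add(item);
            if (batch.size() == batchSize) {
                sink.accept(batch);
                batches++;
                // Start a fresh list so the sink may safely keep the old one.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            // Send the final, partially filled batch.
            sink.accept(batch);
            batches++;
        }
        return batches;
    }
}
```

In the import, the sink would wrap a call to client.add(batch, commitWithinMs) and handle its checked SolrServerException and IOException. With a batch size of 1000 (an arbitrary value I have not measured), the 39657 documents would go out in 40 requests instead of 39657.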