Batching documents to Solr gives a significant throughput improvement (assuming SolrJ). You can send, say, 1,000 docs per batch and, if a batch fails, re-process that list to find the bad doc.
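A minimal sketch of that batch-then-retry idea, under stated assumptions: `BatchSender` is a hypothetical stand-in for whatever SolrJ call you use to add a collection of documents, and the class and method names are made up for illustration. It also assumes re-sending a document is harmless (for example, because each doc has a unique key and a re-add simply overwrites it).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class BatchWithFallback {

    /** Hypothetical stand-in for the client call that adds a list of docs. */
    interface BatchSender<D> {
        void send(List<D> docs) throws Exception;
    }

    /**
     * Sends docs in batches of batchSize. If a batch is rejected, the same
     * batch is re-sent one document at a time so the bad docs can be found.
     * Returns the documents that were rejected individually.
     */
    static <D> List<D> indexAll(List<D> docs, int batchSize, BatchSender<D> sender) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<D> rejected = new ArrayList<>();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            List<D> batch = docs.subList(start, end);
            try {
                sender.send(batch);
            } catch (Exception batchFailure) {
                // Fall back to single-document sends for this batch only.
                for (D doc : batch) {
                    try {
                        sender.send(Collections.singletonList(doc));
                    } catch (Exception docFailure) {
                        rejected.add(doc);
                    }
                }
            }
        }
        return rejected;
    }
}
```

The slow one-at-a-time path only runs for batches that actually fail, so the common case keeps the batching throughput.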
But as Jack says, Solr could do better here.

Best,
Erick

On Sat, Jan 10, 2015 at 3:46 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:
> Sending individual documents will give you absolute control - just make
> sure not to "commit" on each document sent since that would really slow
> down indexing.
>
> You could also send smaller batches, like 5 to 20 documents, to balance
> between fine control and performance. It also depends on your document size
> - small documents should be collected into larger batches, but large
> documents should be sent in smaller batches. Sending a total of 2K to 20K
> bytes of data at a time is probably a good target. Smaller than 2K
> incurs more overhead, and more than 50K or 100K may simply overload the
> server rather than optimize performance.
>
> -- Jack Krupansky
>
> On Sat, Jan 10, 2015 at 6:02 AM, SolrUser1543 <osta...@gmail.com> wrote:
>
>> Would it be a good solution to index a single document instead of bulk?
>> In this case I will know the status of each message.
>>
>> What is the recommendation in this case: bulk vs. single?
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/ignoring-bad-documents-during-index-tp4176947p4178546.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
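
Jack's sizing advice (aim for a byte budget per request rather than a fixed document count) could be sketched roughly as follows. This is only an illustration: documents are modeled as plain strings, size is approximated by UTF-8 length, and `SizeBasedBatcher`, `batchByBytes`, and `maxBytes` are hypothetical names, not part of Solr or SolrJ. Something like 20 * 1024 would match the upper end of the 2K to 20K range mentioned in the thread.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class SizeBasedBatcher {

    /**
     * Groups docs into consecutive batches whose total UTF-8 size stays at or
     * below maxBytes. A single doc larger than maxBytes gets a batch of its own.
     */
    static List<List<String>> batchByBytes(List<String> docs, int maxBytes) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        int currentBytes = 0;
        for (String doc : docs) {
            int size = doc.getBytes(StandardCharsets.UTF_8).length;
            // Close the current batch if adding this doc would exceed the budget.
            if (!current.isEmpty() && currentBytes + size > maxBytes) {
                batches.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(doc);
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            batches.add(current);
        }
        return batches;
    }
}
```

With this grouping, many small documents end up together in one request while large documents travel in small batches or alone, which is the balance described in the quoted message.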