Hello,

I am using the add() method that receives a Collection of SolrInputDocuments instead of the add() that receives a single document. Is sending a group of documents what is called "batching" in Solr terminology? If yes, then I am already doing it (by including additional logic in my code). But the main point I don't get is how big a batch can be. How do I find the most suitable number of documents to send at a time? Also, if I go for multi-threaded CommonsHttpSolrServer, should the number of threads equal N on an N-core processor to be optimal?

Thanks.
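For concreteness, here is a minimal sketch of the kind of client-side batching logic described above. It is plain Java with no SolrJ dependency: the class name BatchingSketch, the method sendInBatches, and the batch size of 1000 are all hypothetical placeholders, not recommended values. With SolrJ, the sender callback would presumably wrap a call to add() with the Collection of SolrInputDocuments and handle its checked exceptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class BatchingSketch {

    // Splits docs into consecutive batches of at most batchSize elements
    // and hands each batch to sender. Returns the number of batches sent.
    static <T> int sendInBatches(List<T> docs, int batchSize, Consumer<List<T>> sender) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            // Copy the slice so the sender gets an independent list.
            sender.accept(new ArrayList<>(docs.subList(start, end)));
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> docs = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            docs.add(i); // stand-in for SolrInputDocument instances
        }
        // 1000 is an arbitrary placeholder; the right size has to be measured.
        int sent = sendInBatches(docs, 1000,
                batch -> System.out.println("sending " + batch.size() + " docs"));
        System.out.println(sent + " batches");
    }
}
```

Keeping the batch size as a parameter makes it easy to time a few different values against the real server rather than guessing.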
2010/1/12 Yonik Seeley <yo...@lucidimagination.com>:
> On Tue, Jan 12, 2010 at 3:48 AM, Smith G <gudumba.sm...@gmail.com> wrote:
>> Hello All,
>> I am trying to find a better approach (performance-wise)
>> to index documents. Document count is approximately a million+.
>> First, I thought of writing multiple threads using
>> CommonsHttpSolrServer to submit documents. But later I found out
>> StreamingUpdateSolrServer, which says we can forget about batching.
>>
>> 1) We can pass thread-count parameter to StreamingUpdateSolrServer,
>> does it exactly serve the same as writing multiple threads using
>> CommonsHttpSolrServer?
>
> Not quite - streaming update solr server batches documents on the fly.
> So if you have a server with N CPUs, you should only need N threads
> to saturate it. Using multiple threads with CommonsHttpSolrServer,
> it's still one document per request (unless you do your own batching)
> and there is still latency between request and response, meaning it
> would take more threads to fill in that latency.
>
> -Yonik
> http://www.lucidimagination.com
>
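
The point about threads and latency in the quoted reply can be sketched as a fixed-size pool in which each task sends one batch. This is only an illustration in plain Java: ParallelBatchSketch and sendParallel are hypothetical names, the sender callback stands in for whatever actually posts a batch to Solr, and the thread count is left as a parameter because the thread only says it may need to exceed the number of CPUs when each request waits on a response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

final class ParallelBatchSketch {

    // Sends every batch through sender using a pool of the given size.
    // Returns the total number of documents handed to sender.
    static <T> int sendParallel(List<List<T>> batches, int threads, Consumer<List<T>> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger sentDocs = new AtomicInteger();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> batch : batches) {
                pending.add(pool.submit(() -> {
                    sender.accept(batch); // one request per batch
                    sentDocs.addAndGet(batch.size());
                }));
            }
            // Wait for all tasks so that a failed send is not silently lost.
            for (Future<?> f : pending) {
                f.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sending", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a batch failed to send", e.getCause());
        } finally {
            pool.shutdown();
        }
        return sentDocs.get();
    }
}
```

Starting with threads equal to Runtime.getRuntime().availableProcessors() and then raising it while watching throughput is one way to test whether the extra threads actually help.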