Hello,
I am using the add() method that takes a Collection of
SolrInputDocuments instead of the add() that takes a single document.
Is sending a group of documents in one call what Solr terminology
calls "batching"? If so, I am already doing it (with some additional
logic in my code). What I still don't understand is how big a batch
can be. How do I find the most suitable number of documents to send
at a time?
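
To make the question concrete, here is a minimal sketch of the kind of
batching logic meant above. It is only an illustration: BatchingSketch,
sendInBatches and the sender callback are hypothetical names, not part
of SolrJ, and the batch size is left as a parameter because the right
value is exactly what is being asked.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper (not part of SolrJ): groups documents into
// fixed-size batches and hands each batch to a sender callback.
final class BatchingSketch {

    // Sends docs in batches of at most batchSize documents and
    // returns the number of batches handed to the sender.
    static <T> int sendInBatches(Iterable<T> docs, int batchSize,
                                 Consumer<Collection<T>> sender) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> batch = new ArrayList<>(batchSize);
        int sent = 0;
        for (T doc : docs) {
            batch.add(doc);
            if (batch.size() == batchSize) {
                // In real code this would wrap a call such as
                // server.add(batch), including its checked exceptions.
                sender.accept(batch);
                // Start a fresh list so the sender may keep the old one.
                batch = new ArrayList<>(batchSize);
                sent++;
            }
        }
        if (!batch.isEmpty()) {
            // Flush the final, possibly smaller, batch.
            sender.accept(batch);
            sent++;
        }
        return sent;
    }
}
```

With a helper like this, trying different batch sizes is a matter of
changing one argument and timing the run against the same documents.
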
Also, if I go with multi-threaded CommonsHttpSolrServer, is the
optimal number of threads equal to N on an N-core processor?
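
For the threading part, a rough sketch of what is being considered is
below. ParallelSendSketch, sendAll and the sender callback are
hypothetical names, not SolrJ API. The thread count is a parameter and
one thread per core is only a starting guess: the quoted reply below
points out that when each thread waits on its own request, more than N
threads may be needed to cover the request latency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical helper (not part of SolrJ): sends pre-built batches
// from a fixed pool of worker threads.
final class ParallelSendSketch {

    // A starting guess only: one worker per available core.
    static int defaultThreadCount() {
        return Runtime.getRuntime().availableProcessors();
    }

    // Runs sender on every batch using `threads` workers and
    // returns the number of batches that completed.
    static <T> int sendAll(List<List<T>> batches, int threads,
                           Consumer<List<T>> sender) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (List<T> batch : batches) {
                // In real code the sender would wrap server.add(batch).
                pending.add(pool.submit(() -> sender.accept(batch)));
            }
            int done = 0;
            for (Future<?> f : pending) {
                try {
                    f.get(); // wait for this batch to finish
                    done++;
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException("a batch failed", e.getCause());
                }
            }
            return done;
        } finally {
            pool.shutdown(); // stop accepting work; queued tasks still run
        }
    }
}
```

Timing sendAll with defaultThreadCount() and then with a few larger
values would show whether extra threads help for a given setup.
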
Thanks.

2010/1/12 Yonik Seeley <yo...@lucidimagination.com>:
> On Tue, Jan 12, 2010 at 3:48 AM, Smith G <gudumba.sm...@gmail.com> wrote:
>> Hello All,
>> I am trying to find a better approach (performance-wise) to index
>> documents. Document count is approximately a million+.
>> First, I thought of writing multiple threads using
>> CommonsHttpSolrServer to submit documents. But later I found out
>> StreamingUpdateSolrServer, which says we can forget about batching.
>>
>> 1) We can pass a thread-count parameter to StreamingUpdateSolrServer;
>> does it serve exactly the same purpose as writing multiple threads
>> using CommonsHttpSolrServer?
>
> Not quite - streaming update solr server batches documents on the fly.
>  So if you have a server with N CPUs, you should only need N threads
> to saturate it.  Using multiple threads with CommonsHttpSolrServer,
> it's still one document per request (unless you do your own batching)
> and there is still latency between request and response, meaning it
> would take more threads to fill in that latency.
>
> -Yonik
> http://www.lucidimagination.com
>
