Thanks for the explanation, Shawn. Very helpful. I think I was misled by the
JavaDoc text for *ConcurrentUpdateSolrClient.Builder.withQueueSize*:

    /**
     * The number of documents to batch together before sending to Solr. If not set,
     * this defaults to 10.
     */
    public Builder withQueueSize(int queueSize) {
      if (queueSize <= 0) {
        throw new IllegalArgumentException("queueSize must be a positive integer.");
      }
      this.queueSize = queueSize;
      return this;
    }
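For context, this is roughly the builder call that comment sits on. The snippet below
is only an illustrative sketch, not my actual code; the URL, queue size, and thread
count are placeholder values. Reading "batch together" in the JavaDoc, I had assumed
nothing would be sent until queueSize documents had accumulated on the client:

    // Illustrative only: placeholder URL and sizes.
    ConcurrentUpdateSolrClient client =
        new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/mycollection")
            .withQueueSize(50)     // I read this as "send in batches of 50"
            .withThreadCount(4)
            .build();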
On Thu, Feb 22, 2018 at 9:41 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 2/21/2018 7:41 AM, Santosh Narayan wrote:
> > May be it is my understanding of the documentation. As per the
> > JavaDoc, ConcurrentUpdateSolrClient buffers all added documents and
> > writes them into open HTTP connections.
> >
> > So I thought that this class would buffer documents in the client side
> > itself till the QueueSize is reached and then send all the cached
> > documents together in one HTTP request. Is this not the case?
>
> That's not how it's designed.
>
> What ConcurrentUpdateSolrClient does differently than HttpSolrClient or
> CloudSolrClient is return control immediately to your program when you
> send an update, and begin processing that update in the background. If
> you send a LOT of updates very quickly, then the queue will get larger,
> and will typically be processed in parallel by multiple threads. The
> client won't wait for the queue to fill. Processing of the first update
> you send should begin right after you add it.
>
> Something to consider: Because control is returned to your program
> immediately, and the response is always a success, your program will
> never be informed about any problems with your adds when you use the
> concurrent client. The concurrent client is a great choice for initial
> bulk indexing, because it offers multi-threaded indexing without any
> need to handle the threads yourself. But you don't get any kind of
> error handling.
>
> Thanks,
> Shawn
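To check my understanding of the behaviour Shawn describes, here is a small sketch of
how I would now expect to use the client for bulk indexing. The class and method names
(ConcurrentUpdateSolrClient.Builder, add, blockUntilFinished, commit) are from SolrJ;
the URL, queue size, thread count, and document count are placeholders:

    import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BulkIndexSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder URL and settings.
            try (ConcurrentUpdateSolrClient client =
                     new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/mycollection")
                         .withQueueSize(50)    // capacity of the internal queue, not a client-side batch size
                         .withThreadCount(4)   // background threads draining the queue in parallel
                         .build()) {

                for (int i = 0; i < 100000; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", Integer.toString(i));
                    // add() returns as soon as the document is queued; the background
                    // threads start sending right away rather than waiting for the
                    // queue to fill, and indexing errors are not reported back here.
                    client.add(doc);
                }

                // Wait for the background threads to drain the queue, then commit.
                client.blockUntilFinished();
                client.commit();
            }
        }
    }

Given the lack of error reporting, it sounds like anything that needs to react to
failed adds should use HttpSolrClient or CloudSolrClient instead, or (I believe)
override ConcurrentUpdateSolrClient's handleError.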