Thanks for the explanation, Shawn. Very helpful. I think I was misled by the
JavaDoc text for *ConcurrentUpdateSolrClient.Builder.withQueueSize*:

    /**
     * The number of documents to batch together before sending to Solr. If not set,
     * this defaults to 10.
     */
    public Builder withQueueSize(int queueSize) {
      if (queueSize <= 0) {
        throw new IllegalArgumentException("queueSize must be a positive integer.");
      }
      this.queueSize = queueSize;
      return this;
    }
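For context, this is roughly the builder call that comment sits on. The snippet below
is only an illustrative sketch, not my actual code; the URL, queue size, and thread
count are placeholder values. Reading "batch together" in the JavaDoc, I had assumed
nothing would be sent until queueSize documents had accumulated on the client:

    // Illustrative only: placeholder URL and sizes.
    ConcurrentUpdateSolrClient client =
        new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/mycollection")
            .withQueueSize(50)     // I read this as "send in batches of 50"
            .withThreadCount(4)
            .build();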
On Thu, Feb 22, 2018 at 9:41 AM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 2/21/2018 7:41 AM, Santosh Narayan wrote:
> > May be it is my understanding of the documentation. As per the
> > JavaDoc, ConcurrentUpdateSolrClient buffers all added documents and
> > writes them into open HTTP connections.
> >
> > So I thought that this class would buffer documents in the client side
> > itself till the QueueSize is reached and then send all the cached
> > documents together in one HTTP request. Is this not the case?
>
> That's not how it's designed.
>
> What ConcurrentUpdateSolrClient does differently than HttpSolrClient or
> CloudSolrClient is return control immediately to your program when you
> send an update, and begin processing that update in the background. If
> you send a LOT of updates very quickly, then the queue will get larger,
> and will typically be processed in parallel by multiple threads. The
> client won't wait for the queue to fill. Processing of the first update
> you send should begin right after you add it.
>
> Something to consider: Because control is returned to your program
> immediately, and the response is always a success, your program will
> never be informed about any problems with your adds when you use the
> concurrent client. The concurrent client is a great choice for initial
> bulk indexing, because it offers multi-threaded indexing without any
> need to handle the threads yourself. But you don't get any kind of
> error handling.
>
> Thanks,
> Shawn
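To check my understanding of the behaviour Shawn describes, here is a small sketch of
how I would now expect to use the client for bulk indexing. The class and method names
(ConcurrentUpdateSolrClient.Builder, add, blockUntilFinished, commit) are from SolrJ;
the URL, queue size, thread count, and document count are placeholders:

    import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class BulkIndexSketch {
        public static void main(String[] args) throws Exception {
            // Placeholder URL and settings.
            try (ConcurrentUpdateSolrClient client =
                     new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/mycollection")
                         .withQueueSize(50)    // capacity of the internal queue, not a client-side batch size
                         .withThreadCount(4)   // background threads draining the queue in parallel
                         .build()) {

                for (int i = 0; i < 100000; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", Integer.toString(i));
                    // add() returns as soon as the document is queued; the background
                    // threads start sending right away rather than waiting for the
                    // queue to fill, and indexing errors are not reported back here.
                    client.add(doc);
                }

                // Wait for the background threads to drain the queue, then commit.
                client.blockUntilFinished();
                client.commit();
            }
        }
    }

Given the lack of error reporting, it sounds like anything that needs to react to
failed adds should use HttpSolrClient or CloudSolrClient instead, or (I believe)
override ConcurrentUpdateSolrClient's handleError.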