My apologies Santosh. I added that comment a few releases back based on a misunderstanding I've only recently been disabused of. I will correct it.
Anyway, Shawn's explanation above is correct. The queueSize parameter doesn't control batching, as he clarified.

Sorry for the trouble.

Best,
Jason

On Wed, Feb 21, 2018 at 8:50 PM, Santosh Narayan <santosh.narayan....@gmail.com> wrote:

> Thanks for the explanation Shawn. Very helpful. I think I got misled by
> the JavaDoc text for *ConcurrentUpdateSolrClient.Builder.withQueueSize*:
>
>     /**
>      * The number of documents to batch together before sending to Solr. If
>      * not set, this defaults to 10.
>      */
>     public Builder withQueueSize(int queueSize) {
>       if (queueSize <= 0) {
>         throw new IllegalArgumentException("queueSize must be a positive integer.");
>       }
>       this.queueSize = queueSize;
>       return this;
>     }
>
> On Thu, Feb 22, 2018 at 9:41 AM, Shawn Heisey <apa...@elyograg.org> wrote:
>
>> On 2/21/2018 7:41 AM, Santosh Narayan wrote:
>> > Maybe it is my understanding of the documentation. As per the JavaDoc,
>> > ConcurrentUpdateSolrClient buffers all added documents and writes them
>> > into open HTTP connections.
>> >
>> > So I thought that this class would buffer documents on the client side
>> > until queueSize is reached and then send all the cached documents
>> > together in one HTTP request. Is this not the case?
>>
>> That's not how it's designed.
>>
>> What ConcurrentUpdateSolrClient does differently than HttpSolrClient or
>> CloudSolrClient is return control immediately to your program when you
>> send an update, and begin processing that update in the background. If
>> you send a LOT of updates very quickly, then the queue will get larger,
>> and will typically be processed in parallel by multiple threads. The
>> client won't wait for the queue to fill. Processing of the first update
>> you send should begin right after you add it.
>>
>> Something to consider: because control is returned to your program
>> immediately, and the response is always a success, your program will
>> never be informed about any problems with your adds when you use the
>> concurrent client. The concurrent client is a great choice for initial
>> bulk indexing, because it offers multi-threaded indexing without any
>> need to handle the threads yourself. But you don't get any kind of
>> error handling.
>>
>> Thanks,
>> Shawn
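
For readers who want to see the behavior Shawn describes in code, here is a minimal SolrJ sketch of bulk indexing with ConcurrentUpdateSolrClient. The Solr URL, collection name, field names, and the queue/thread sizes below are placeholder values for illustration, not anything recommended in this thread.

    import org.apache.solr.client.solrj.SolrServerException;
    import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    import java.io.IOException;

    public class BulkIndexSketch {
        public static void main(String[] args) throws SolrServerException, IOException {
            // Placeholder URL and collection name; substitute your own.
            String solrUrl = "http://localhost:8983/solr/mycollection";

            // queueSize bounds the internal queue of pending updates; threadCount
            // controls how many background threads drain it. Neither setting batches
            // documents into a single request the way the old JavaDoc implied.
            try (ConcurrentUpdateSolrClient client =
                     new ConcurrentUpdateSolrClient.Builder(solrUrl)
                         .withQueueSize(100)
                         .withThreadCount(4)
                         .build()) {

                for (int i = 0; i < 100_000; i++) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", Integer.toString(i));
                    doc.addField("title_s", "document " + i);
                    // add() returns immediately and the update is processed in the
                    // background; failures are logged by the client rather than
                    // thrown back to this loop.
                    client.add(doc);
                }

                // Wait for the background threads to drain the queue, then commit.
                client.blockUntilFinished();
                client.commit();
            }
        }
    }

The blockUntilFinished() call is why the commit comes last: it waits for the background threads to finish sending everything that is still queued.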