Re: Details on why ConcurrentUpdateSolrServer is recommended for maximum index performance

Erick Erickson Wed, 10 Dec 2014 15:12:48 -0800

If you don't use CUSS (ConcurrentUpdateSolrServer), the process is this:
1> assemble a packet of docs
2> send it to Solr
3> wait until Solr is done indexing it
4> start assembling the next packet.


So two things are going on here:
1> the client is sitting idle while Solr
     is indexing
and
2> Solr is sitting idle while the client is
     assembling the next packet.

So CUSS will do something like this:
1> assemble a packet for Solr
2> hand the actual transmission to Solr
     off to a background thread and
     immediately go back to <1>.

Basically, CUSS is doing async processing.

The parameters to the CUSS constructor
govern how many threads can be sending
docs to Solr and how many packets are
queued up for each of those threads.
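
Here is a minimal sketch of that hand-off using only the JDK, not SolrJ.
The class name, the PacketSender interface and the indexAll method are all
made up for illustration; PacketSender just stands in for the HTTP request
to Solr. The queueSize and threadCount arguments play the role of the
constructor parameters described above (as far as I know, the SolrJ 4.x
constructor is ConcurrentUpdateSolrServer(solrServerUrl, queueSize,
threadCount)). This is not how CUSS is actually implemented, just the
same idea in a few lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncIndexingSketch {

    /** Hypothetical stand-in for the round trip to Solr; not a SolrJ type. */
    interface PacketSender {
        void send(List<String> packet) throws Exception;
    }

    /**
     * Hands each packet to one of threadCount background threads through a
     * queue holding at most queueSize packets, and returns how many packets
     * were sent successfully.
     */
    static int indexAll(List<List<String>> packets, int queueSize,
                        int threadCount, PacketSender sender)
            throws InterruptedException {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(queueSize);
        // Marker object telling a worker to stop; compared by identity only.
        List<String> poison = new ArrayList<>();
        AtomicInteger sent = new AtomicInteger();

        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread worker = new Thread(() -> {
                try {
                    while (true) {
                        List<String> packet = queue.take();
                        if (packet == poison) {
                            return;
                        }
                        try {
                            sender.send(packet); // the slow part, off the main thread
                            sent.incrementAndGet();
                        } catch (Exception e) {
                            System.err.println("send failed: " + e);
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            workers.add(worker);
        }

        // The "client" loop: it blocks only when the queue is already full.
        for (List<String> packet : packets) {
            queue.put(packet);
        }
        // One stop marker per worker, then wait for the queue to drain.
        for (int i = 0; i < threadCount; i++) {
            queue.put(poison);
        }
        for (Thread worker : workers) {
            worker.join();
        }
        return sent.get();
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> packets = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            packets.add(Arrays.asList("doc-" + i));
        }
        // Simulate a 10 ms round trip per packet.
        int sent = indexAll(packets, 4, 2, packet -> Thread.sleep(10));
        System.out.println("packets sent: " + sent);
    }
}
```

With a queue that is too small, the put() call blocks and the client goes
back to waiting on Solr; with too few threads, Solr goes back to waiting
on the client. Tuning those two numbers is the whole game.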

You know you've reached your limit when
you see the CPU on your Solr instance stay
high; I usually look for 90%+.


Hope that helps,
Erick


On Wed, Dec 10, 2014 at 1:12 PM, Tom Burton-West <tburt...@umich.edu> wrote:
> Hello all,
>
> In the example schema.xml for Solr 4.10.2 this comment is listed under the
> "PERFORMANCE NOTE"
>
> "For maximum indexing performance, use the ConcurrentUpdateSolrServer
>     java client."
>
> Is there some documentation somewhere that explains why this will maximize
> indexing performance?
>
> In particular, I have very large documents on the order of 700KB, so I'm
> interested to determine if there is a significant advantage to using the
> ConcurrentUpdateSolrServer in my use case.
>
> A related question is how to use ConcurrentUpdateSolrServer with XML
> documents.
>
> I have very large XML documents, and the examples I see all build documents
> by adding fields in Java code.  Is there an example that actually reads XML
> files from the file system?
>
> Tom
