I absolutely agree and I even read the NRT page before posting this question.

The thing that baffles me is this:

Doing a commit after each add kills the performance.
On the other hand, when I use commitWithin and specify an (absurd) 1ms delay, 
I expect the behavior to be functionally equivalent to committing after 
each add.

Seeing that there is no magic in the world, I am trying to understand what 
price I am actually paying when using the commitWithin feature. On the one 
hand it commits almost immediately; on the other hand it performs wonderfully. 
Where is the catch?


-----Original Message-----
From: Mark Miller [mailto:markrmil...@gmail.com] 
Sent: Wednesday, 12 February 2014 17:00
To: solr-user
Subject: Re: Solr perfromance with commitWithin seesm too good to be true. I am 
afraid I am missing something

Doing a standard commit after every document is a Solr anti-pattern.

commitWithin is a “near-realtime” commit in recent versions of Solr and not a 
standard commit.

https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching

- Mark

http://about.me/markrmiller

On Feb 12, 2014, at 9:52 AM, Pisarev, Vitaliy <vitaliy.pisa...@hp.com> wrote:

> I am running a very simple performance experiment where I post 2000 documents 
> to my application, which in turn persists them to a relational DB and sends 
> them to Solr for indexing (synchronously, in the same request).
> I am testing 3 use cases:
> 
>  1.  No indexing at all - ~45 sec to post 2000 documents
>  2.  Indexing included - commit after each add. ~8 minutes (!) to post and 
>      index 2000 documents
>  3.  Indexing included - commitWithin 1ms. ~55 seconds (!) to post and index 
>      2000 documents
> 
> The 3rd result does not make any sense; I would expect the behavior to be 
> similar to the one in point 2. At first I thought that the documents were not 
> really committed, but I could actually see them being added by executing some 
> queries during the experiment (via the Solr web UI).
> 
> I am worried that I am missing something very big. The code I use for point 2:
> 
>     SolrInputDocument doc = // get doc
>     SolrServer solrConnection = // get connection
>     solrConnection.add(doc);
>     solrConnection.commit();
> 
> Whereas the code for point 3:
> 
>     SolrInputDocument doc = // get doc
>     SolrServer solrConnection = // get connection
>     solrConnection.add(doc, 1); // According to the API documentation, there is
>                                 // no need to explicitly call commit with this API
> 
> Is it possible that committing after each add will degrade performance by a 
> factor of 40?
> 
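
As a rough check on the "factor of 40" in the quoted message, the sketch below 
simply redoes the arithmetic on the three reported timings. It assumes the 
45-second run without indexing is a fair baseline to subtract from the other 
two, which the thread does not establish. The class and method names are made 
up for illustration.

```java
import java.util.Locale;

public class CommitOverhead {
    // Timings reported in the quoted message, for 2000 documents.
    static final int DOCS = 2000;
    static final double NO_INDEXING_SEC = 45.0;        // case 1
    static final double COMMIT_PER_ADD_SEC = 8 * 60.0; // case 2, ~8 minutes
    static final double COMMIT_WITHIN_SEC = 55.0;      // case 3

    /** Extra time per document over the no-indexing baseline, in milliseconds. */
    static double overheadPerDocMs(double totalSec) {
        return (totalSec - NO_INDEXING_SEC) * 1000.0 / DOCS;
    }

    /** How many times larger the commit-per-add overhead is than the commitWithin overhead. */
    static double overheadRatio() {
        return overheadPerDocMs(COMMIT_PER_ADD_SEC) / overheadPerDocMs(COMMIT_WITHIN_SEC);
    }

    public static void main(String[] args) {
        System.out.println(String.format(Locale.ROOT,
                "commit after each add: %.1f ms/doc", overheadPerDocMs(COMMIT_PER_ADD_SEC)));
        System.out.println(String.format(Locale.ROOT,
                "commitWithin 1ms:      %.1f ms/doc", overheadPerDocMs(COMMIT_WITHIN_SEC)));
        System.out.println(String.format(Locale.ROOT,
                "overhead ratio:        %.1fx", overheadRatio()));
    }
}
```

Under that assumption the indexing overhead is about 217.5 ms per document 
with a commit after each add, against about 5 ms per document with 
commitWithin, so a ratio of roughly 40 is consistent with the reported numbers.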
