I absolutely agree and I even read the NRT page before posting this question.
The thing that baffles me is this: doing a commit after each add kills performance. On the other hand, when I use commitWithin and specify an (absurd) 1 ms delay, I expect the behavior to be functionally equivalent to making a commit. Since there is no magic in the world, I am trying to understand what price I am actually paying when using the commitWithin feature: on the one hand it commits almost immediately, on the other hand it performs wonderfully. Where is the catch?

-----Original Message-----
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: Wednesday, February 12, 2014 17:00
To: solr-user
Subject: Re: Solr perfromance with commitWithin seesm too good to be true. I am afraid I am missing something

Doing a standard commit after every document is a Solr anti-pattern.

commitWithin is a “near-realtime” commit in recent versions of Solr and not a standard commit.

https://cwiki.apache.org/confluence/display/solr/Near+Real+Time+Searching

- Mark

http://about.me/markrmiller

On Feb 12, 2014, at 9:52 AM, Pisarev, Vitaliy <vitaliy.pisa...@hp.com> wrote:

> I am running a very simple performance experiment where I post 2000 documents to my application, which in turn persists them to a relational DB and sends them to Solr for indexing (synchronously, in the same request).
> I am testing 3 use cases:
>
> 1. No indexing at all - ~45 sec to post 2000 documents
> 2. Indexing included - commit after each add. ~8 minutes (!) to post and index 2000 documents
> 3. Indexing included - commitWithin 1ms. ~55 seconds (!) to post and index 2000 documents
>
> The 3rd result does not make any sense, I would expect the behavior to be similar to the one in point 2. At first I thought that the documents were not really committed but I could actually see them being added by executing some queries during the experiment (via the solr web UI).
> I am worried that I am missing something very big.
> The code I use for point 2:
>
> SolrInputDocument doc = // get doc
> SolrServer solrConnection = // get connection
> solrConnection.add(doc);
> solrConnection.commit();
>
> Whereas the code for point 3:
>
> SolrInputDocument doc = // get doc
> SolrServer solrConnection = // get connection
> solrConnection.add(doc, 1); // According to API documentation I understand there is no need to explicitly call commit with this API
>
> Is it possible that committing after each add will degrade performance by a factor of 40?
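
As a rough sanity check on the "factor of 40" in the quoted message, here is a minimal sketch that subtracts the no-indexing baseline from each run and divides by the number of documents. The class and method names are made up for illustration, and the inputs are just the approximate timings quoted above (~45 sec, ~8 minutes, ~55 seconds for 2000 documents), so the result is only as accurate as those figures.

```java
public class CommitOverhead {
    // Approximate timings from the quoted message, for 2000 documents.
    public static final int DOCS = 2000;
    public static final double NO_INDEXING_SEC = 45.0;         // case 1
    public static final double COMMIT_PER_ADD_SEC = 8 * 60.0;  // case 2
    public static final double COMMIT_WITHIN_SEC = 55.0;       // case 3

    /** Extra time per document, in milliseconds, compared to the no-indexing run. */
    public static double overheadPerDocMs(double totalSec) {
        return (totalSec - NO_INDEXING_SEC) * 1000.0 / DOCS;
    }

    /** How many times larger the per-document overhead of case 2 is than that of case 3. */
    public static double overheadRatio() {
        return overheadPerDocMs(COMMIT_PER_ADD_SEC) / overheadPerDocMs(COMMIT_WITHIN_SEC);
    }

    public static void main(String[] args) {
        System.out.printf("commit after each add: %.1f ms/doc%n",
                overheadPerDocMs(COMMIT_PER_ADD_SEC));
        System.out.printf("commitWithin 1ms:      %.1f ms/doc%n",
                overheadPerDocMs(COMMIT_WITHIN_SEC));
        System.out.printf("overhead ratio:        %.1f%n", overheadRatio());
    }
}
```

With these inputs the indexing overhead comes to about 217.5 ms per document with an explicit commit and about 5 ms per document with commitWithin, a ratio of roughly 43. The end-to-end times differ by a smaller factor (480 s vs 55 s, roughly 9x), because the 45-second baseline is included in both.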