Hi, I think Timothy is right about what Lisheng is really after, which is consistency.
I agree with what Timothy is implying here - chances of search being
inconsistent are very, very small. I'm guessing Lisheng is trying to solve a
problem he doesn't actually have yet?

Also, think about a non-SolrCloud solution. What happens when a user pages
through results? Typically that just re-runs the same query, but with a
different page offset. What happens if between page 1 and page 2 the index
changes and a searcher is reopened? Same sort of problem can happen, right?
Yet, in a few hundred client engagements involving Solr or ElasticSearch I
don't recall this ever being an issue.

Otis
--
Solr & ElasticSearch Support
http://sematext.com/

On Thu, Apr 11, 2013 at 8:13 PM, Timothy Potter <thelabd...@gmail.com> wrote:

> Hmmm ... I was following this discussion but then got confused when Lisheng
> said to change Solr to "compromise consistency in order to increase
> availability" when your concern is "how long replica is behind leader".
> Seems you want more consistency vs. less in this case? One of the reasons
> behind Solr's leader election approach is to achieve low-latency eventual
> consistency (Mark's term from the linked to discussion).
>
> Un-committed docs are only visible if you use real-time get, in which case
> the request is served by the shard leader (or replica) from its update log.
> I suppose there's a chance of a few millis between the leader having the
> request in its tlog and the replica having the doc in its tlog but that
> seems like the nature of the beast. Meaning that Solr never promised to be
> 100% consistent at millisecond granularity in a distributed model - any
> small time-window between what a leader has and replica are probably
> network latency which you should solve outside of Solr. I suspect you could
> direct all your real-time get requests to leaders only using some smart
> client like CloudSolrServer if it mattered that much.
>
> Otherwise, all other queries require the document to be committed to be
> visible.
> I suppose there is a very small window when a new searcher is open
> on the leader and the new searcher is not yet open on the replica. However,
> with soft-commits, that too seems like a milli or two based on network
> latency.
>
> @Shawn - yes, I've actually seen this work in my cluster. We lose replicas
> from time-to-time and indexing keeps on trucking.
>
> On Thu, Apr 11, 2013 at 4:51 PM, Zhang, Lisheng <
> lisheng.zh...@broadvision.com> wrote:
>
>> Hi Otis,
>>
>> Thanks very much for helps, your explanation is very clear.
>>
>> My main concern is not the return status for indexing calls (although
>> which is also important), my main concern is how long replica is behind
>> the leader (or putting in your way, how consistent search picture is to
>> client A and B).
>>
>> Our application requires clients see same result whether he hits leader or
>> replica, so it seems we do have a problem here. If no better solution I may
>> consider to change solr4 a little (I have not read solr4x fully yet) to
>> compromise consistency (C) in order to increase availability (A), on a
>> high level do you see serious problems in this approach (I am familiar
>> with lucene/solr code to some extent)?
>>
>> Thanks and best regards, Lisheng
>>
>> -----Original Message-----
>> From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
>> Sent: Thursday, April 11, 2013 2:50 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: SolrCloud leader to replica
>>
>> But note that I misspoke, which I realized after re-reading the thread
>> I pointed you to. Mark explains it nicely there:
>> * the index call returns only when (and IF!) indexing to all replicas
>> succeeds
>>
>> BUT, that should not be mixed with what search clients see!
>> Just because the indexing client sees the all or nothing situation
>> depending on whether indexing was successful on all replicas does NOT
>> mean that search clients will always see a 100% consistent picture.
>> Client A could hit the leader and see a newly indexed document, while
>> client B could query the replica and not see that same document simply
>> because the doc hasn't gotten there yet, or because soft commit hasn't
>> happened just yet.
>>
>> Otis
>> --
>> Solr & ElasticSearch Support
>> http://sematext.com/
>>
>> On Thu, Apr 11, 2013 at 4:39 PM, Zhang, Lisheng
>> <lisheng.zh...@broadvision.com> wrote:
>> > Thanks very much for your helps!
>> >
>> > -----Original Message-----
>> > From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
>> > Sent: Thursday, April 11, 2013 1:23 PM
>> > To: solr-user@lucene.apache.org
>> > Subject: Re: SolrCloud leader to replica
>> >
>> > Yes, I *think* that is the case. Some distributed systems have the
>> > option to return success to caller only after data has been
>> > added/indexed to N other nodes, but I think Solr doesn't have this
>> > yet. Somebody please correct me if I'm wrong.
>> >
>> > See: http://search-lucene.com/?q=eventually+consistent&fc_project=Solr
>> >
>> > Otis
>> > --
>> > Solr & ElasticSearch Support
>> > http://sematext.com/
>> >
>> > On Thu, Apr 11, 2013 at 12:51 PM, Zhang, Lisheng
>> > <lisheng.zh...@broadvision.com> wrote:
>> >> Hi Otis,
>> >>
>> >> Thanks very much for the quick help! We are considering to upgrade
>> >> from solr 3.6 to 4x and use solrCloud, but we are concerned about
>> >> performance related to replica? In this scenario it seems that the
>> >> replica would be a few seconds behind leader because replica would
>> >> start indexing only after leader finishes his?
>> >>
>> >> Thanks and best regards, Lisheng
>> >>
>> >> -----Original Message-----
>> >> From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
>> >> Sent: Thursday, April 11, 2013 8:11 AM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Re: SolrCloud leader to replica
>> >>
>> >> I believe it indexes locally on leader first.
>> >> Otherwise one could end
>> >> up with a situation where indexing to replica(s) succeeds and indexing
>> >> to leader fails, which I suspect might create a mess.
>> >>
>> >> Otis
>> >> --
>> >> Solr & ElasticSearch Support
>> >> http://sematext.com/
>> >>
>> >> On Thu, Apr 11, 2013 at 2:53 AM, Zhang, Lisheng
>> >> <lisheng.zh...@broadvision.com> wrote:
>> >>> Hi,
>> >>>
>> >>> In solr 4x solrCloud, suppose we have only one shard and
>> >>> two replica, when leader receives the indexing request,
>> >>> does it immediately forward request to two replicas or
>> >>> it first indexes request itself, then sends request to its
>> >>> two replica?
>> >>>
>> >>> Thanks very much for helps, Lisheng