Re: SolrCloud leader to replica

Otis Gospodnetic Thu, 11 Apr 2013 17:27:55 -0700

Hi,

I think Timothy is right about what Lisheng is really after, which is
consistency.


I agree with what Timothy is implying here - the chances of search being
inconsistent are very, very small.  I'm guessing Lisheng is trying to
solve a problem he doesn't actually have yet?  Also, think about a
non-SolrCloud solution.  What happens when a user pages through
results?  Typically that just re-runs the same query, but with a
different page offset.  What happens if between page 1 and page 2 the
index changes and a searcher is reopened?  Same sort of problem can
happen, right?  Yet, in a few hundred client engagements involving
Solr or ElasticSearch I don't recall this ever being an issue.
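
To make the paging scenario concrete, here is a minimal toy sketch. It is
not Solr code: the "index" is just a Python list of hypothetical doc ids in
rank order, and a "searcher" is modelled as a point-in-time copy of that
list. The names search, index and searcher are made up for illustration.

```python
# Toy model (an assumption for illustration, not Solr internals):
# the index is a list of doc ids in rank order, and a searcher is a
# point-in-time snapshot of that list.

def search(snapshot, start, rows):
    """Return one page of results from a point-in-time snapshot."""
    return snapshot[start:start + rows]

# State of the index when the user requests page 1.
index = ["doc1", "doc2", "doc3", "doc4"]
searcher = list(index)                      # open a searcher (snapshot)
page1 = search(searcher, start=0, rows=2)

# A new top-ranked document is committed and the searcher is reopened
# before the user asks for page 2.
index.insert(0, "doc0")
searcher = list(index)                      # reopened searcher sees doc0
page2 = search(searcher, start=2, rows=2)

print(page1)  # ['doc1', 'doc2']
print(page2)  # ['doc2', 'doc3'] -> doc2 appears on both pages
```

The same query with a different offset returns an overlapping page because
the snapshot changed in between, with no replicas involved at all.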

Otis
--
Solr & ElasticSearch Support
http://sematext.com/





On Thu, Apr 11, 2013 at 8:13 PM, Timothy Potter <thelabd...@gmail.com> wrote:
> Hmmm ... I was following this discussion but then got confused when Lisheng
> said to change Solr to "compromise consistency in order to increase
> availability" when your concern is "how long replica is behind leader".
> Seems you want more consistency vs. less in this case? One of the reasons
> behind Solr's leader election approach is to achieve low-latency eventual
> consistency (Mark's term from the linked to discussion).
>
> Un-committed docs are only visible if you use real-time get, in which case
> the request is served by the shard leader (or replica) from its update log.
> I suppose there's a chance of a few millis between the leader having the
> request in its tlog and the replica having the doc in its tlog but that
> seems like the nature of the beast. Meaning that Solr never promised to be
> 100% consistent at millisecond granularity in a distributed model - any
> small time-window between what a leader has and what a replica has is
> probably network latency, which you should solve outside of Solr. I suspect you could
> direct all your real-time get requests to leaders only using some smart
> client like CloudSolrServer if it mattered that much.
>
> Otherwise, all other queries require the document to be committed to be
> visible. I suppose there is a very small window when a new searcher is open
> on the leader and the new searcher is not yet open on the replica. However,
> with soft-commits, that too seems like a milli or two based on network
> latency.
>
> @Shawn - yes, I've actually seen this work in my cluster. We lose replicas
> from time-to-time and indexing keeps on trucking.
>
>
>
>
>
> On Thu, Apr 11, 2013 at 4:51 PM, Zhang, Lisheng <
> lisheng.zh...@broadvision.com> wrote:
>
>> Hi Otis,
>>
>> Thanks very much for your help; your explanation is very clear.
>>
>> My main concern is not the return status for indexing calls (although
>> that is also important); my main concern is how long the replica is
>> behind the leader (or, putting it your way, how consistent the search
>> picture is for clients A and B).
>>
>> Our application requires that clients see the same result whether they
>> hit the leader or a replica, so it seems we do have a problem here. If
>> there is no better solution, I may consider changing solr4 a little (I
>> have not read solr4x fully yet) to compromise consistency (C) in order
>> to increase availability (A). On a high level, do you see serious
>> problems in this approach (I am familiar with lucene/solr code to some
>> extent)?
>>
>> Thanks and best regards, Lisheng
>>
>> -----Original Message-----
>> From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
>> Sent: Thursday, April 11, 2013 2:50 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: SolrCloud leader to replica
>>
>>
>> But note that I misspoke, which I realized after re-reading the thread
>> I pointed you to.  Mark explains it nicely there:
>> * the index call returns only when (and IF!) indexing to all replicas
>> succeeds
>>
>> BUT, that should not be mixed with what search clients see!
>> Just because the indexing client sees the all or nothing situation
>> depending on whether indexing was successful on all replicas does NOT
>> mean that search clients will always see a 100% consistent picture.
>> Client A could hit the leader and see a newly indexed document, while
>> client B could query the replica and not see that same document simply
>> because the doc hasn't gotten there yet, or because soft commit hasn't
>> happened just yet.
>>
>> Otis
>> --
>> Solr & ElasticSearch Support
>> http://sematext.com/
>>
>>
>>
>>
>>
>> On Thu, Apr 11, 2013 at 4:39 PM, Zhang, Lisheng
>> <lisheng.zh...@broadvision.com> wrote:
>> > Thanks very much for your help!
>> >
>> > -----Original Message-----
>> > From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
>> > Sent: Thursday, April 11, 2013 1:23 PM
>> > To: solr-user@lucene.apache.org
>> > Subject: Re: SolrCloud leader to replica
>> >
>> >
>> > Yes, I *think* that is the case.  Some distributed systems have the
>> > option to return success to caller only after data has been
>> > added/indexed to N other nodes, but I think Solr doesn't have this
>> > yet.  Somebody please correct me if I'm wrong.
>> >
>> > See: http://search-lucene.com/?q=eventually+consistent&fc_project=Solr
>> >
>> > Otis
>> > --
>> > Solr & ElasticSearch Support
>> > http://sematext.com/
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Apr 11, 2013 at 12:51 PM, Zhang, Lisheng
>> > <lisheng.zh...@broadvision.com> wrote:
>> >> Hi Otis,
>> >>
>> >> Thanks very much for the quick help! We are considering upgrading
>> >> from solr 3.6 to 4x and using solrCloud, but we are concerned about
>> >> performance related to replicas. In this scenario it seems that the
>> >> replica would be a few seconds behind the leader, because the replica
>> >> would start indexing only after the leader finishes its own indexing?
>> >>
>> >> Thanks and best regards, Lisheng
>> >>
>> >> -----Original Message-----
>> >> From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
>> >> Sent: Thursday, April 11, 2013 8:11 AM
>> >> To: solr-user@lucene.apache.org
>> >> Subject: Re: SolrCloud leader to replica
>> >>
>> >>
>> >> I believe it indexes locally on leader first.  Otherwise one could end
>> >> up with a situation where indexing to replica(s) succeeds and indexing
>> >> to leader fails, which I suspect might create a mess.
>> >>
>> >> Otis
>> >> --
>> >> Solr & ElasticSearch Support
>> >> http://sematext.com/
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> On Thu, Apr 11, 2013 at 2:53 AM, Zhang, Lisheng
>> >> <lisheng.zh...@broadvision.com> wrote:
>> >>> Hi,
>> >>>
>> >>> In solr 4x solrCloud, suppose we have only one shard and
>> >>> two replicas. When the leader receives an indexing request,
>> >>> does it immediately forward the request to the two replicas, or
>> >>> does it first index the request itself and then send it to its
>> >>> two replicas?
>> >>>
>> >>> Thanks very much for your help, Lisheng
>> >>>
>> >>>
>>
