Thank you very much for all your help.

On Tue 9 Jan 2018, 16:32 Erick Erickson, <erickerick...@gmail.com> wrote:
> One thing to be aware of is that the commit points on the replicas in a
> shard may (will) fire at different times. So when you're comparing the
> number of docs on the replicas in a shard you have to compare before the
> last commit interval. So say you have a soft commit of 1 minute. When
> comparing the docs on each replica you need to restrict the query to things
> older than 1 minute or stop indexing and wait for 1 minute (i.e. until
> after the autocommit fires).
>
> Glad things worked out!
> Erick
>
> On Tue, Jan 9, 2018 at 4:08 AM, Novin Novin <toe.al...@gmail.com> wrote:
>
> > Hi Erick,
> >
> > Apologies for the delay.
> >
> > [This isn't what I meant. I meant to query each replica directly
> > _within_ the same shard. Your problem statement is that the leader and
> > replicas (I use "followers") have different document counts. How are
> > you verifying this? Through the admin UI? Using &distrib=false is
> > useful when you want to query each core directly (and you have to use
> > the core name) in some automated fashion.]
> >
> > I might have been wrong here, because now I can't reproduce it with
> > distrib=false.
> >
> > I also did as you said:
> > [OK, I'm assuming then that you issue a manual commit sometime, right?
> > Here's what I'd do:
> > 1> turn off indexing
> > 2> issue a commit (soft or hard-with-opensearcher-true)
> > 3> now look at your doc counts on each replica.]
> >
> > Everything seems OK now; I must have been doing something wrong before.
> >
> > Thanks for all your and Walter's help.
> > Best,
> > Navin
> >
> >
> > On Wed, 3 Jan 2018 at 17:09 Walter Underwood <wun...@wunderwood.org>
> > wrote:
> >
> > > If you have a field for the indexed datetime, you can use a filter query
> > > to get rid of recent updates that might be in transit. I’d use double the
> > > autocommit time, to leave time for the followers to index.
> > >
> > > If the autocommit interval is one minute:
> > >
> > > fq=indexed_datetime:[* TO NOW-2MINUTES]
> > >
> > > wunder
> > > Walter Underwood
> > > wun...@wunderwood.org
> > > http://observer.wunderwood.org/ (my blog)
> > >
> > >
> > > > On Jan 3, 2018, at 8:58 AM, Erick Erickson <erickerick...@gmail.com>
> > > > wrote:
> > > >
> > > > [I probably don't need to do this because I have only one shard, but I
> > > > did it anyway and the count was different.]
> > > >
> > > > This isn't what I meant. I meant to query each replica directly
> > > > _within_ the same shard. Your problem statement is that the leader and
> > > > replicas (I use "followers") have different document counts. How are
> > > > you verifying this? Through the admin UI? Using &distrib=false is
> > > > useful when you want to query each core directly (and you have to use
> > > > the core name) in some automated fashion.
> > > >
> > > > [I have actually turned off auto soft commit for the time being but
> > > > nothing changed]
> > > >
> > > > OK, I'm assuming then that you issue a manual commit sometime, right?
> > > > Here's what I'd do:
> > > > 1> turn off indexing
> > > > 2> issue a commit (soft or hard-with-opensearcher-true)
> > > > 3> now look at your doc counts on each replica.
> > > >
> > > > If the counts are different then something's not right. Solr tries
> > > > very hard to not lose data; it's concerning if the leader and replicas
> > > > have different counts.
> > > >
> > > > Best,
> > > > Erick
> > > >
> > > > On Wed, Jan 3, 2018 at 1:51 AM, Novin Novin <toe.al...@gmail.com> wrote:
> > > >> Hi Erick,
> > > >>
> > > >> Thanks for your reply.
> > > >>
> > > >> [ First of all, replicas can be off in terms of counts for the soft
> > > >> commit interval. The commits don't all happen on the replicas at the
> > > >> same wall-clock time. Solr promises eventual consistency, in this case
> > > >> NOW-autocommit time.]
> > > >>
> > > >> I realized that. To stop it,
> > > >> I have actually turned off auto soft commit for the time being, but
> > > >> nothing changed. The non-leader replica still had extra documents.
> > > >>
> > > >> [ So my first question is whether the replicas in the shard are
> > > >> inconsistent as of, say, NOW-your_soft_commit_time. I'd add a fudge
> > > >> factor of 10 seconds earlier just to be sure I was past autowarming.
> > > >> This does require that there be a time stamp. Absent a timestamp, you
> > > >> could suspend indexing for a few minutes and run the test like below.]
> > > >>
> > > >> While data was being indexed, I was checking how the counts compared on
> > > >> both replicas. What I found is that the leader replica always has 3 docs
> > > >> fewer than the other replica. I don't think they were off by
> > > >> NOW-soft_commit_time. CloudSolrClient adds something like
> > > >> "_stateVer_=main:114" to the query, which I assume is for results to be
> > > >> consistent between searches on both replicas.
> > > >>
> > > >> [Adding &distrib=false to your command and directing it at a specific
> > > >> _core_ (something like collection1_shard1_replica1) will only return
> > > >> data from that core.]
> > > >> I probably don't need to do this because I have only one shard, but I
> > > >> did it anyway and the count was different.
> > > >>
> > > >> [When you say you index every minute, I'm guessing you only index for
> > > >> part of that minute, is that true? In that case you might get more
> > > >> consistency if, instead of relying totally on your autoconfig
> > > >> settings, specify commitWithin on your update command. That should
> > > >> force the commits to happen more closely in-sync, although still not
> > > >> perfect.]
> > > >>
> > > >> We receive data every minute, so whenever we have new data we send it to
> > > >> SolrCloud using a queue. You said don't rely on auto config.
> > > >> Do you mean I should turn off autocommit and use commitWithin using
> > > >> SolrJ, or leave autoCommit as it is and also use commitWithin from the
> > > >> SolrJ client?
> > > >>
> > > >> I apologize if I am not clear; thanks for your help again.
> > > >>
> > > >> Thanks in advance,
> > > >> Navin
> > > >>
> > > >>
> > > >> On Tue, 2 Jan 2018 at 18:05 Erick Erickson <erickerick...@gmail.com>
> > > >> wrote:
> > > >>
> > > >>> First of all, replicas can be off in terms of counts for the soft
> > > >>> commit interval. The commits don't all happen on the replicas at the
> > > >>> same wall-clock time. Solr promises eventual consistency, in this case
> > > >>> NOW-autocommit time.
> > > >>>
> > > >>> So my first question is whether the replicas in the shard are
> > > >>> inconsistent as of, say, NOW-your_soft_commit_time. I'd add a fudge
> > > >>> factor of 10 seconds earlier just to be sure I was past autowarming.
> > > >>> This does require that there be a time stamp. Absent a timestamp, you
> > > >>> could suspend indexing for a few minutes and run the test like below.
> > > >>>
> > > >>> Adding &distrib=false to your command and directing it at a specific
> > > >>> _core_ (something like collection1_shard1_replica1) will only return
> > > >>> data from that core.
> > > >>>
> > > >>> When you say you index every minute, I'm guessing you only index for
> > > >>> part of that minute, is that true? In that case you might get more
> > > >>> consistency if, instead of relying totally on your autoconfig
> > > >>> settings, specify commitWithin on your update command. That should
> > > >>> force the commits to happen more closely in-sync, although still not
> > > >>> perfect.
> > > >>>
> > > >>> Another option if you're totally and completely sure that your commits
> > > >>> happen _only_ from your indexing program is to fire the commit at the
> > > >>> end of the run from your SolrJ program.
> > > >>>
> > > >>> Let us know,
> > > >>> Erick
> > > >>>
> > > >>> On Tue, Jan 2, 2018 at 9:33 AM, Novin Novin <toe.al...@gmail.com> wrote:
> > > >>>> Hi Erick,
> > > >>>>
> > > >>>> You are right, it is an XY problem.
> > > >>>>
> > > >>>> Allow me to explain as best I can. I have two replicas of one
> > > >>>> collection called "Main". When I was using the search feature in my
> > > >>>> application I got two different numFound counts. So I started digging,
> > > >>>> and after spending 2-3 hours I found that one replica has a higher
> > > >>>> numFound count than the other (the one with the higher count was not
> > > >>>> the leader). I am not sure how it ended up like that. This count
> > > >>>> difference affects paging on my application side, not the Solr side.
> > > >>>>
> > > >>>> Extra info that might be useful to know:
> > > >>>> Same query, not a single letter of difference.
> > > >>>> auto soft commit 20000
> > > >>>> soft commit 60000
> > > >>>> indexing data every minute.
> > > >>>>
> > > >>>> Let me know if you need to know anything else. Any help would be
> > > >>>> highly appreciated.
> > > >>>>
> > > >>>> Thanks in advance,
> > > >>>> Navin
> > > >>>>
> > > >>>>
> > > >>>> On Tue, 2 Jan 2018 at 15:14 Erick Erickson <erickerick...@gmail.com>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> This seems like an XY problem. You're asking how to do X
> > > >>>>> because you think it will solve problem Y without telling
> > > >>>>> us what Y is.
> > > >>>>>
> > > >>>>> I say this because on the surface this seems to defeat the
> > > >>>>> purpose behind SolrCloud. Why would you want to only make
> > > >>>>> use of one piece of hardware? That will limit your throughput,
> > > >>>>> so why bother to have replicas in the first place?
> > > >>>>>
> > > >>>>> Or is this some kind of diagnostic you're trying to implement?
> > > >>>>>
> > > >>>>> Best,
> > > >>>>> Erick
> > > >>>>>
> > > >>>>> On Tue, Jan 2, 2018 at 5:08 AM, Novin Novin <toe.al...@gmail.com> wrote:
> > > >>>>>> Hi guys,
> > > >>>>>>
> > > >>>>>> I am using Solr 5.5.4 and the same version for SolrJ. My question:
> > > >>>>>> is there any way I can tell the cloud Solr client to use only the
> > > >>>>>> leader for queries?
> > > >>>>>>
> > > >>>>>> Thanks in advance.
> > > >>>>>> Navin
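
The per-replica check discussed in this thread (query each core directly with distrib=false, and filter out documents newer than about twice the commit interval) could be scripted roughly as below. This is only a sketch under assumptions: the host URL and the core names are hypothetical placeholders, the timestamp field name indexed_datetime is taken from Walter's example and must exist in your schema, and the filter uses Solr's date-math unit spelling MINUTES.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def replica_count_url(base_url, core, timestamp_field="indexed_datetime", lag_minutes=2):
    """Build a /select URL that counts documents on a single core."""
    params = {
        "q": "*:*",
        "rows": 0,            # only numFound is needed, not the documents
        "distrib": "false",   # answer from this core only, no fan-out
        # Ignore documents newer than the lag window (about twice the commit interval).
        "fq": f"{timestamp_field}:[* TO NOW-{lag_minutes}MINUTES]",
        "wt": "json",
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


def num_found(url):
    """Fetch the URL and return numFound from Solr's JSON response."""
    with urlopen(url) as resp:
        return json.load(resp)["response"]["numFound"]


# Hypothetical host and core names; substitute the ones shown in the admin UI.
BASE_URL = "http://localhost:8983/solr"
CORES = ["main_shard1_replica1", "main_shard1_replica2"]

for core in CORES:
    print(replica_count_url(BASE_URL, core))
```

Calling num_found() on each printed URL against a live cluster and comparing the results gives the leader/follower comparison without waiting for indexing to stop. If the schema has no timestamp field, the alternative from the thread still applies: stop indexing, issue a commit, then compare the counts without the fq.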