Hi Tomás,

No, I am not seeing reloads. I am trying to understand the interactions
between hard commits, soft commits, and transaction log updates in a TLOG
cluster, for both leader and follower replicas. For example, after fetching
new segments from the leader, will the follower replica still apply its own
hard/soft commit?

PS: congratulations on the Berlin Buzzwords talk. :)

Thanks!

On Mon, Dec 10, 2018 at 9:24 PM Tomás Fernández Löbbe <tomasflo...@gmail.com>
wrote:

> I think this is a good point. The tricky part is that if TLOG replicas
> don't replicate often, their transaction logs will get too big too, so you
> want the replication interval of TLOG replicas to be tied to the
> auto(hard)Commit interval (by default at least). If you are using them for
> search, you may also not want to open a searcher for each fetch... for PULL
> replicas, maybe the best way is to use the autoSoftCommit interval to
> define the polling interval. That said, I'm not sure using different
> configurations is a good idea; some people may be mixing TLOG and PULL and
> querying them both alike.
>
> In the meantime, if you have different hosts for TLOG and PULL replicas,
> one workaround you can have is to define the autoCommit time with a system
> property, and use different properties for TLOGs vs PULL nodes.
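
The workaround above relies on property placeholders of the form ${name:default} in solrconfig.xml, for example ${solr.autoCommit.maxTime:15000}. The sketch below is not Solr code. It only imitates that substitution in Python to show how one shared config value can resolve differently on each host. The function name resolve and the per-node values (60000 and 300000) are hypothetical.

```python
import re

# Matches ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def resolve(value: str, props: dict) -> str:
    """Replace ${name:default} placeholders using props, else the default."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in props:
            return props[name]
        if default is not None:
            return default
        raise KeyError(name)  # no property set and no default given
    return _PLACEHOLDER.sub(replace, value)


# One shared config value, different system properties per host (hypothetical).
config_value = "${solr.autoCommit.maxTime:15000}"
tlog_host_props = {"solr.autoCommit.maxTime": "60000"}
pull_host_props = {"solr.autoCommit.maxTime": "300000"}

print(resolve(config_value, tlog_host_props))  # 60000
print(resolve(config_value, pull_host_props))  # 300000
print(resolve(config_value, {}))               # 15000 (falls back to default)
```

On a real node the property would typically be passed as a JVM system property at startup, so hosts that only carry PULL replicas can use a different autoCommit time than hosts that carry TLOG replicas.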
>
> > There is no commit on TLOG/PULL  follower replicas, only on the leader.
> > Followers fetch the segments and **reload the core** every 150 seconds
>
> Edward, "reload" shouldn't really happen in regular TLOG/PULL fetches. Are
> you seeing reloads?
>
> On Mon, Dec 10, 2018 at 4:41 PM Erick Erickson <erickerick...@gmail.com>
> wrote:
>
> > bq. but not every poll attempt they fetch new segment from the leader
> >
> > Ah, right. Ignore my comment. Commit will only occur on the followers
> > when there are new segments to pull down, so you're right, roughly
> > every second poll would find things to bring down, commit, and open a
> > new searcher.
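
To make the "roughly every second poll" point concrete, here is a small timeline sketch. It is an idealized model, not Solr behavior: it assumes the leader commits at exact multiples of the commit interval, the follower polls at exact multiples of the poll interval, and a poll fetches only when a commit happened since the last fetch. The function name is made up for illustration.

```python
def polls_with_new_segments(commit_interval_s, poll_interval_s, horizon_s):
    """Return (poll_time, fetched) pairs for one follower.

    Idealized: leader commits and follower polls are perfectly periodic
    and both start at t = 0.
    """
    timeline = []
    last_fetched_commit = 0  # number of leader commits already replicated
    for t in range(poll_interval_s, horizon_s + 1, poll_interval_s):
        commits_so_far = t // commit_interval_s
        fetched = commits_so_far > last_fetched_commit
        if fetched:
            last_fetched_commit = commits_so_far
        timeline.append((t, fetched))
    return timeline


# Hard commit every 300 s on the leader, follower polls every 150 s.
for t, fetched in polls_with_new_segments(300, 150, 900):
    print(t, "fetch + new searcher" if fetched else "nothing new")
```

With these numbers the polls at 300, 600 and 900 seconds fetch and the ones at 150, 450 and 750 seconds find nothing new. Real clusters will drift from this because commits and polls are not aligned.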
> > On Sun, Dec 9, 2018 at 4:14 PM Edward Ribeiro <edward.ribe...@gmail.com>
> > wrote:
> > >
> > > Hi Vadim,
> > >
> > > There is no commit on TLOG/PULL follower replicas, only on the leader.
> > > Followers fetch the segments and **reload the core** every 150 seconds
> > > (if there were new segments, I suppose). Yeah, followers don't pay the CPU
> > > price of indexing, but there is still cache invalidation, autowarming,
> > > etc., in addition to network and IO demand. Is that right, Erick?
> > >
> > > Besides that, Erick is pointing out that under a heavy indexing workload
> > > you could have either:
> > >
> > > 1. Very large transaction logs;
> > >
> > > 2. Very large numbers of segments. If that is the case, you could have
> > > the following scenario numerous times:
> > >    2.1. follower replica downloads segments A and B from the leader;
> > >    2.2. leader merges segments A + B into C;
> > >    2.3. follower replica discards A and B and downloads C on the next poll.
> > >
> > > Under the second condition, followers needlessly download segments that
> > > would eventually be merged.
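
The cost of the second scenario can be sketched with a toy model. Everything here is an assumption for illustration: segment sizes are made up, a merged segment is taken to be exactly the sum of its sources (deletes are ignored), and a poll simply downloads whatever the leader has that the follower lacks.

```python
def simulate(events):
    """Return the total MB a follower downloads for a sequence of events.

    Events are tuples:
      ("commit", name, size_mb)      leader gains a new segment
      ("merge", [sources], target)   leader replaces sources with target
      ("poll",)                      follower syncs with the leader
    """
    leader = {}        # segment name -> size in MB on the leader
    follower = set()   # segment names currently held by the follower
    downloaded_mb = 0
    for event in events:
        kind = event[0]
        if kind == "commit":
            _, name, size_mb = event
            leader[name] = size_mb
        elif kind == "merge":
            _, sources, target = event
            leader[target] = sum(leader.pop(s) for s in sources)
        elif kind == "poll":
            missing = set(leader) - follower
            downloaded_mb += sum(leader[s] for s in missing)
            follower = set(leader)  # stale segments are discarded
    return downloaded_mb


poll_before_merge = [
    ("commit", "A", 100), ("commit", "B", 100), ("poll",),
    ("merge", ["A", "B"], "C"), ("poll",),
]
poll_after_merge = [
    ("commit", "A", 100), ("commit", "B", 100),
    ("merge", ["A", "B"], "C"), ("poll",),
]

print(simulate(poll_before_merge))  # 400: A and B, then C again
print(simulate(poll_after_merge))   # 200: only C
```

In this model the follower that polls before the merge transfers twice as much data for the same final index, which is the waste described in steps 2.1 to 2.3.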
> > >
> > > IMO, you should carefully evaluate whether TLOG/PULL is really
> > > recommended for your cluster setup and your indexing and querying
> > > workload. You can very well stay with an NRT setup if it suits you
> > > better. The videos below provide a nice set of hints for choosing
> > > between NRT and some combination of TLOG and PULL.
> > >
> > > https://youtu.be/XIb8X3MwVKc
> > >
> > > https://youtu.be/dkWy2ykzAv0
> > >
> > > https://youtu.be/XqfTjd9KDWU
> > >
> > > Regards,
> > > Edward
> > >
> > > On Sun, Dec 9, 2018 at 16:56, <vadim.iva...@spb.ntk-intourist.ru> wrote:
> > >
> > > >
> > > > If hard commit max time is 300 sec, then a commit happens every 300 sec
> > > > on the TLOG leader, and new segments pop up on the leader every 300 sec
> > > > during indexing. The polling interval on the other replicas is 150 sec,
> > > > but not every poll attempt fetches new segments from the leader, afaiu.
> > > > Erick, do you mean that on all other TLOG replicas (not leaders) a commit
> > > > occurs on every poll?
> > > > Sunday, December 9, 2018, 19:21 +03:00, from Erick Erickson
> > > > erickerick...@gmail.com:
> > > >
> > > > >Not quite, 600000. The polling interval is half the commit interval.
> > > > >
> > > > >This has always bothered me a little bit, I wonder at the utility of a
> > > > >config param. We already have old-style replication with a
> > > > >configurable polling interval. Under very heavy indexing loads, it
> > > > >seems to me that either the tlogs will grow quite large or we'll be
> > > > >pulling a lot of unnecessary segments across the wire, segments
> > > > >that'll soon be merged away and the merged segment re-pulled.
> > > > >
> > > > >Apparently, though, nobody's seen this "in the wild", so it's
> > > > >theoretical at this point.
> > > > >On Sun, Dec 9, 2018 at 1:48 AM Vadim Ivanov
> > > > < vadim.iva...@spb.ntk-intourist.ru> wrote:
> > > > >
> > > > > Thanks, Edward, for clues.
> > > > > > What bothers me is newSearcher start, warming, cache clear... all that
> > > > > > CPU-consuming stuff in my heavy-indexing scenario.
> > > > > With NRT I had autoSoftCommit:  300000 .
> > > > > So I had new Searcher no more than  every 5 min on every replica.
> > > > > To have more or less  the same effect with TLOG - PULL collection,
> > > > > I suppose, I have to have  :  300000
> > > > > > (yes, I understand that newSearchers start asynchronously on leader and
> > > > > > replicas)
> > > > > Am I right?
> > > > > --
> > > > > Vadim
> > > > >
> > > > >
> > > > >> -----Original Message-----
> > > > >> From: Edward Ribeiro [mailto:edward.ribe...@gmail.com]
> > > > >> Sent: Sunday, December 09, 2018 12:42 AM
> > > > >> To:  solr-user@lucene.apache.org
> > > > >> Subject: Re: Soft commit and new replica types
> > > > >>
> > > > >> Some insights into the new replica types below:
> > > > >>
> > > > >> On Sat, December 8, 2018 08:42, Vadim Ivanov <
> > > > >> vadim.iva...@spb.ntk-intourist.ru wrote:
> > > > >>
> > > > >>>
> > > > >>> From Ref guide we have:
> > > > >>> " NRT is the only type of replica that supports soft-commits..."
> > > > >>> "If TLOG replica does become a leader, it will behave the same as if it
> > > > >>> was a NRT type of replica."
> > > > >>> Does it mean that if we do not have NRT replicas in the cluster then the
> > > > >>> autoSoftCommit section in solrconfig.xml is ignored completely (even on
> > > > >>> the TLOG leader)?
> > > > >>>
> > > > >>
> > > > >> No, not completely. Both TLOG and PULL nodes will periodically poll the
> > > > >> leader for changes in the index segment files and download those segments
> > > > >> from the leader. If the hard commit max time is defined in solrconfig.xml,
> > > > >> the polling interval of each replica will be half that value. Otherwise,
> > > > >> if the soft commit max time is defined, the replicas will use half the
> > > > >> soft commit max time as the interval. If neither is defined, the poll
> > > > >> interval will be 3 seconds (hard coded). See here:
> > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/cloud/ReplicateFromLeader.java#L68-L77
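
The polling rule described above can be written down as a short function. This is a paraphrase of the rule as stated in this message, not the Solr source. The function and parameter names are invented, and None stands for "maxTime not configured".

```python
from typing import Optional

DEFAULT_POLL_MS = 3_000  # the hard-coded 3-second fallback mentioned above


def poll_interval_ms(auto_commit_max_time_ms: Optional[int],
                     auto_soft_commit_max_time_ms: Optional[int]) -> int:
    """Return the follower polling interval in milliseconds.

    Hard commit maxTime wins if set, then soft commit maxTime,
    otherwise the fixed default applies.
    """
    if auto_commit_max_time_ms is not None:
        return auto_commit_max_time_ms // 2
    if auto_soft_commit_max_time_ms is not None:
        return auto_soft_commit_max_time_ms // 2
    return DEFAULT_POLL_MS


print(poll_interval_ms(300_000, None))    # 150000: hard commit every 300 s
print(poll_interval_ms(600_000, 60_000))  # 300000: hard commit takes precedence
print(poll_interval_ms(None, 60_000))     # 30000: only soft commit configured
print(poll_interval_ms(None, None))       # 3000: neither configured
```

Read in reverse, a follower that should open a new searcher at most every 5 minutes needs a hard commit maxTime of 600000 ms on this rule, since the poll interval is half the commit interval.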
> > > > >>
> > > > >> If the TLOG replica is the leader, it will index locally and append the doc
> > > > >> to the transaction log, as an NRT node would, and it will synchronously
> > > > >> replicate the data to the other TLOG replicas' transaction logs (PULL nodes
> > > > >> don't have transaction logs). But TLOG/PULL replicas don't support soft
> > > > >> commits or real-time gets, afaik.
> > > > >>
> > > > >>>
> > > > >>
> > > > >>>
> > > > >>> <autoSoftCommit>
> > > > >>>   <maxTime>60000</maxTime>
> > > > >>> </autoSoftCommit>
> > > > >>>
> > > > >>>
> > > > >>> Should we say that in the autoCommit section openSearcher is always true
> > > > >>> in that case?
> > > > >>
> > > > >>
> > > > >>
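> > > > >> <autoCommit>
> > > > >>   <maxDocs>10000</maxDocs>
> > > > >>   <maxTime>30000</maxTime>
> > > > >>   <maxSize>512m</maxSize>
> > > > >>   <openSearcher>false</openSearcher>
> > > > >> </autoCommit>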
> > > > >>
> > > > >>
> > > > >> Does it mean that a new Searcher always starts on all replicas when a
> > > > >> hard commit happens on the leader?
> > > > >>
> > > > >>
> > > > >> Nope. Or at least, the searcher is not created synchronously. Each
> > > > >> non-leader replica will periodically fetch the index changes from the
> > > > >> leader and open a new searcher to reflect those changes, as seen here:
> > > > >> https://github.com/apache/lucene-solr/blob/75b183196798232aa6f2dcaaaab117f309119053/solr/core/src/java/org/apache/solr/handler/IndexFetcher.java#L653
> > > > >> But it's important to note the potential delay between the leader's
> > > > >> hard commit and the other replicas fetching those changes from the
> > > > >> leader and opening a new searcher to reflect the latest changes.
> > > > >>
> > > > >> PS: I am still digging into these new replica types, so I may have
> > > > >> misunderstood or missed some aspect of them.
> > > > >>
> > > > >> Regards,
> > > > >> Edward
> > > > >
> > > >
> >
>
