Thanks so much again Tomas! You've answered my questions and I clearly understand now. Great work!
On 13 February 2018 at 09:13, Tomas Fernandez Lobbe <tflo...@apple.com> wrote:

> > On Feb 12, 2018, at 12:06 PM, Greg Roodt <gro...@gmail.com> wrote:
> >
> > Thanks Ere. I've taken a look at the discussion here:
> > http://lucene.472066.n3.nabble.com/Limit-search-queries-only-to-pull-replicas-td4367323.html
> > This is how I was imagining TLOG & PULL replicas would work, so if this
> > functionality does get developed, it would be useful to me.
> >
> > I still have 2 questions at the moment:
> > 1. I am running the single shard scenario. I'm thinking of using a
> > dedicated HTTP load-balancer in front of the PULL replicas only, with
> > read-only queries directed directly at the load-balancer. In this
> > situation, the healthy PULL replicas *should* handle the queries on the
> > node itself without a proxy hop (assuming state=active). New PULL replicas
> > added to the load-balancer will internally proxy queries to the other PULL
> > or TLOG replicas while in state=recovering until the switch to
> > state=active. Is my understanding correct?
>
> Yes
>
> > 2. Is it all worth it? Is there any advantage to running a cluster of 3
> > TLOGs + 10 PULL replicas vs running 13 TLOG replicas?
>
> I don’t have a definitive answer; this will depend on your specific use
> case. As Erick said, there is very little work that non-leader TLOG
> replicas do for each update, and having all TLOG replicas means that with
> a single active replica you could in theory still handle updates. It’s
> sometimes nice to separate query traffic from update traffic, but this can
> still be done if you have all TLOG replicas and you just make sure you
> don’t query the leader…
> One nice characteristic of PULL replicas is that they can’t go into
> Leader Initiated Recovery (LIR) state: even if there is some sort of
> network partition, they’ll remain in the active state as long as they can
> reach ZooKeeper, even if they can’t talk to the leader (note that this
> means they may be responding with outdated data for an undetermined amount
> of time, until they can replicate from the leader again). Also, since
> updates are not sent to all the replicas (only to the TLOG replicas),
> updates should be faster with 3 TLOG vs 13 TLOG replicas.
>
> Tomás
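For anyone who lands on this thread later: a rough sketch of creating such a
mixed collection (1 shard, 3 TLOG + 10 PULL) with SolrJ. The ZooKeeper
address, collection name and config set name are placeholders, and it assumes
the Solr 7.x createCollection overload that takes NRT/TLOG/PULL replica
counts, so treat it as a sketch rather than a recipe.

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.CollectionAdminRequest;

public class CreateMixedCollection {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper address, collection name and config set name.
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181")
                .build()) {
            // One shard, 0 NRT replicas, 3 TLOG replicas (one of which is
            // elected leader) and 10 PULL replicas that only replicate the
            // index from the leader.
            CollectionAdminRequest
                    .createCollection("mycollection", "myconfig", 1, 0, 3, 10)
                    .process(client);
        }
    }
}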
> > On 12 February 2018 at 19:25, Ere Maijala <ere.maij...@helsinki.fi> wrote:
> >
> >> Your question about directing queries to PULL replicas only has been
> >> discussed on the list. Look for the topic "Limit search queries only to
> >> pull replicas". What I'd like to see is something similar to the
> >> preferLocalShards parameter. It could be something like
> >> "preferReplicaTypes=TLOG,PULL". Tomás mentioned previously that
> >> SOLR-10880 could be used as a base for such functionality, and I'm
> >> considering taking a stab at implementing it.
> >>
> >> --Ere
> >>
> >> Greg Roodt wrote on 12 Feb 2018 at 6:55:
> >>
> >>> Thank you both for your very detailed answers.
> >>>
> >>> This is great to know. I knew that SolrJ had the cluster-aware knowledge
> >>> (via ZooKeeper), but I was wondering what something like curl would do.
> >>> Great to know that internally the cluster will proxy queries to the
> >>> appropriate place regardless.
> >>>
> >>> I am running the single shard scenario. I'm thinking of using a dedicated
> >>> HTTP load-balancer in front of the PULL replicas only, with read-only
> >>> queries directed directly at the load-balancer. In this situation, the
> >>> healthy PULL replicas *should* handle the queries on the node itself
> >>> without a proxy hop (assuming state=active). New PULL replicas added to
> >>> the load-balancer will internally proxy queries to the other PULL or TLOG
> >>> replicas while in state=recovering until the switch to state=active.
> >>>
> >>> Is my understanding correct?
> >>>
> >>> Is this sensible to do, or is it not worth it due to the smart proxying
> >>> that SolrCloud can do anyway?
> >>>
> >>> If the TLOG and PULL replicas are so similar, is there any real advantage
> >>> to having a mixed cluster? I assume a bit less work is required across
> >>> the cluster to propagate writes if you only have 3 TLOG nodes vs 10+ PULL
> >>> nodes? Or would it be better to just have 13 TLOG nodes?
> >>>
> >>> On 12 February 2018 at 15:24, Tomas Fernandez Lobbe <tflo...@apple.com>
> >>> wrote:
> >>>
> >>>> On the last question:
> >>>> For writes: yes. Writes are going to be sent to the shard leader, and
> >>>> since PULL replicas can’t be leaders, it’s going to be a TLOG replica.
> >>>> If you are using CloudSolrClient, then this routing will be done
> >>>> directly from the client (since it will send the update to the leader),
> >>>> and if you are using some other HTTP client, then yes, the PULL replica
> >>>> will forward the update, the same way any non-leader node would.
> >>>>
> >>>> For reads: this won’t happen today, and any replica can respond to
> >>>> queries. I do believe there is value in this kind of routing logic;
> >>>> sometimes you simply don’t want the leader to handle any queries,
> >>>> especially when queries can be expensive. You could do this today if you
> >>>> want, by putting some load balancer in front and just directing your
> >>>> queries to the nodes you know are PULL, but keep in mind that this would
> >>>> only work in the single shard scenario, and only if you hit an active
> >>>> replica (otherwise, as you said, the query will be routed to any other
> >>>> node of the shard, regardless of the type). If you have multiple shards,
> >>>> then you need to use the “shards” parameter and tell Solr exactly which
> >>>> nodes you want to hit for each shard (the “shards” approach can also be
> >>>> done in the single shard case, although you would be adding an extra
> >>>> hop, I believe).
> >>>>
> >>>> Tomás
> >>>> Sent from my iPhone
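To make the two routing paths above concrete, a rough SolrJ sketch: the update
goes through CloudSolrClient (which sends it to the shard leader, i.e. a TLOG
replica), and the query is pinned to specific replicas with the "shards"
parameter. The ZooKeeper address, host names and core names below are
placeholders, not real nodes.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class RoutingSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper address and collection name.
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181")
                .build()) {
            client.setDefaultCollection("mycollection");

            // Write path: CloudSolrClient looks up the shard leader in
            // ZooKeeper and sends the update straight to it, so the update
            // always lands on a TLOG replica (PULL replicas cannot be leaders).
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            client.add(doc);
            client.commit();

            // Read path: pin the query to specific replica cores with the
            // "shards" parameter. Core names are placeholders; alternatives
            // within a shard are separated by "|", and with multiple shards
            // you list one group per shard, comma-separated.
            SolrQuery query = new SolrQuery("*:*");
            query.set("shards",
                    "pull-host1:8983/solr/mycollection_shard1_replica_p4|"
                  + "pull-host2:8983/solr/mycollection_shard1_replica_p6");
            client.query(query);
        }
    }
}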
> >>>>> On Feb 11, 2018, at 6:35 PM, Greg Roodt <gro...@gmail.com> wrote:
> >>>>>
> >>>>> Hi
> >>>>>
> >>>>> I have a question around how queries are routed and load-balanced in a
> >>>>> cluster of mixed TLOG and PULL replicas.
> >>>>>
> >>>>> I thought that I might have to put a load-balancer in front of the PULL
> >>>>> replicas and direct queries at them manually as nodes are added and
> >>>>> removed as PULL replicas. However, it seems that SolrCloud handles this
> >>>>> automatically?
> >>>>>
> >>>>> If I add a new PULL replica node, it goes into state="recovering" while
> >>>>> it pulls the core. As expected. What happens if queries are directed at
> >>>>> this node while in this state? From what I am observing, the query gets
> >>>>> directed to another node?
> >>>>>
> >>>>> If SolrCloud is handling the routing of requests to active nodes, will
> >>>>> it automatically favour PULL replicas for read queries and TLOG
> >>>>> replicas for writes?
> >>>>>
> >>>>> Thanks
> >>>>> Greg
> >>
> >> --
> >> Ere Maijala
> >> Kansalliskirjasto / The National Library of Finland
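P.S. For anyone building the same setup: a sketch of how a load-balancer
target list could be derived from cluster state, keeping only PULL replicas
that are marked active and whose node is live. It assumes the SolrJ 7.x
CloudSolrClient/ZkStateReader API; the ZooKeeper address and collection name
are placeholders.

import java.util.ArrayList;
import java.util.List;
import java.util.Set;

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.cloud.ClusterState;
import org.apache.solr.common.cloud.DocCollection;
import org.apache.solr.common.cloud.Replica;

public class ActivePullReplicas {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper address and collection name.
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost("localhost:2181")
                .build()) {
            client.connect();
            ClusterState state = client.getZkStateReader().getClusterState();
            Set<String> liveNodes = state.getLiveNodes();
            DocCollection coll = state.getCollection("mycollection");

            // Only PULL replicas that are active *and* on a live node should
            // receive load-balancer traffic; a recovering replica would just
            // proxy the query to another replica anyway.
            List<String> targets = new ArrayList<>();
            for (Replica replica : coll.getReplicas()) {
                if (replica.getType() == Replica.Type.PULL
                        && replica.getState() == Replica.State.ACTIVE
                        && liveNodes.contains(replica.getNodeName())) {
                    targets.add(replica.getCoreUrl());
                }
            }
            targets.forEach(System.out::println);
        }
    }
}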