Re: Performance considerations when using distributed indexing + loadbalancing with Solr cloud

Edd Grant Fri, 03 May 2013 07:36:00 -0700

Thanks, that's exactly what I was worried about. If I take your suggested
approach of using CloudSolrServer, and the feeder learns which shard leader
to target, then I lose my ability to index if that shard leader goes down
midway through indexing. Whereas if I make all updates via the HAProxy
instance then I've got HA, but at the cost of performance.
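
The performance side of that trade-off can be pictured with a toy model. The
sketch below is a simulation under stated assumptions, not Solr code: it
assumes the 2-leader, 2-replica layout described further down the thread,
that each document belongs to exactly one shard, and that a node which is not
the leader of that shard forwards the update exactly once. The names `NODES`,
`leader_of`, `extra_hops` and `simulate` are made up for illustration.

```python
import random

# Hypothetical cluster layout: node name -> (shard, role).
NODES = {
    "node1": ("shard1", "leader"),
    "node2": ("shard2", "leader"),
    "node3": ("shard1", "replica"),
    "node4": ("shard2", "replica"),
}
SHARDS = sorted({shard for shard, _ in NODES.values()})


def leader_of(shard):
    """Return the node currently acting as leader for the given shard."""
    for node, (node_shard, role) in NODES.items():
        if node_shard == shard and role == "leader":
            return node
    raise LookupError(f"no leader for {shard}")


def extra_hops(receiving_node, target_shard):
    """Count the forwards needed to get an update from the receiver to the leader."""
    return 0 if receiving_node == leader_of(target_shard) else 1


def simulate(num_docs, leader_aware, seed=0):
    """Return the total number of forwarded updates for num_docs documents."""
    rng = random.Random(seed)
    node_names = sorted(NODES)
    forwarded = 0
    for _ in range(num_docs):
        # Stand-in for hashing the document id to a shard.
        target_shard = rng.choice(SHARDS)
        if leader_aware:
            # The client already knows the leader and sends straight to it.
            receiver = leader_of(target_shard)
        else:
            # The proxy hands the request to any node, regardless of shard.
            receiver = rng.choice(node_names)
        forwarded += extra_hops(receiver, target_shard)
    return forwarded
```

With documents spread evenly, `simulate(10_000, leader_aware=False)` should
land near three quarters of the updates being forwarded, since three of the
four nodes are the wrong place for any given document, while
`simulate(10_000, leader_aware=True)` returns 0. Whether that extra hop
matters in practice is something only measurement on a real cluster will show.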


This has me wondering if it might be feasible to address each shard with a
VIP? Then if the leader of the shard goes down and a replica is elected as
the leader it could also take the VIP, so in essence we'd always be sending
messages to the leader. Anyone tried anything like this?
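
One way to picture the "always talk to the leader" idea, whether done with a
VIP or with a client that re-reads cluster state, is a lookup that is
consulted again after a failure. The sketch below is a toy model only:
`ClusterState`, `elect_new_leader` and `send_update` are hypothetical
stand-ins, no real Solr, ZooKeeper or HAProxy API is used, and leader election
is reduced to "promote the first live node of the shard".

```python
class NodeDown(Exception):
    """Raised when an update is sent to a node that is not running."""


class ClusterState:
    """Toy stand-in for the shard -> leader mapping a VIP or coordinator tracks."""

    def __init__(self, shards):
        # shards: dict of shard name -> list of node names; first entry is the leader.
        self.shards = {name: list(nodes) for name, nodes in shards.items()}
        self.down = set()

    def leader(self, shard):
        return self.shards[shard][0]

    def kill(self, node):
        self.down.add(node)

    def elect_new_leader(self, shard):
        """Promote the first live node of the shard; fail if none is left."""
        live = [n for n in self.shards[shard] if n not in self.down]
        if not live:
            raise NodeDown(f"no live nodes left in {shard}")
        dead = [n for n in self.shards[shard] if n in self.down]
        self.shards[shard] = live + dead
        return live[0]

    def deliver(self, node, doc):
        if node in self.down:
            raise NodeDown(node)
        return node


def send_update(state, shard, doc, max_attempts=2):
    """Send doc to the shard leader, re-resolving the leader after a failure."""
    for _ in range(max_attempts):
        target = state.leader(shard)
        try:
            return state.deliver(target, doc)
        except NodeDown:
            # In a real cluster the election happens server-side, not in the client.
            state.elect_new_leader(shard)
    raise NodeDown(f"could not index into {shard}")
```

The feeder keeps addressing `shard1` rather than a specific host, so in this
model a leader change costs one retried request and nothing more.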

Cheers,

Edd


On 3 May 2013 15:22, Furkan KAMACI <furkankam...@gmail.com> wrote:

> If you index them with CloudSolrServer, your client will learn from
> ZooKeeper where the data should go and send it to that shard leader.
> However, if you use some other process, the data may land on any of the
> nodes and will then be routed to the right place within the cluster. This
> extra routing step may cause unnecessary network traffic and add latency
> at indexing time.
>
> 2013/5/3 Edd Grant <e...@eddgrant.com>
>
> > Hi,
> >
> > No we're actually POSTing them over plain old http. Our "feeder" process
> > simply points at the HAProxy box and posts merrily away.
> >
> > Cheers,
> >
> > Edd
> >
> >
> > On 3 May 2013 13:17, Furkan KAMACI <furkankam...@gmail.com> wrote:
> >
> > > Do you use CloudSolrServer when you push documents into SolrCloud to be
> > > indexed?
> > >
> > > 2013/5/3 Edd Grant <e...@eddgrant.com>
> > >
> > > > Hi all,
> > > >
> > > > I have been playing with Solr Cloud recently and am enjoying the
> > > > distributed indexing capability.
> > > >
> > > > At the moment my SolrCloud consists of 2 leaders and 2 replicas which
> > > > are fronted by an HAProxy instance. I want to maximise performance for
> > > > indexing and it occurred to me that the model I use for loadbalancing
> > > > my indexing requests may impact performance. i.e. am I likely to see
> > > > better indexing performance if I stick certain groups of requests to
> > > > certain nodes vs simply using a round robin approach?
> > > >
> > > > I'll be doing some empirical testing to try and figure this out but
> > > > was wondering if there's any general guidance here? Or if anyone has
> > > > any experience of particularly good/bad configurations?
> > > >
> > > > Many thanks,
> > > >
> > > > Edd
> > > >
> > > > --
> > > > Web: http://www.eddgrant.com
> > > > Email: e...@eddgrant.com
> > > > Mobile: +44 (0) 7861 394 543
> > > >



-- 
Web: http://www.eddgrant.com
Email: e...@eddgrant.com
Mobile: +44 (0) 7861 394 543
