Okay, I'll see what I can do. Also, for what it's worth, if anyone is in London tomorrow, I'm giving a presentation that covers this topic at the (free) Online Information 2010 exhibition at Kensington Olympia, at 3:20pm. Anyone interested is welcome to come along. I believe we're hoping to video it, so if that works out, I expect it'll get put online somewhere.

Upayavira

On Wed, 01 Dec 2010 03:44 +0000, "Jayant Das" <jayan...@hotmail.com> wrote:
>
> Hi, A diagram will be very much appreciated.
> Thanks,
> Jayant
>
> > From: u...@odoko.co.uk
> > To: solr-user@lucene.apache.org
> > Subject: Re: distributed architecture
> > Date: Wed, 1 Dec 2010 00:39:40 +0000
> >
> > I cannot say how mature the code for B) is, but it is not yet included
> > in a release.
> >
> > If you want the ability to distribute content across multiple nodes (due
> > to volume) and want resilience, then use both.
> >
> > I've had one setup where we have two master servers, each with four
> > cores. Then we have two pairs of slaves. Each pair mirrors the masters,
> > so we have two hosts covering each of our cores.
> >
> > Then comes the complicated bit to explain...
> >
> > Each of these four slave hosts had a core that was configured with a
> > hardwired "shards" request parameter, which pointed to each of our
> > shards. Actually, it pointed to VIPs on a load balancer. Those two VIPs
> > then balanced across each of our pair of hosts.
> >
> > Then, put all four of these servers behind another VIP, and we had a
> > single address we could push requests to, for sharded, and resilient
> > search.
> >
> > Now if that doesn't make any sense, let me know and I'll have another go
> > at explaining it (or even attempt a diagram).
> >
> > Upayavira
> >
> > On Tue, 30 Nov 2010 13:27 -0800, "Cinquini, Luca (3880)"
> > <luca.cinqu...@jpl.nasa.gov> wrote:
> > > Hi,
> > > I'd like to know if anybody has suggestions/opinions on what is currently
> > > the best architecture for a distributed search system using Solr. The use
> > > case is that of a system composed
> > > of N indexes, each hosted on a separate machine, each index containing
> > > unique content.
> > >
> > > Options that I know of are:
> > >
> > > A) Using Solr distributed search
> > > B) Using Solr + Zookeeper integration
> > > C) Using replication, i.e. each node replicates all the others
> > >
> > > It seems like options A) and B) would suffer from a fault-tolerance
> > > standpoint: if any of the nodes goes down, the search won't -at this
> > > time- return partial results, but instead report an exception.
> > > Option C) would provide fault tolerance, at least for any search
> > > initiated at a node that is available, but would incur into a large
> > > replication overhead.
> > >
> > > Did I get any of the above wrong, or does somebody have some insight on
> > > what is the best system architecture for this use case ?
> > >
> > > thanks in advance,
> > > Luca
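
For anyone trying to picture the "shards" part of the setup quoted above, here is a rough sketch of what such a request could look like. It is only an illustration under assumptions: the hostnames, port and paths are made up, and the function name is hypothetical. It also passes "shards" on the request URL, whereas the setup described in the thread had it hardwired in the core's configuration. The only Solr feature relied on is the "shards" request parameter, which takes a comma-separated list of shard addresses without the http:// scheme.

```python
from urllib.parse import urlencode

# Hypothetical load-balancer VIPs, one per shard. In the setup described in
# the thread, each shard VIP balances across a pair of slave hosts.
SHARD_VIPS = [
    "shard1-vip.example.com:8983/solr",
    "shard2-vip.example.com:8983/solr",
]


def build_sharded_query_url(front_vip, query, shard_vips=SHARD_VIPS):
    """Build a Solr select URL that fans the query out over the given shards.

    front_vip is the (hypothetical) single address sitting in front of all
    the slave hosts; whichever host receives the request acts as aggregator.
    """
    params = {
        "q": query,
        # Comma-separated shard addresses, no scheme prefix.
        "shards": ",".join(shard_vips),
    }
    return "http://%s/select?%s" % (front_vip, urlencode(params))


url = build_sharded_query_url("search-vip.example.com:8983/solr", "title:solr")
print(url)
```

Since every shard address is a VIP rather than a single host, losing one slave of a pair should not break the fan-out, which is the resilience point made in the thread.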