Hi,
----- Original Message ----
> From: Jake Luciani <jak...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Wed, March 9, 2011 8:07:00 PM
> Subject: Re: True master-master fail-over without data gaps (choosing CA in CAP)
>
> Yeah sure. Let me update this on the Solandra wiki. I'll send across the link.

Excellent. You could include ES there, too, if you feel extra adventurous. ;)

> I think you hit the main two shortcomings atm.

- Grandma, why are your eyes so big?
- To see you better.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

> -Jake
>
> On Wed, Mar 9, 2011 at 6:17 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
>
> > Jake,
> >
> > Maybe it's time to come up with the Solandra/Solr matrix so we can see
> > Solandra's strengths (e.g. RT, no replication) and weaknesses (e.g. I think I
> > saw a mention of some big indices?) or missing features (e.g. no delete by
> > query), etc.
> >
> > Thanks!
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> > ----- Original Message ----
> > > From: Jake Luciani <jak...@gmail.com>
> > > To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> > > Sent: Wed, March 9, 2011 6:04:13 PM
> > > Subject: Re: True master-master fail-over without data gaps (choosing CA in CAP)
> > >
> > > Jason,
> > >
> > > Its predecessor, Lucandra, did. But Solandra is a new approach that manages
> > > shards of documents across the cluster for you and uses Solr's distributed
> > > search to query the indexes.
> > >
> > > Jake
> > >
> > > On Mar 9, 2011, at 5:15 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
> > >
> > > > Doesn't Solandra partition by term instead of document?
> > > >
> > > > On Wed, Mar 9, 2011 at 2:13 PM, Smiley, David W. <dsmi...@mitre.org> wrote:
> > > >> I was just about to jump into this conversation to mention Solandra and,
> > > >> go figure, Solandra's committer comes in. :-) It was nice to meet you at
> > > >> Strata, Jake.
> > > >>
> > > >> I haven't dug into the code yet, but Solandra strikes me as a killer way
> > > >> to scale Solr. I'm looking forward to playing with it; particularly
> > > >> looking at disk requirements and performance measurements.
> > > >>
> > > >> ~ David Smiley
> > > >>
> > > >> On Mar 9, 2011, at 3:14 PM, Jake Luciani wrote:
> > > >>
> > > >>> Hi Otis,
> > > >>>
> > > >>> Have you considered using Solandra with Quorum writes
> > > >>> to achieve master/master with CA semantics?
> > > >>>
> > > >>> -Jake
> > > >>>
> > > >>> On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic
> > > >>> <otis_gospodne...@yahoo.com> wrote:
> > > >>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> ----- Original Message ----
> > > >>>>
> > > >>>>> From: Robert Petersen <rober...@buy.com>
> > > >>>>>
> > > >>>>> Can't you skip the SAN and keep the indexes locally? Then you would
> > > >>>>> have two redundant copies of the index and no lock issues.
> > > >>>>
> > > >>>> I could, but then I'd have the issue of keeping them in sync, which
> > > >>>> seems more fragile. I think the SAN makes things simpler overall.
> > > >>>>
> > > >>>>> Also, can't master02 just be a slave to master01 (in the master farm
> > > >>>>> and separate from the slave farm) until such time as master01 fails?
> > > >>>>> Then
> > > >>>>
> > > >>>> No, because it wouldn't be in sync. It would always be N minutes
> > > >>>> behind, and when the primary master fails, the secondary would not
> > > >>>> have all the docs - data loss.
> > > >>>>
> > > >>>>> master02 would start receiving the new documents with an index
> > > >>>>> complete up to the last replication at least, and the other slaves
> > > >>>>> would be directed by the LB to poll master02 also...
> > > >>>>
> > > >>>> Yeah, "complete up to the last replication" is the problem. It's a
> > > >>>> data gap that now needs to be filled somehow.
> > > >>>>
> > > >>>> Otis
> > > >>>> ----
> > > >>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > > >>>> Lucene ecosystem search :: http://search-lucene.com/
> > > >>>>
> > > >>>>> -----Original Message-----
> > > >>>>> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> > > >>>>> Sent: Wednesday, March 09, 2011 9:47 AM
> > > >>>>> To: solr-user@lucene.apache.org
> > > >>>>> Subject: Re: True master-master fail-over without data gaps (choosing CA in CAP)
> > > >>>>>
> > > >>>>> Hi,
> > > >>>>>
> > > >>>>> ----- Original Message ----
> > > >>>>>> From: Walter Underwood <wun...@wunderwood.org>
> > > >>>>>
> > > >>>>>> On Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
> > > >>>>>>
> > > >>>>>>> You mean it's not possible to have 2 masters that are in nearly
> > > >>>>>>> real-time sync? How about with DRBD? I know people use DRBD to keep
> > > >>>>>>> 2 Hadoop NNs (their edit logs) in sync to avoid the current NN SPOF,
> > > >>>>>>> for example, so I'm thinking this could be doable with Solr masters,
> > > >>>>>>> too, no?
> > > >>>>>>
> > > >>>>>> If you add fault-tolerance, you run into the CAP theorem. Consistency,
> > > >>>>>> availability, partition tolerance: choose two. You cannot have it all.
> > > >>>>>
> > > >>>>> Right, so I'll take Consistency and Availability, and I'll put my 2
> > > >>>>> masters in the same rack (which has redundant switches, power supplies,
> > > >>>>> etc.) and thus minimize/avoid partitioning.
> > > >>>>> Assuming the above actually works, I think my Q remains:
> > > >>>>>
> > > >>>>> How do you set up 2 Solr masters so they are in near real-time sync? DRBD?
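For the DRBD route asked about above, a minimal sketch: a synchronously replicated block device under the master's index directory. This assumes DRBD 8.x config syntax, and the hostnames, IPs, and device paths below are all made up:

    # /etc/drbd.conf (hypothetical): replicate the block device that
    # holds the Solr master's index dir to the standby machine.
    resource solr-master-index {
      protocol C;              # synchronous: a write completes only after
                               # it has reached the peer, so no "N minutes
                               # behind" gap on fail-over
      on master01 {
        device    /dev/drbd0;
        disk      /dev/sdb1;   # backing device with the index
        address   10.0.0.1:7788;
        meta-disk internal;
      }
      on master02 {
        device    /dev/drbd0;
        disk      /dev/sdb1;
        address   10.0.0.2:7788;
        meta-disk internal;
      }
    }

Note that in ordinary single-primary mode only one node can have the filesystem mounted at a time, so fail-over still involves promoting master02 and mounting the device there before it can index.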
> > > >>>>>
> > > >>>>> But here is maybe a simpler scenario that more people may be
> > > >>>>> considering:
> > > >>>>>
> > > >>>>> Imagine 2 masters on 2 different servers in 1 rack, pointing to the
> > > >>>>> same index on shared storage (a SAN) that also happens to live in the
> > > >>>>> same rack. The 2 Solr masters are behind 1 LB VIP that the indexer
> > > >>>>> talks to. The VIP is configured so that all requests always get routed
> > > >>>>> to the primary master (because only 1 master can be modifying the
> > > >>>>> index at a time), except when the primary is down, in which case the
> > > >>>>> requests are sent to the secondary master.
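That active/passive VIP behavior is, for example, what HAProxy's backup flag gives you. A minimal sketch; the names, IPs, port, and health-check URL are assumptions that would need adjusting:

    # haproxy.cfg fragment (hypothetical): indexing VIP for the two masters.
    # All update traffic goes to master01; master02 receives traffic only
    # once master01 fails its health check.
    backend solr_masters
        option httpchk GET /solr/admin/ping
        server master01 10.0.0.1:8983 check
        server master02 10.0.0.2:8983 check backup

The point of the health check is that the secondary only ever sees traffic when the primary looks dead to the LB, which is exactly the case where the lock question below gets interesting.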
> > > >>>>>
> > > >>>>> So in this case my Q is around the automation of this, around Lucene
> > > >>>>> index locks, around the need for manual intervention, and such.
> > > >>>>> Concretely, if you have these 2 master instances, the primary master
> > > >>>>> has the Lucene index lock in the index dir. When the secondary master
> > > >>>>> needs to take over (i.e., when it starts receiving documents via the
> > > >>>>> LB), it needs to be able to write to that same index. But what if that
> > > >>>>> lock is still around? One could use the native lock to make the lock
> > > >>>>> disappear if the primary master's JVM exited unexpectedly, and in that
> > > >>>>> case everything *should* work and be completely transparent, right?
> > > >>>>> That is, the secondary will start getting new docs, it will use its
> > > >>>>> IndexWriter to write to that same shared index, which won't be locked
> > > >>>>> for writes because the lock is gone, and everyone will be happy. Did I
> > > >>>>> miss something important here?
> > > >>>>>
> > > >>>>> Assuming the above is correct, what if the lock is *not* gone because
> > > >>>>> the primary master's JVM is actually not dead, although maybe
> > > >>>>> unresponsive, so the LB thinks the primary master is dead? Then the LB
> > > >>>>> will route indexing requests to the secondary master, which will
> > > >>>>> attempt to write to the index but be denied because of the lock. So a
> > > >>>>> human needs to jump in, remove the lock, and manually reindex the
> > > >>>>> failed docs if the upstream component doesn't buffer docs that failed
> > > >>>>> to get indexed and doesn't retry indexing them automatically. Is this
> > > >>>>> correct, or is there a way to avoid humans here?
> > > >>>>>
> > > >>>>> Thanks,
> > > >>>>> Otis
> > > >>>>> ----
> > > >>>>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > > >>>>> Lucene ecosystem search :: http://search-lucene.com/
> > > >>>
> > > >>> --
> > > >>> http://twitter.com/tjake
>
> --
> http://twitter.com/tjake
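For the "native lock" mentioned in the thread: it is a one-line solrconfig.xml setting (Solr 1.4/3.x layout shown) that maps to Lucene's NativeFSLockFactory, whose OS-level lock is released when the holding JVM dies. A sketch of what both masters would carry:

    <!-- solrconfig.xml on both masters: rely on the OS-level (native) lock,
         which goes away with the process, instead of a 'simple' lock file
         that would survive a JVM crash and block the secondary. -->
    <mainIndex>
      <lockType>native</lockType>
      <!-- deliberately NOT setting <unlockOnStartup>true</unlockOnStartup>:
           force-unlocking a shared index risks two live writers -->
    </mainIndex>

One caveat worth stating plainly: whether a native lock taken on master01 is actually honored by master02 depends on the shared filesystem (NFS lock daemons are a classic trouble spot), so this needs testing on the actual SAN setup before trusting it for fail-over.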