I think there are two parts to this question:
* If a node just disappears, you should be fine in terms of data
availability, since Solr in SolrCloud mode replicates each update as it
comes in (before sending the HTTP response). Even if the leader disappears
and never comes back, as long as one replica is alive for that shard of
that collection there should be no data loss. A new leader will be elected
and you can continue adding docs or querying.
* If the node doesn't recover and a new one joins the cluster, Solr
currently won't automatically realize that replicas have disappeared and
recreate them, so you need to take some action. Some good responses about
this issue are in this other thread:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201607.mbox/%3ccap_wmbugkdujin1unb_arvxq9vh3f5x6ybpgu7iqckawv9b...@mail.gmail.com%3E
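
As a rough sketch of what "taking some action" could look like: the
Collections API has an ADDREPLICA action that creates a new replica for a
given shard, optionally on a specific node. The snippet below only builds
the request URL and checks a CLUSTERSTATUS-style response for shards that
are short on live replicas. The helper names, the host, the collection name
"products" and the node name are made up for illustration, and the response
layout is what I would expect from CLUSTERSTATUS, so please verify it
against your Solr version before relying on it.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard, node=None):
    """Build a Collections API ADDREPLICA request URL (hypothetical helper).

    base_url is the Solr root, e.g. "http://localhost:8983/solr".
    node is optional; when omitted, Solr chooses where to place the replica.
    """
    params = {
        "action": "ADDREPLICA",
        "collection": collection,
        "shard": shard,
        "wt": "json",
    }
    if node is not None:
        params["node"] = node
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


def shards_missing_replicas(cluster_status, collection, wanted):
    """Return {shard_name: replicas_to_add} for under-replicated shards.

    Assumes cluster_status is the parsed JSON of a CLUSTERSTATUS call, with
    "live_nodes" and per-shard "replicas" entries carrying "state" and
    "node_name" (an assumption about the response shape).
    """
    cluster = cluster_status["cluster"]
    live_nodes = set(cluster["live_nodes"])
    shards = cluster["collections"][collection]["shards"]
    missing = {}
    for shard_name, shard in shards.items():
        # Count replicas that are active and hosted on a node that is alive.
        alive = sum(
            1
            for replica in shard["replicas"].values()
            if replica.get("state") == "active"
            and replica.get("node_name") in live_nodes
        )
        if alive < wanted:
            missing[shard_name] = wanted - alive
    return missing
```

You would then issue one GET per missing replica against the URL returned
by add_replica_url, for example with curl or urllib.request.urlopen, once
the replacement node has joined the cluster.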

I hope this helps,

Tomás

On Tue, Jul 5, 2016 at 8:55 AM, Steven Bower <sbo...@alcyon.net> wrote:

> You shouldn't "need" to move the storage, as SolrCloud will replicate all
> data to the new node, and anything in the transaction log will already be
> distributed across the rest of the machines.
>
> One option to keep all your data attached to nodes might be to use Amazon
> EFS (pretty new) to store your data. However, I've not seen any good
> performance testing done against it, so I'm not sure how it will scale.
>
> steve
>
> On Tue, Jul 5, 2016 at 11:46 AM Lorenzo Fundaró <
> lorenzo.fund...@dawandamail.com> wrote:
>
> > On 5 July 2016 at 15:55, Shawn Heisey <apa...@elyograg.org> wrote:
> >
> > > On 7/5/2016 1:19 AM, Lorenzo Fundaró wrote:
> > > > Hi Shawn. Actually, what I'm trying to find out is whether this is the
> > > > best approach for deploying Solr in the cloud. I believe SolrCloud
> > > > solves a lot of problems in terms of high availability, but when it
> > > > comes to storage there seems to be a limitation that can be worked
> > > > around, of course, but it's a bit cumbersome, and I was wondering if
> > > > there is a better option for this or if I'm missing something in the
> > > > way I'm doing it. I wonder if there is some proven experience with
> > > > solving the storage problem when deploying in the cloud. Any advice or
> > > > pointer to some enlightening documentation will be appreciated. Thanks.
> > >
> > > When you ask whether "this is the best approach" ... you need to define
> > > what "this" is.  You mention a "storage problem" that needs solving ...
> > > but haven't actually described that problem in a way that I can
> > > understand.
> >
> >
> > So, I'm trying to put SolrCloud on a cloud provider where a node can
> > disappear at any time because of hardware failure. In order to preserve
> > any non-replicated updates, I need to make the storage of that dead node
> > go to the newly spawned node. I am not having a problem with this
> > approach, actually; I just want to know if there is a better way of
> > doing this. I know there is HDFS support that makes all this easier, but
> > this is not an option for me. Thank you, and I apologise for the unclear
> > mails.
> >
> >
> > >
> > > Let's back up and cover some basics:
> > >
> > > What steps are you taking?
> > >
> > > What do you expect (or want) to happen?
> > >
> > > What actually happens?
> > >
> > > The answers to these questions need to be very detailed.
> > >
> > > Thanks,
> > > Shawn
> > >
> > >
> >
> >
> > --
> >
> > --
> > Lorenzo Fundaro
> > Backend Engineer
> > E-Mail: lorenzo.fund...@dawandamail.com
> >
> > Fax       + 49 - (0)30 - 25 76 08 52
> > Tel        + 49 - (0)179 - 51 10 982
> >
> > DaWanda GmbH
> > Windscheidstraße 18
> > 10627 Berlin
> >
> > Geschäftsführer: Claudia Helming und Niels Nüssler
> > AG Charlottenburg HRB 104695 B http://www.dawanda.com
> >
>
