@Tomas and @Steven, I am a bit skeptical about these two statements:

> If a node just disappears you should be fine in terms of data
> availability, since Solr in "SolrCloud" replicates the data as it comes in
> (before sending the http response)

and

> You shouldn't "need" to move the storage as SolrCloud will replicate all
> data to the new node and anything in the transaction log will already be
> distributed through the rest of the machines.

because, according to the official documentation here
<https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance>
(Write Side Fault Tolerance -> Recovery):

> If a leader goes down, it may have sent requests to some replicas and not
> others. So when a new potential leader is identified, it runs a synch
> process against the other replicas. If this is successful, everything
> should be consistent, the leader registers as active, and normal actions
> proceed

I think there is a possibility that an update is not sent by the leader but
is kept on its local disk, and only after the leader comes up again can it
sync the unsent data. Furthermore:

> Achieved Replication Factor
>
> When using a replication factor greater than one, an update request may
> succeed on the shard leader but fail on one or more of the replicas. For
> instance, consider a collection with one shard and replication factor of
> three. In this case, you have a shard leader and two additional replicas.
> If an update request succeeds on the leader but fails on both replicas, for
> whatever reason, the update request is still considered successful from the
> perspective of the client. The replicas that missed the update will sync
> with the leader when they recover.

They have implemented the parameter *min_rf* that you can use (client-side)
to make sure that your update was replicated to at least one replica
(e.g. min_rf > 1; see the rough SolrJ sketch in the P.S. below). This is why
I am concerned about moving the storage around: because then I know that,
when the shard leader comes back, SolrCloud will run the sync process for
those documents that couldn't be sent to the replicas.

Am I missing something, or have I misunderstood the documentation?

Cheers !
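P.S. Just to make my reading of min_rf concrete, below is roughly how I
would do the client-side check with SolrJ. It is an untested sketch written
from memory: the ZooKeeper hosts, collection and field names are
placeholders, and I am assuming the achieved replication factor comes back
as "rf" in the response header (I believe CloudSolrClient also has a helper
for reading it). Please correct me if any of the names are off.

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.request.UpdateRequest;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.common.SolrInputDocument;

public class MinRfSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder ZooKeeper ensemble and collection name.
    CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181");
    client.setDefaultCollection("mycollection");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1");
    doc.addField("title_s", "hello");

    UpdateRequest req = new UpdateRequest();
    req.add(doc);
    // Ask Solr to report the achieved replication factor. min_rf does not
    // make the update fail; the update is still accepted even if only the
    // leader got it. The client has to look at the reported "rf" itself.
    req.setParam("min_rf", "2");

    UpdateResponse rsp = req.process(client);

    // Assumption: the achieved factor is reported as "rf" in the header.
    // rf counts the leader, so rf >= 2 means leader plus at least one replica.
    Integer achievedRf = (Integer) rsp.getResponseHeader().get("rf");
    if (achievedRf == null || achievedRf < 2) {
      // Fewer copies than requested: queue the document for a later retry
      // or re-index once the replicas have recovered.
      System.err.println("update only reached rf=" + achievedRf);
    }

    client.close();
  }
}

As far as I understand, this is purely informational: Solr never rolls the
update back, so it is up to the client to retry or re-send documents that
did not reach enough replicas.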
If a node just disappears you should be fine in terms of data > availability, since Solr in "SolrCloud" replicates the data as it comes it > (before sending the http response) and > > You shouldn't "need" to move the storage as SolrCloud will replicate all > data to the new node and anything in the transaction log will already be > distributed through the rest of the machines.. because according to the official documentation here <https://cwiki.apache.org/confluence/display/solr/Read+and+Write+Side+Fault+Tolerance>: (Write side fault tolerant -> recovery) If a leader goes down, it may have sent requests to some replicas and not > others. So when a new potential leader is identified, it runs a synch > process against the other replicas. If this is successful, everything > should be consistent, the leader registers as active, and normal actions > proceed I think there is a possibility that an update is not sent by the leader but is kept in the local disk and after it comes up again it can sync the non-sent data. Furthermore: Achieved Replication Factor > When using a replication factor greater than one, an update request may > succeed on the shard leader but fail on one or more of the replicas. For > instance, consider a collection with one shard and replication factor of > three. In this case, you have a shard leader and two additional replicas. > If an update request succeeds on the leader but fails on both replicas, for > whatever reason, the update request is still considered successful from the > perspective of the client. The replicas that missed the update will sync > with the leader when they recover. They have implemented this parameter called *min_rf* that you can use (client-side) to make sure that your update was replicated to at least one replica (e.g.: min_rf > 1). This is why my concern about moving storage around, because then I know when the shard leader comes back, solrcloud will run sync process for those documents that couldn't be sent to the replicas. Am I missing something or misunderstood the documentation ? Cheers ! On 5 July 2016 at 19:49, Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov> wrote: > Lorenzo, this probably comes late, but my systems guys just don't want to > give me real disk. Although RAID-5 or LVM on-top of JBOD may be better > than Amazon EBS, Amazon EBS is still much closer to real disk in terms of > IOPS and latency than NFS ;) I even ran a mini test (not an official > benchmark), and found the response time for random reads to be better. > > If you are a young/smallish company, this may be all in the cloud, but if > you are in a large organization like mine, you may also need to allow for > other architectures, such as a "virtual" Netapp in the cloud that > communicates with a physical Netapp on-premises, and the throughput/latency > of that. The most important thing is to actually measure the numbers you > are getting, both for search and for simply raw I/O, or to get your > systems/storage guys to measure those numbers. 
On 5 July 2016 at 19:49, Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov>
wrote:

> Lorenzo, this probably comes late, but my systems guys just don't want to
> give me real disk. Although RAID-5 or LVM on top of JBOD may be better
> than Amazon EBS, Amazon EBS is still much closer to real disk in terms of
> IOPS and latency than NFS ;) I even ran a mini test (not an official
> benchmark), and found the response time for random reads to be better.
>
> If you are a young/smallish company, this may be all in the cloud, but if
> you are in a large organization like mine, you may also need to allow for
> other architectures, such as a "virtual" NetApp in the cloud that
> communicates with a physical NetApp on-premises, and the throughput/latency
> of that. The most important thing is to actually measure the numbers you
> are getting, both for search and for simply raw I/O, or to get your
> systems/storage guys to measure those numbers.
>
> If you get your systems/storage guys to just measure storage, you will
> want to care about three things for indexing primarily:
>
>   Sequential Write Throughput
>   Random Read Throughput
>   Random Read Response Time/Latency
>
> Hope this helps,
>
> Dan Davis, Systems/Applications Architect (Contractor),
> Office of Computer and Communications Systems,
> National Library of Medicine, NIH
>
>
> -----Original Message-----
> From: Lorenzo Fundaró [mailto:lorenzo.fund...@dawandamail.com]
> Sent: Tuesday, July 05, 2016 3:20 AM
> To: solr-user@lucene.apache.org
> Subject: Re: deploy solr on cloud providers
>
> Hi Shawn. Actually what I'm trying to find out is whether this is the best
> approach for deploying Solr in the cloud. I believe SolrCloud solves a lot
> of problems in terms of high availability, but when it comes to storage
> there seems to be a limitation that can be worked around, of course, but it
> is a bit cumbersome, and I was wondering if there is a better option or if
> I'm missing something with the way I'm doing it. I wonder if there is some
> proven experience about how to solve the storage problem when deploying in
> the cloud. Any advice or pointer to some enlightening documentation will be
> appreciated. Thanks.
>
> On Jul 4, 2016 18:27, "Shawn Heisey" <apa...@elyograg.org> wrote:
>
> > On 7/4/2016 10:18 AM, Lorenzo Fundaró wrote:
> > > when deploying solr (in solrcloud mode) in the cloud one has to take
> > > care of storage, and as far as I understand it can be a problem
> > > because the storage should go wherever the node is created. If we
> > > have, for example, a node on EC2 with its own persistent disk, and this
> > > node happens to be the leader and at some point crashes but couldn't
> > > replicate the data that it has in the transaction log, what do we do
> > > in that case? Ideally the new node should use the leftover data that
> > > the dead node left, but this is a bit cumbersome in my opinion. What
> > > are the best practices for this?
> >
> > I can't make any sense of this. What is the *exact* problem you need
> > to solve? The details can be very important.
> >
> > We might be dealing with this:
> >
> > http://people.apache.org/~hossman/#xyproblem
> >
> > Thanks,
> > Shawn
> >

--
Lorenzo Fundaro
Backend Engineer
E-Mail: lorenzo.fund...@dawandamail.com
Fax: +49 (0)30 25 76 08 52
Tel: +49 (0)179 51 10 982

DaWanda GmbH
Windscheidstraße 18
10627 Berlin

Managing Directors: Claudia Helming and Niels Nüssler
AG Charlottenburg HRB 104695 B
http://www.dawanda.com