re: SolrCloud backup/restore: https://issues.apache.org/jira/browse/SOLR-5750
not committed yet, but getting attention.

On Thu, Jan 14, 2016 at 6:19 AM, Gian Maria Ricci - aka Alkampfer <alkamp...@nablasoft.com> wrote:

> Actually there are situations where a restore is needed: suppose that someone
> makes a mistake and deletes all documents from a collection, or maybe deletes
> a series of documents, etc. I know that this is not likely to happen, but in
> a mission-critical enterprise system we always need a detailed procedure for
> disaster recovery.
>
> For such a scenario we need to plan for the worst case, where everything is lost.
>
> With Master Slave it is just a matter of recreating the machines, reconfiguring
> the core, and restoring a backup, and the game is done; with SolrCloud it is not
> really clear to me how I can back up / restore data. From what I've found on
> the internet, I need to back up every shard of the collection and, if we need
> to restore everything from a backup, we can recreate the collection and then
> restore all the individual shards. I do not know if this is a supported
> scenario / procedure, but theoretically it could work.
>
> --
> Gian Maria Ricci
> Cell: +39 320 0136949
>
> -----Original Message-----
> From: Alessandro Benedetti [mailto:abenede...@apache.org]
> Sent: Thursday, 14 January 2016 10:46
> To: solr-user@lucene.apache.org
> Subject: Re: Pro and cons of using Solr Cloud vs standard Master Slave Replica
>
> It's true that SolrCloud is adding some complexity.
> But a few observations:
>
>> SolrCloud has some disadvantages and can't beat the easiness and simpleness
>> of Master Slave Replica. So I can only encourage to keep Master Slave
>> Replica in future versions.
>
> I agree, there can be situations where you have really simple and not
> critical systems.
> Anyway, old-style replication is still used in SolrCloud, so I think it is
> going to stay for a while (until it is replaced with something else).
>
> To answer Gian:
>
>> One of the problems I've found is that I've not found a simple way to back up
>> the content of a collection to restore in a disaster recovery situation.
>> With a simple master / slave scenario we can use the replication handler
>> to generate backups that can be easily used to restore the content of a
>> core, while with SolrCloud it is not clear how we can obtain a full
>> backup
>
> To be fair, disaster recovery is when SolrCloud shines.
> If you lose random nodes across your collection, you simply need to fix them
> and spin them up again.
> The system will automatically restore the content to the last version
> available (the tlog first, then the leader if the tlog is not enough, will
> help the dead node to catch up).
> If you lose all the replicas for a shard and you lose the on-disk content of
> all these replicas (index and tlog), SolrCloud can't help you.
> For these unlikely scenarios a backup is suggested.
> You could anyway restore the backup to only one node, and the replicas are
> going to catch up.
>
>> Probably it is just a matter of backing up every shard with the standard
>> replication handler and then restoring each shard after recreating the
>> collection
>
> Definitely not, SolrCloud is there to avoid this manual stuff.
>
> Cheers
>
> On 14 January 2016 at 08:58, Gian Maria Ricci - aka Alkampfer <
> alkamp...@nablasoft.com> wrote:
>
>> I agree that SolrCloud has not only advantages; I really understand
>> that it offers many more features, but it introduces some complexity.
>>
>> One of the problems I've found is that I've not found a simple way to
>> back up the content of a collection to restore in a disaster recovery
>> situation. With a simple master / slave scenario we can use the
>> replication handler to generate backups that can be easily used to
>> restore the content of a core, while with SolrCloud it is not clear how
>> we can obtain a full backup.
>> Probably it is just a matter of backing up every shard with the standard
>> replication handler and then restoring each shard after recreating the
>> collection, but I've not found (probably I need to search better)
>> official documentation on backup / restore procedures for SolrCloud.
>>
>> Thanks.
>>
>> --
>> Gian Maria Ricci
>> Cell: +39 320 0136949
>>
>> -----Original Message-----
>> From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de]
>> Sent: Thursday, 14 January 2016 08:22
>> To: solr-user@lucene.apache.org
>> Subject: Re: Pro and cons of using Solr Cloud vs standard Master Slave
>> Replica
>>
>> SolrCloud has some disadvantages and can't beat the easiness and
>> simpleness of Master Slave Replica. So I can only encourage to keep
>> Master Slave Replica in future versions.
>>
>> Bernd
>>
>> On 13.01.2016 at 21:57, Jack Krupansky wrote:
>> > The "Legacy Scaling and Distribution" section of the Solr Reference
>> > Guide also gives info related to so-called master-slave mode:
>> > https://cwiki.apache.org/confluence/display/solr/Legacy+Scaling+and+Distribution
>> >
>> > Also, although the old master-slave mode is still technically
>> > supported in the sense that the code and doc are still there, you
>> > won't be able to get the level of community support here on the
>> > mailing list as you can get for SolrCloud.
>> >
>> > Unless you're simply trying to decide whether to leave an old legacy
>> > system as-is with the old distributed mode, nobody should be
>> > considering a fresh new distributed Solr deployment with anything
>> > other than SolrCloud.
>> >
>> > (Hmmm... have any of the committers considered deprecating the old
>> > non-SolrCloud distributed mode features?)
>>
>> -1
>>
>> >
>> > -- Jack Krupansky
>> >
>> > On Wed, Jan 13, 2016 at 9:02 AM, Shivaji Dutta
>> > <sdu...@hortonworks.com>
>> > wrote:
>> >
>> >> - SolrCloud uses zookeeper to manage HA
>> >> - Zookeeper is a standard for all HA in Apache Hadoop
>> >> - You have collections which will manage your shards across nodes
>> >> - SolrJ Client is now fault tolerant with CloudSolrClient
>> >>
>> >> This is the direction the product will go in the future.
>> >>
>> >> On 1/13/16, 5:58 AM, "Gian Maria Ricci - aka Alkampfer"
>> >> <alkamp...@nablasoft.com> wrote:
>> >>
>> >>> Thanks.
>> >>>
>> >>> --
>> >>> Gian Maria Ricci
>> >>> Cell: +39 320 0136949
>> >>>
>> >>> -----Original Message-----
>> >>> From: Shawn Heisey [mailto:apa...@elyograg.org]
>> >>> Sent: Monday, 11 January 2016 18:28
>> >>> To: solr-user@lucene.apache.org
>> >>> Subject: Re: Pro and cons of using Solr Cloud vs standard Master
>> >>> Slave Replica
>> >>>
>> >>> On 1/11/2016 4:28 AM, Gian Maria Ricci - aka Alkampfer wrote:
>> >>>> a customer needs a comprehensive list of all pros and cons of using
>> >>>> standard Master Slave replica vs. using Solr Cloud. I'm interested
>> >>>> especially in query performance considerations, because in this
>> >>>> specific situation the rate of new documents is really slow, but
>> >>>> the amount of data is about 50 million documents, and the
>> >>>> index size on disk for a single core is about 30 GB.
>> >>>
>> >>> The primary advantage to SolrCloud is that SolrCloud handles most
>> >>> of the administrative and operational details for you automatically.
>> >>>
>> >>> SolrCloud is a little more complicated to set up initially,
>> >>> because you must worry about Zookeeper as well as Solr, but once
>> >>> it's properly set up, there is no single point of failure.
>> >>>
>> >>>> Such amount of data should be easily handled by a Master Slave
>> >>>> replica with a single core replicated on a certain number of
>> >>>> slaves, but we need to evaluate also the option of SolrCloud,
>> >>>> especially for fault tolerance.
>> >>>>
>> >>>
>> >>> Once you're beyond initial setup, fault tolerance with SolrCloud is
>> >>> much easier than master/slave replication. Switching a slave to a
>> >>> master is possible, but the procedure is somewhat complicated.
>> >>> SolrCloud does not *have* masters, it is a true cluster.
>> >>>
>> >>> With master/slave replication, the master handles all indexing,
>> >>> and the finished index segments are copied to the slaves via HTTP,
>> >>> and the slaves simply need to open them. SolrCloud does indexing
>> >>> on all shard replicas, nearly simultaneously. Usually this is an
>> >>> advantage, not a disadvantage, but in heavy indexing situations
>> >>> master/slave replication *might* show better performance on the slaves.
>> >>>
>> >>> Thanks,
>> >>> Shawn
>> >>>
>> >>
>> >
>>
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
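
For anyone who wants to try the per-shard approach discussed in this thread (back up every shard with the standard replication handler, recreate the collection, then restore each shard), here is a rough sketch. It is not an officially documented SolrCloud procedure, and it only builds the replication handler URLs without sending any request. The host names, core names, backup location and snapshot name are hypothetical. The `command=backup` and `command=restore` parameters with `location` and `name` are assumed to match the replication handler of Solr 5.2 or later, so check them against the Reference Guide for your version.

```python
from urllib.parse import urlencode


def replication_url(base_url, core, command, **params):
    """Build a replication-handler URL for one core. No request is sent."""
    query = urlencode({"command": command, **params, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"


def plan_backup(shard_cores, location, name):
    """Return one backup URL per shard, keyed by shard name."""
    return {
        shard: replication_url(base, core, "backup", location=location, name=name)
        for shard, (base, core) in shard_cores.items()
    }


def plan_restore(shard_cores, location, name):
    """Return one restore URL per shard, keyed by shard name."""
    return {
        shard: replication_url(base, core, "restore", location=location, name=name)
        for shard, (base, core) in shard_cores.items()
    }


# Hypothetical layout (assumption): one core per shard, e.g. the shard leader.
SHARD_CORES = {
    "shard1": ("http://solr1:8983/solr", "mycoll_shard1_replica1"),
    "shard2": ("http://solr2:8983/solr", "mycoll_shard2_replica1"),
}

if __name__ == "__main__":
    # Print the URLs so they can be reviewed before anything is executed.
    for shard, url in plan_backup(SHARD_CORES, "/backups/solr", "nightly").items():
        print("backup ", shard, url)
    for shard, url in plan_restore(SHARD_CORES, "/backups/solr", "nightly").items():
        print("restore", shard, url)
```

Only one core per shard is listed because, as noted above, restoring the backup to a single node should be enough for the other replicas to catch up.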