re: SolrCloud backup/restore: https://issues.apache.org/jira/browse/SOLR-5750

not committed yet, but getting attention.
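
Until that lands, here is a rough sketch of the per-shard approach Gian describes below: one ReplicationHandler backup request per shard, plus a matching restore request. The host names, core names, backup location, and snapshot names are made up for illustration. The restore command assumes a Solr version whose ReplicationHandler supports it (5.2 or later, as far as I know). The sketch only builds the URLs and sends nothing, and it is not an officially documented SolrCloud procedure.

```python
from urllib.parse import urlencode


def replication_url(base_url, core, command, **params):
    """Build a ReplicationHandler URL for one core (no request is sent)."""
    query = urlencode({"command": command, "wt": "json", **params})
    return f"{base_url.rstrip('/')}/{core}/replication?{query}"


def backup_urls(shard_cores, location, name):
    """Return one backup URL per shard.

    shard_cores maps a shard name to (base_url, core_name) of the replica
    you want to snapshot, typically the shard leader.
    """
    return {
        shard: replication_url(base, core, "backup",
                               location=location, name=f"{name}_{shard}")
        for shard, (base, core) in sorted(shard_cores.items())
    }


# Hypothetical two-shard collection; replace with your own hosts and cores.
cores = {
    "shard1": ("http://host1:8983/solr", "mycoll_shard1_replica1"),
    "shard2": ("http://host2:8983/solr", "mycoll_shard2_replica1"),
}

urls = backup_urls(cores, "/backups/solr", "nightly")
for shard, url in urls.items():
    print(shard, url)

# Matching restore request for shard1 (assumes restore support, see above).
restore_shard1 = replication_url(*cores["shard1"], "restore",
                                 name="nightly_shard1")
print(restore_shard1)
```

Each URL would be requested against the node hosting that core, for example with curl. Whether restoring shard by shard into a recreated collection is safe for a given setup should be verified on a test cluster first.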



On Thu, Jan 14, 2016 at 6:19 AM, Gian Maria Ricci - aka Alkampfer
<alkamp...@nablasoft.com> wrote:
> Actually there are situations where a restore is needed. Suppose someone
> makes a mistake and deletes all documents from a collection, or deletes
> a series of documents, etc. I know this is not likely to happen, but in
> mission-critical enterprise systems we always need a detailed procedure for
> disaster recovery.
>
> For such a scenario we need to plan for the worst case, where everything is lost.
>
> With Master Slave it is just a matter of recreating the machines, reconfiguring
> the core, and restoring a backup, and the job is done. With SolrCloud it is not
> really clear to me how I can back up / restore data. From what I've found on
> the internet, I need to back up every shard of the collection and, if we need
> to restore everything from a backup, we can recreate the collection and then
> restore all the individual shards. I do not know if this is a supported
> scenario / procedure, but theoretically it could work.
>
> --
> Gian Maria Ricci
> Cell: +39 320 0136949
>
>
>
> -----Original Message-----
> From: Alessandro Benedetti [mailto:abenede...@apache.org]
> Sent: Thursday, 14 January 2016 10:46
> To: solr-user@lucene.apache.org
> Subject: Re: Pro and cons of using Solr Cloud vs standard Master Slave Replica
>
> It's true that SolrCloud is adding some complexity.
> But a few observations:
>
>> SolrCloud has some disadvantages and can't beat the easiness and
>> simpleness of Master Slave Replica. So I can only encourage to keep
>> Master Slave Replica in future versions.
>
>
> I agree, there can be situations where you have really simple and
> non-critical systems.
> Anyway, old-style replication is still used in SolrCloud, so I think it is
> going to stay for a while (until it is replaced with something else).
>
> To answer Gian:
>
>> One of the problems I've found is that I've not found a simple way to
>> back up the content of a collection to restore in a disaster recovery
>> situation. With a simple master / slave scenario we can use the
>> replication handler to generate backups that can easily be used to
>> restore the content of a core, while with SolrCloud it is not clear how
>> we can obtain a full backup.
>
>
> To be fair, disaster recovery is where SolrCloud shines.
> If you lose random nodes across your collection, you simply need to fix them
> and spin them up again.
> The system will automatically restore the content to the last version
> available (the tlog first, and the leader if the tlog is not enough, will
> help the dead node catch up).
> If you lose all the replicas for a shard and you lose the content on disk of
> all these replicas (index and tlog), SolrCloud can't help you.
> For these unlikely scenarios a backup is suggested.
> You could restore the backup to only one node anyway, and the replicas are
> going to catch up.
>
>> Probably it is just a matter of backing up every shard with the standard
>> replication handler and then restoring each shard after recreating the
>> collection
>
>
> Definitely not, SolrCloud is there to avoid this manual stuff.
>
> Cheers
>
>
> On 14 January 2016 at 08:58, Gian Maria Ricci - aka Alkampfer < 
> alkamp...@nablasoft.com> wrote:
>
>> I agree that SolrCloud has not only advantages. I really understand
>> that it offers many more features, but it introduces some complexity.
>>
>> One of the problems I've found is that I've not found a simple way to
>> back up the content of a collection to restore in a disaster recovery
>> situation. With a simple master / slave scenario we can use the
>> replication handler to generate backups that can easily be used to
>> restore the content of a core, while with SolrCloud it is not clear how
>> we can obtain a full backup.
>> Probably it is just a matter of backing up every shard with the standard
>> replication handler and then restoring each shard after recreating the
>> collection, but I've not found (probably I need to search better)
>> official documentation on backup / restore procedures for SolrCloud.
>>
>> Thanks.
>>
>> --
>> Gian Maria Ricci
>> Cell: +39 320 0136949
>>
>>
>> -----Original Message-----
>> From: Bernd Fehling [mailto:bernd.fehl...@uni-bielefeld.de]
>> Sent: Thursday, 14 January 2016 08:22
>> To: solr-user@lucene.apache.org
>> Subject: Re: Pro and cons of using Solr Cloud vs standard Master Slave
>> Replica
>>
>> SolrCloud has some disadvantages and can't beat the easiness and
>> simpleness of Master Slave Replica. So I can only encourage to keep
>> Master Slave Replica in future versions.
>>
>> Bernd
>>
>> On 13.01.2016 at 21:57, Jack Krupansky wrote:
>> > The "Legacy Scaling and Distribution" section of the Solr Reference
>> > Guide also gives info related to so-called master-slave mode:
>> > https://cwiki.apache.org/confluence/display/solr/Legacy+Scaling+and+Distribution
>> >
>> > Also, although the old master-slave mode is still technically
>> > supported in the sense that the code and doc are still there, you
>> > won't be able to get the level of community support here on the
>> > mailing list that you can get for SolrCloud.
>> >
>> > Unless you're simply trying to decide whether to leave an old legacy
>> > system as-is with the old distributed mode, nobody should be
>> > considering a fresh new distributed Solr deployment with anything
>> > other than SolrCloud.
>> >
>> > (Hmmm... have any of the committers considered deprecating the old
>> > non-SolrCloud distributed mode features?)
>>
>> -1
>>
>> >
>> > -- Jack Krupansky
>> >
>> > On Wed, Jan 13, 2016 at 9:02 AM, Shivaji Dutta
>> > <sdu...@hortonworks.com> wrote:
>> >
>> >> - SolrCloud uses zookeeper to manage HA
>> >>         - Zookeeper is a standard for all HA in Apache Hadoop
>> >> - You have collections which will manage your shards across nodes
>> >> - SolrJ Client is now fault tolerant with CloudSolrClient
>> >>
>> >> This is the future direction of the product.
>> >>
>> >>
>> >>
>> >> On 1/13/16, 5:58 AM, "Gian Maria Ricci - aka Alkampfer"
>> >> <alkamp...@nablasoft.com> wrote:
>> >>
>> >>> Thanks.
>> >>>
>> >>> --
>> >>> Gian Maria Ricci
>> >>> Cell: +39 320 0136949
>> >>>
>> >>>
>> >>>
>> >>> -----Original Message-----
>> >>> From: Shawn Heisey [mailto:apa...@elyograg.org]
>> >>> Sent: Monday, 11 January 2016 18:28
>> >>> To: solr-user@lucene.apache.org
>> >>> Subject: Re: Pro and cons of using Solr Cloud vs standard Master
>> >>> Slave Replica
>> >>>
>> >>> On 1/11/2016 4:28 AM, Gian Maria Ricci - aka Alkampfer wrote:
>> >>>> A customer needs a comprehensive list of all pros and cons of using
>> >>>> standard Master Slave replica vs. using Solr Cloud. I'm interested
>> >>>> especially in query performance considerations, because in this
>> >>>> specific situation the rate of new documents is really slow, but
>> >>>> the amount of data is about 50 million documents, and the
>> >>>> index size on disk for a single core is about 30 GB.
>> >>>
>> >>> The primary advantage to SolrCloud is that SolrCloud handles most
>> >>> of the administrative and operational details for you automatically.
>> >>>
>> >>> SolrCloud is a little more complicated to set up initially,
>> >>> because you must worry about Zookeeper as well as Solr, but once
>> >>> it's properly set up, there is no single point of failure.
>> >>>
>> >>>> Such an amount of data should be easily handled by a Master Slave
>> >>>> replica with a single core replicated on a certain number of
>> >>>> slaves, but we also need to evaluate the option of SolrCloud,
>> >>>> especially for fault tolerance.
>> >>>>
>> >>>
>> >>> Once you're beyond initial setup, fault tolerance with SolrCloud is
>> >>> much easier than master/slave replication.  Switching a slave to a
>> >>> master is possible, but the procedure is somewhat complicated.
>> >>> SolrCloud does not *have* masters, it is a true cluster.
>> >>>
>> >>> With master/slave replication, the master handles all indexing,
>> >>> and the finished index segments are copied to the slaves via HTTP,
>> >>> and the slaves simply need to open them.  SolrCloud does indexing
>> >>> on all shard replicas, nearly simultaneously.  Usually this is an
>> >>> advantage, not a disadvantage, but in heavy indexing situations
>> >>> master/slave replication
>> >>> *might* show better performance on the slaves.
>> >>>
>> >>> Thanks,
>> >>> Shawn
>> >>>
>> >>>
>> >>
>> >>
>> >
>>
>>
>
>
> --
> --------------------------
>
> Benedetti Alessandro
> Visiting card : http://about.me/alessandro_benedetti
>
> "Tyger, tyger burning bright
> In the forests of the night,
> What immortal hand or eye
> Could frame thy fearful symmetry?"
>
> William Blake - Songs of Experience -1794 England
