Distcp is a backup tool, not a synchronization tool.
At best, you get a point-in-time snapshot of DC1, for example from a periodic
distcp scheduled every night at 12am. But in case of total failure, you
lose everything written since that point in time.
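
To make the loss window concrete, here is a minimal sketch, not a recommended
setup. The namenode addresses and the /data path are made-up placeholders, and
the helper names are hypothetical. It only builds the nightly distcp command
line and computes how much data would be missing from DC2 if DC1 failed at a
given time.

```python
from datetime import datetime, timedelta


def distcp_command(src, dst):
    """Build the argument list for an incremental distcp run.

    -update copies only files that differ between src and dst;
    -delete removes files from dst that no longer exist in src.
    """
    return ["hadoop", "distcp", "-update", "-delete", src, dst]


def data_at_risk(last_copy_started, failure_time):
    """Return the window of writes that exist only in DC1.

    Anything written after the last successful copy began is not
    guaranteed to be in DC2, so it is lost if DC1 fails completely.
    """
    return failure_time - last_copy_started


if __name__ == "__main__":
    # Placeholder cluster addresses, assumed for illustration only.
    cmd = distcp_command("hdfs://dc1-nn:8020/data", "hdfs://dc2-nn:8020/data")
    print(" ".join(cmd))

    # Nightly run at 12am, DC1 fails at 6:30pm the same day.
    lost = data_at_risk(datetime(2018, 5, 7, 0, 0), datetime(2018, 5, 7, 18, 30))
    print("Writes missing from DC2 cover the last", lost)
```

Running distcp more often shrinks that window but never closes it, which is
why this is a backup and not a synchronized standby.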


On Mon, May 7, 2018 at 12:30 AM, akshay naidu <[email protected]>
wrote:

> Hello Hadoopers,
> I am planning a Disaster Recovery (DR) project, mainly for *hadoop
> clusters*.
> The infrastructure is in a data center in the West, say DC1. I have created a
> backup hadoop-spark cluster in a data center in the East, say DC2. Distcp will
> keep DC2 synced with DC1. This will work as DR.
>
> But what I want is that if DC1 goes down completely, automatic failover
> should happen and DC2 should go live with no downtime or very little
> downtime.
>
> I have configured *hadoop high availability* and *Automatic Failover* in
> the hadoop cluster in DC1 and it works fine. But that won't help if the whole
> of DC1 goes down.
>
> Is there a solution where I can keep two hadoop clusters running in
> parallel, completely synced, in two different data centers, so that if the
> hadoop cluster in DC1 goes down, automatic failover to DC2 occurs?
>
> Any hint or feedback, positive or negative, would be a great help.
>
> Thanks.
>



-- 
A very happy Clouderan
