Here is a nice explanation [1] of what your options are. HDFS does not
support replication between clusters [2]. If you are using HBase, things
are better, since HBase ships with its own cross-cluster replication.
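For example, HBase can push edits to a standby cluster continuously via
replication peers. A rough sketch from hbase shell (the peer id, table,
column family, and ZooKeeper quorum below are hypothetical):

```
# Run from "hbase shell" on the active (DC1) cluster
add_peer '1', CLUSTER_KEY => "zk1.dc2,zk2.dc2,zk3.dc2:2181:/hbase"

# Mark the column family for replication to the peer
alter 'events', {NAME => 'cf', REPLICATION_SCOPE => 1}
```

With that in place DC2 receives edits near-real-time instead of in
batch-sized gaps between distcp runs.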

> In case the hadoop cluster in DC1 goes down, automatic failover occurs
> to DC2.

There are setups with DRBD, but if you can afford losing the data
written between distcp runs in case of a disaster, periodic distcp with
haproxy (to redirect client-facing connections) is the simplest
approach.
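As a minimal sketch of the distcp side (NameNode addresses and paths
below are hypothetical), something like this could run from cron to
mirror DC1 into DC2:

```shell
#!/bin/sh
# Periodic DR sync sketch -- cluster names and paths are made up.
SRC="hdfs://dc1-nn:8020/data"
DST="hdfs://dc2-nn:8020/data"

# -update copies only files that changed since the last run,
# -delete removes files on DC2 that no longer exist on DC1,
# -p     preserves file attributes (permissions, ownership, etc.).
CMD="hadoop distcp -update -delete -p $SRC $DST"

# Printed here for illustration; on a real cluster, execute $CMD instead.
echo "$CMD"
```

The data you can lose in a disaster is then bounded by the cron interval
between runs.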

[1] 
https://community.hortonworks.com/questions/29645/hdfs-replication-for-dr.html
[2] https://issues.apache.org/jira/browse/HDFS-5442

Best,
Sanel

akshay naidu <[email protected]> writes:
> Hello Hadoopers,
> I am planning a Disaster Recovery (DR) project, mainly for *hadoop
> clusters*.
> The infrastructure is in a data center in the west, say DC1. I have
> created a backup hadoop-spark cluster in a data center in the east, say
> DC2. With distcp I will keep DC2 synced with DC1. This will work as DR.
>
> But what I want is that in case DC1 goes down completely, automatic
> failover happens and DC2 goes live with little or no downtime.
>
> I have configured *hadoop high availability* and *automatic failover* in
> the hadoop cluster in DC1 and it works fine. But that won't help in case
> the whole DC1 goes down.
>
> Is there a solution where I can keep two hadoop clusters running in
> parallel, completely synced, in two different data centers, so that in
> case the hadoop cluster in DC1 goes down, automatic failover occurs to
> DC2?
>
> Any hint would be of great help; any feedback, positive or negative, is
> welcome.
>
> Thanks .

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
