@Shady, please see: https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html
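
For reference, the settings that page describes for reaching DataNodes by hostname can be sketched as the hdfs-site.xml fragment below. This is a minimal sketch, not a tested fix for this thread. It assumes that the source DataNode hostnames resolve to their public IPs from the destination cluster (for example through DNS or /etc/hosts entries there), and that the client property is visible to the distcp map tasks, since they are the HDFS clients that read from the source DataNodes.

```xml
<!-- hdfs-site.xml fragment (sketch; which cluster's config this belongs in is an assumption, see above) -->
<configuration>
  <!-- Client side: connect to DataNodes by hostname instead of the IP address returned by the NameNode -->
  <property>
    <name>dfs.client.use.datanode.hostname</name>
    <value>true</value>
  </property>
  <!-- DataNode side: use hostnames for DataNode-to-DataNode transfers -->
  <property>
    <name>dfs.datanode.use.datanode.hostname</name>
    <value>true</value>
  </property>
</configuration>
```

Since distcp accepts the generic `-D` option, the client property could probably also be passed per job as `-Ddfs.client.use.datanode.hostname=true` rather than by editing the destination cluster's configuration.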
--
Iain Wright

On Wed, Aug 24, 2016 at 2:17 AM, Shady Xu <[email protected]> wrote:

> Does anyone have any idea?
>
> 2016-08-16 10:27 GMT+08:00 Shady Xu <[email protected]>:
>
>> Thanks Wei-Chiu and Sunil, I had read the docs you mentioned before
>> starting. The specific problem now is that the DataNodes of the source
>> cluster report their local IP instead of the public one, and the local IP
>> cannot be reached from the NodeManagers of the destination cluster. It
>> seems the solution is to set the `dfs.datanode.dns.interface` property,
>> but unfortunately that doesn't work.
>>
>> 2016-08-15 22:06 GMT+08:00 Sunil Govind <[email protected]>:
>>
>>> Hi
>>>
>>> I think you can also refer to the link below.
>>> http://aajisaka.github.io/hadoop-project/hadoop-distcp/DistCp.html
>>>
>>> Thanks
>>> Sunil
>>>
>>> On Mon, Aug 15, 2016 at 7:26 PM Wei-Chiu Chuang <[email protected]>
>>> wrote:
>>>
>>>> Hello,
>>>> if I understand your question correctly, you are actually building a
>>>> multihomed Hadoop cluster, correct?
>>>> A multihomed Hadoop cluster can be tricky to set up, to the extent that
>>>> Cloudera does not recommend it. I've not set up a multihomed Hadoop
>>>> cluster before, but I think you have to make sure reverse resolution
>>>> works for the IP addresses.
>>>>
>>>> https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html
>>>>
>>>>
>>>> On Mon, Aug 15, 2016 at 1:06 AM, Shady Xu <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Recently I tried to use distcp to copy data across two clusters which
>>>>> are not in the same local network. Fortunately, the nodes of the source
>>>>> cluster each have an extra interface and IP which can be accessed from
>>>>> the destination cluster. But during the distcp run, the map tasks
>>>>> always used the local IP of the source cluster nodes, which they cannot
>>>>> reach.
>>>>>
>>>>> I tried changing the property 'dfs.datanode.dns.interface' to the one
>>>>> I want, and I tried changing the property
>>>>> 'dfs.datanode.use.datanode.hostname' to true too. Nothing works.
>>>>>
>>>>> Does Hadoop support this now, or am I missing something?
