Hi all,
I'm Alberto Chiusole, an Italian computer science student and open-source fan. I'm currently doing a small research project to present Hadoop to my fellow students, and this is my first post to this mailing list.

I think I spotted a small mistake in the HDFS documentation on achieving HA with the Quorum Journal Manager [1]. In the section "Hardware resources", the paragraph "JournalNode machines" states:
"""
The JournalNode daemon is relatively lightweight, so these daemons may reasonably be collocated on machines with other Hadoop daemons, for example NameNodes, the JobTracker, (...)
"""

Is "NameNodes" a typo for "DataNodes"? Aren't the JournalNodes meant to survive a failure of the NameNodes? Why should I place a JournalNode on the same machine that holds the log I need to synchronize?


I also have a quick question on the same topic: why do you suggest running an odd number of JournalNodes in order to increase fault tolerance?


Regards,
Alberto Chiusole


[1]: https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#Hardware_resources

