[jira] [Commented] (HADOOP-7359) Pluggable interface for cluster membership

Travis Crawford (JIRA) Tue, 07 Jun 2011 21:54:48 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13045778#comment-13045778 ]


Travis Crawford commented on HADOOP-7359:
-----------------------------------------

Would anyone object to allowing the HostsReader to trigger refreshNodes? That 
would let Hadoop scan for or be notified of cluster membership changes and 
automagically do the Right Thing.

DETAILS

Taking a step back, this change would be the most useful if your authoritative 
source for machine roles is stored Somewhere Else and you want Hadoop to 
integrate. The posted diff simply lets you pull the lists of included/excluded 
hosts from such a source, but does not activate the new lists - you still need 
to refreshNodes.

Imagine you update the authoritative source with new/removed machines and want 
Hadoop to learn about the change (ZK watch, polling, etc.). It would be very 
handy for cluster membership & state changes to propagate without manual 
intervention as is needed today. Permitting HostsReader to call refreshNodes 
would accomplish this goal.

PROPOSED IMPLEMENTATION

Introduce a "Refreshable" interface that defines a single method, 
refreshNodes, and have both FSNamesystem and JobTracker implement it. 
HostsReader would gain an initialize method that takes a Refreshable, so an 
implementation could choose to call refreshNodes when it detects a change.
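
To make the idea concrete, here is a rough sketch of how the pieces might fit 
together. Only Refreshable, refreshNodes, HostsReader and initialize come from 
the proposal above. The getter names, ExternalHostsReader and 
onMembershipChange are hypothetical and exist only for illustration, and the 
ZK watch or polling loop that would drive the callback is left out.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Proposed one-method interface; FSNamesystem and JobTracker would implement it.
interface Refreshable {
    void refreshNodes() throws IOException;
}

// Assumed shape of the pluggable reader. Only initialize(Refreshable) is taken
// from the proposal; the two getters are illustrative guesses.
interface HostsReader {
    void initialize(Refreshable target);
    Set<String> getHosts();
    Set<String> getExcludedHosts();
}

// Hypothetical reader backed by an external source of truth (machine database,
// ZooKeeper, ...). Whatever detects a change calls onMembershipChange.
class ExternalHostsReader implements HostsReader {
    private Refreshable target;
    private Set<String> includes = Collections.emptySet();
    private Set<String> excludes = Collections.emptySet();

    @Override
    public synchronized void initialize(Refreshable target) {
        this.target = target;
    }

    // Swap in the new lists, then ask the owner to re-read them.
    public void onMembershipChange(Set<String> newIncludes,
                                   Set<String> newExcludes) throws IOException {
        Refreshable toNotify;
        synchronized (this) {
            includes = Collections.unmodifiableSet(new HashSet<>(newIncludes));
            excludes = Collections.unmodifiableSet(new HashSet<>(newExcludes));
            toNotify = target;
        }
        // Call outside the lock so refreshNodes can safely call the getters.
        if (toNotify != null) {
            toNotify.refreshNodes();
        }
    }

    @Override
    public synchronized Set<String> getHosts() {
        return includes;
    }

    @Override
    public synchronized Set<String> getExcludedHosts() {
        return excludes;
    }
}
```

A reader that never calls refreshNodes, such as the file-based one, would just 
ignore the Refreshable it is handed.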

The current file-based cluster membership would continue to work exactly as it 
does today.

This is a somewhat bigger change, but potentially very useful at larger sites. 
If there's general agreement this would be useful, I'll post a diff. If not, I 
still think there's value in the diff already posted, as it means no more 
copy/pasting lists of machines from the machine database :)

> Pluggable interface for cluster membership
> ------------------------------------------
>
>                 Key: HADOOP-7359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7359
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Travis Crawford
>         Attachments: HADOOP-7359.diff
>
>
> Currently Hadoop uses local files to determine cluster membership. With HDFS 
> for example, dfs.hosts and dfs.hosts.exclude are used.
> To enable tighter integrations, cluster membership should be an interface, 
> with the current file-based functionality provided as the default 
> implementation. In the common case there would be no functional change; 
> however, sites could plug in an alternative implementation, such as one 
> that pulls the machine lists from a machine database.
> DETAILS:
> Two machine lists, includes and excludes, are used to define cluster 
> membership and state. HostsFileReader currently handles reading these lists 
> from files, whose names are passed in by FSNamesystem for HDFS and JobTracker 
> for MR.
> The proposed change is adding a HostsReader interface to common, and changing 
> HostsFileReader to an abstract class that functions the same as today.
> Two new classes, DFSHostsFileReader and MRHostsFileReader, extend 
> HostsFileReader and simply pass the appropriate file names in. These new 
> classes are needed because config key names live outside common.
> Two new conf keys, defaulting to the file-based readers, would be added to 
> choose a different hosts reader: dfs.namenode.hosts.reader.class and 
> mapreduce.jobtracker.hosts.reader.class.
> Comments/suggestions? I have most of this written already but would love some 
> feedback on the general idea before posting the diff.
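
For readers who have not opened the attached diff, the config-driven selection 
described in the issue could look roughly like the sketch below. It is not 
taken from HADOOP-7359.diff. java.util.Properties stands in for Hadoop's 
Configuration so the example runs on its own, and HostsReaderFactory, 
DefaultHostsReader and the getter names are assumptions made for illustration. 
Only the HostsReader name and the conf key come from the issue text.

```java
import java.util.Collections;
import java.util.Properties;
import java.util.Set;

// Assumed shape of the proposed interface; the method names are illustrative.
interface HostsReader {
    Set<String> getHosts();
    Set<String> getExcludedHosts();
}

// Stand-in for the default file-based reader. A real one would read the files
// named by dfs.hosts and dfs.hosts.exclude; this one returns empty lists.
class DefaultHostsReader implements HostsReader {
    @Override
    public Set<String> getHosts() {
        return Collections.emptySet();
    }

    @Override
    public Set<String> getExcludedHosts() {
        return Collections.emptySet();
    }
}

// Hypothetical helper that picks the reader implementation from configuration.
final class HostsReaderFactory {
    // Key name as given in the issue description.
    static final String DFS_READER_KEY = "dfs.namenode.hosts.reader.class";

    private HostsReaderFactory() {
    }

    // Instantiate the class named under `key`, or `defaultImpl` if the key is
    // unset. The chosen class needs an accessible no-argument constructor.
    static HostsReader create(Properties conf, String key,
                              Class<? extends HostsReader> defaultImpl)
            throws ReflectiveOperationException {
        String className = conf.getProperty(key);
        Class<? extends HostsReader> cls = (className == null)
                ? defaultImpl
                : Class.forName(className).asSubclass(HostsReader.class);
        return cls.getDeclaredConstructor().newInstance();
    }
}
```

With the key unset, create returns the default reader, which matches the "no 
functional change" case. Naming a class that does not implement HostsReader 
fails with a ClassCastException from asSubclass before anything is instantiated.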

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
