[jira] [Commented] (HADOOP-7359) Pluggable interface for cluster membership

Steve Loughran (JIRA) Mon, 04 Jul 2011 04:52:50 -0700

    [ 
https://issues.apache.org/jira/browse/HADOOP-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059399#comment-13059399
 ]


Steve Loughran commented on HADOOP-7359:
----------------------------------------

Some more comments after looking at the code:

* It'd be good to split cleanup (imports, better iteration) from the cluster 
changes, and put the cleanup in first.
* I'm not sure about logging the excludes data at info level; it seems 
over-verbose. If it does go in, the message should link to a wiki page on 
ExcludesFile that says "don't panic, this is optional".
* Following on with the new API model, I think the clustering should be a 
class, not an interface
* The code assumes that get[Excluded]Hosts() never fails, and probably also 
implicitly assumes that it is fast. It'd make sense for the calls to be able 
to throw IOExceptions, as they could be triggering live directory lookups. If 
bounded execution time is a requirement, the javadocs should state it, e.g. 
"must return in under 100 milliseconds".
* I wouldn't mark the various AdminOperationsProtocols as stable, as they are 
clearly moving around. 
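
To make the class-vs-interface and IOException points concrete, here is a 
rough sketch. The names ClusterMembership and StaticClusterMembership are 
hypothetical and not taken from the attached diff, and treating an empty 
include list as "allow all" is an assumption of the sketch.

```java
import java.io.IOException;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical base class (not from the patch). Using an abstract class rather
// than an interface lets new methods with default bodies be added later
// without breaking existing subclasses.
abstract class ClusterMembership {

    /**
     * Returns the included hosts. Implementations may do a live lookup, so the
     * call is allowed to fail with an IOException. If a time bound is required,
     * this javadoc is where it would be stated.
     */
    abstract Set<String> getHosts() throws IOException;

    /** Returns the excluded hosts; same failure contract as getHosts(). */
    abstract Set<String> getExcludedHosts() throws IOException;

    /** Reloads the lists. A no-op by default. */
    void refresh() throws IOException {
    }

    /**
     * Assumption of this sketch: an empty include list means every host is
     * allowed unless it is explicitly excluded.
     */
    boolean isAllowed(String host) throws IOException {
        Set<String> included = getHosts();
        boolean inIncludes = included.isEmpty() || included.contains(host);
        return inIncludes && !getExcludedHosts().contains(host);
    }
}

// Trivial in-memory implementation, for illustration only.
class StaticClusterMembership extends ClusterMembership {
    private final Set<String> hosts;
    private final Set<String> excluded;

    StaticClusterMembership(Set<String> hosts, Set<String> excluded) {
        this.hosts = Collections.unmodifiableSet(new HashSet<>(hosts));
        this.excluded = Collections.unmodifiableSet(new HashSet<>(excluded));
    }

    @Override
    Set<String> getHosts() {
        return hosts;
    }

    @Override
    Set<String> getExcludedHosts() {
        return excluded;
    }
}
```

Callers then have to handle the IOException explicitly instead of assuming 
the lookup always succeeds.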

Related to this, I could imagine another JIRA issue of a kill -something that 
would trigger a refresh on any/all registered services in the VM. That way even 
if you don't have a refresh rate, you can manually trigger a reload. 
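
A minimal sketch of the in-VM side of that idea follows. Refreshable and 
RefreshRegistry are hypothetical names, not existing Hadoop classes. The 
signal wiring itself is left out, since the JDK has no supported public API 
for handling POSIX signals; whatever receives the trigger would just call 
refreshAll().

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback that a service registers so it can be reloaded on demand.
interface Refreshable {
    void refresh() throws Exception;
}

// Hypothetical per-VM registry of refreshable services.
final class RefreshRegistry {
    // CopyOnWriteArrayList allows registration while a refresh pass is iterating.
    private static final List<Refreshable> SERVICES = new CopyOnWriteArrayList<>();

    private RefreshRegistry() {
    }

    static void register(Refreshable service) {
        SERVICES.add(service);
    }

    /**
     * Refreshes every registered service. One failing service does not stop
     * the others. Returns the number of services whose refresh threw.
     */
    static int refreshAll() {
        int failures = 0;
        for (Refreshable service : SERVICES) {
            try {
                service.refresh();
            } catch (Exception e) {
                failures++;
            }
        }
        return failures;
    }
}
```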

> Pluggable interface for cluster membership
> ------------------------------------------
>
>                 Key: HADOOP-7359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7359
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Travis Crawford
>         Attachments: HADOOP-7359.diff
>
>
> Currently Hadoop uses local files to determine cluster membership. With HDFS 
> for example, dfs.hosts and dfs.hosts.exclude are used.
> To enable tighter integrations, cluster membership should be an interface, 
> with the current file-based functionality provided as the default 
> implementation. The common case would see no functional change; however, 
> sites could plug in an alternative implementation, such as pulling the 
> machine lists from a machine database.
> DETAILS:
> Two machine lists, includes and excludes, are used to define cluster 
> membership and state. HostsFileReader currently handles reading these lists 
> from files, whose names are passed in by FSNamesystem for HDFS and JobTracker 
> for MR.
> The proposed change is adding a HostsReader interface to common, and changing 
> HostsFileReader to an abstract class that functions the same as today.
> Two new classes, DFSHostsFileReader and MRHostsFileReader, extend 
> HostsFileReader and simply pass the appropriate file names in. These new 
> classes are needed because config key names live outside common.
> Two new conf keys, defaulting to the file-based readers, would be added to 
> choose a different hosts reader: dfs.namenode.hosts.reader.class and 
> mapreduce.jobtracker.hosts.reader.class
> Comments/suggestions? I have most of this written already but would love some 
> feedback on the general idea before posting the diff.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
