[ 
https://issues.apache.org/jira/browse/HADOOP-12973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15229636#comment-15229636
 ] 

Colin Patrick McCabe commented on HADOOP-12973:
-----------------------------------------------

Thanks, [~eclark].  Looks cool.

{code}
-  class DURefreshThread implements Runnable {
-    
+  static class DURefreshThread implements Runnable {
+
+    private final DU du;
+
+    DURefreshThread(DU du) {
+      this.du = du;
+    }
+
{code}
Hmm.  There's no benefit to changing this class from non-static to static, if 
it needs to keep a reference to the parent DU object anyway, right?

Style nit: We pretty much always put {{\@Override}} annotations on the 
preceding line from functions, not on the same line.

I can see the intention behind splitting DU.java into DU.java and 
WindowsDU.java.  However, I'm not sure that this is a good idea, for two 
reasons:
* Any MapReduce code that is using DU directly will now break on Windows, 
unless it is modified to explicitly use WindowsDU when appropriate
* There is a non-obvious coupling between DU and WindowsDU, since one is a 
subclass of the other.  It is easy to imagine accidentally breaking WindowsDU 
with some change to DU.  It's not intuitively obvious why the Windows DU should 
be a subclass of the Linux DU.  This is a good example where we should "prefer 
composition to inheritance" as the gang of four wrote.  Or just leave the 
simple "if" statement-- it's a very small amount of code, and fairly clear what 
is going on.

{{GetSpaceUsedFactory}}: can we make this a builder instead?  That way, it 
would be much easier to add new configuration parameters later on.  One pattern 
we've used in the past is to have something like this:

{code}
GetSpaceUsedFactory(File file);
GetSpaceUsedFactory GetSpaceUsedFactory#setConf(Configuration conf);
GetSpaceUsedFactory GetSpaceUsedFactory#setInitialUsed(long initialUsed);
{code}
then for {{GetSpaceUsed}} instances, they only need to have one visible 
constructor:
{code}
GetSpaceUsed(GetSpaceUsedFactory factory);
{code}
Of course we want to keep any existing public constructors in {{DU.java}} 
visible, but in the future, it would be easier just to do everything via the 
factory rather than having a zillion constructors for every possible 
combination of arguments.  And new {{GetSpaceUsed}} implementations don't need 
to add the legacy constructors.

Do we need both {{decDfsUsed}} and {{incDfsUsed}}?  Can't we just add a 
negative?  Or is that too unintuitive?

Is {{GetSpaceUsed#start}} really necessary?  It seems like GetSpaceUsed should 
manage any worker threads internally (and indeed, the DF implementation 
shouldn't need worker threads.)  This seems like an implementation detail, not 
something that should be in the interface.  I wonder if making GetSpaceUsed 
extend Closeable would be more intuitive than having a shutdown method?

> make DU pluggable
> -----------------
>
>                 Key: HADOOP-12973
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12973
>             Project: Hadoop Common
>          Issue Type: Sub-task
>            Reporter: Elliott Clark
>            Assignee: Elliott Clark
>         Attachments: HADOOP-12973v0.patch, HADOOP-12973v1.patch, 
> HADOOP-12973v2.patch, HADOOP-12973v3.patch, HADOOP-12973v5.patch, 
> HADOOP-12973v6.patch, HADOOP-12973v7.patch, HADOOP-12973v8.patch, 
> HADOOP-12973v9.patch
>
>
> If people are concerned about replacing the call to DU. Then an easy first 
> step is to make it pluggable. Then it's possible to replace it with something 
> while leaving the default alone.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to